While Perl may not be as popular as it once was, many of the dynamic languages that power the web today, such as JavaScript, PHP, Python, and Ruby, were influenced by Perl. Design decisions made in Perl 30 years ago are still visible in those languages today.
First released in the late 1980s by its creator, Larry Wall, Perl was in the right place at the right time and quickly became a major player in the development of the internet. As one of the first widely adopted dynamically typed scripting languages, it let developers work quickly: they no longer needed to pre-allocate memory or keep track of which data type a variable stored.
Perl remains one of the best programming languages for text processing with regular expressions, which makes it well suited to processing file input and output. Perl is also commonly used in system administration, web services, database work with MySQL and Oracle, open-source projects, and shell scripting. Many developers choose Perl over ASP.NET because it is open source, powerful, and backed by a large collection of libraries. Perl fans also use it in place of Bash and other Unix shells, since Perl handles both short programs and large distributed applications, whereas writing large applications in a shell language is rather difficult.
If you are looking to hire Perl developers, it’s probably because you have a legacy Perl application or you’re looking to take advantage of the power of its regular expressions engine. This hiring guide covers the basic, key aspects that every Perl programmer should know inside and out.
References
As mentioned before, Perl brought on a major shift in programming and heralded an evolution from statically typed to dynamically typed languages. One of the other shifts it ushered in was the way references are used. The concept of pointers, common in predecessor languages like C and C++, was confusing to many developers, so Perl did away with pointers and instead introduced references, which simplified memory management for the developer.
References are used frequently and extensively in Perl code. They’re very important for a Perl web developer to understand, as the syntax of element access changes depending on whether you have a reference or direct access.
Q: In Perl, how do you initialize the following?
- An array
- An array reference
- A hash
- A hash reference
Furthermore, how would you change an array to an array reference, a hash to a hash reference, and vice versa? How do you access elements from within these variables?
A: The use of hash and array references is a pretty basic concept for any experienced Perl developer, but it may syntactically trip up some newer Perl developers or developers who never really grasped the underlying basics.
Initializing an Array:
my @arr = (0, 1, 2);
An array is initialized with an `@` symbol prefixed to the variable name, which denotes the variable type as an array; its elements are placed in parentheses.
Initializing an Array Reference:
my $arr_ref = [0, 1, 2];
With an array reference, you use the `$` symbol, which denotes ‘scalar’, and the elements are placed in square brackets. The reference isn’t specified as an array, just as a scalar, so you have to be careful to handle the variable type appropriately.
With hashes, the syntax is similar.
Initializing a Hash:
my %hash = (0 => 'First', 1 => 'Second', 2 => 'Third');
Just as with an array, the elements of a hash are defined with parentheses, but since the variable is a hash, it’s prefixed with a `%`.
Initializing a Hash Reference:
my $hash_ref = {0 => 'First', 1 => 'Second', 2 => 'Third'};
Like an array reference, a hash reference variable is prefixed with a `$`, but the elements are placed in curly braces.
Referencing a Hash or an Array
Referencing an array or hash is pretty straightforward. In Perl, a backslash in front of a variable will return the reference to it. You should expect something like the following:
my $arr_ref = \@arr;
my $hash_ref = \%hash;
Dereferencing
Dereferencing a referenced variable is as easy as reassigning it with the appropriate variable identifier. For example, here’s how you would dereference arrays and hashes:
my @arr = @$arr_ref;
my %hash = %$hash_ref;
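As a minimal sketch of how referencing and dereferencing interact (the sample data and the `@copy` variable are assumptions made for illustration), the following shows that a reference points at the original data, while dereferencing into a new variable produces a shallow copy:

```perl
use strict;
use warnings;

my @arr  = (0, 1, 2);
my %hash = (0 => 'First', 1 => 'Second');

# Take references with a backslash.
my $arr_ref  = \@arr;
my $hash_ref = \%hash;

# A reference points at the original data, so changes made
# through it are visible in the original variables.
push @$arr_ref, 3;
$hash_ref->{2} = 'Third';

# Dereferencing into a new variable makes a (shallow) copy,
# so changing the copy leaves @arr alone.
my @copy = @$arr_ref;
push @copy, 4;

print scalar(@arr),  "\n";   # number of elements in @arr
print scalar(@copy), "\n";   # number of elements in @copy
print ref($arr_ref), " ", ref($hash_ref), "\n";   # kinds of reference
```

The built-in `ref` function reports what kind of data a reference points to, which can help when a scalar might hold either kind.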
Accessing Elements
The differences between accessing elements of these variable types and their reference versions are another area where amateur developers may get tripped up.
# to access an element of an array
my $element = $arr[0];
Notice that for an array you are not using the `@` prefix but rather the `$` to denote a scalar, which is the type returned when accessing any element of an array. Accessing the elements of an array reference, a hash, and a hash reference follows a similar syntax:
# to access an element of an array reference
# (the arrow form $arr_ref->[0] is equivalent)
my $element = ${$arr_ref}[0];
# to access an element of a hash
my $element = $hash{0};
# to access an element of a hash reference
my $element = $hash_ref->{0};
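The access forms above can be compared side by side. This is only a sketch with made-up sample data and variable names; it shows that the block form (`${$ref}[0]`) and the arrow form (`$ref->[0]`) reach the same element:

```perl
use strict;
use warnings;

my @arr      = (10, 20, 30);
my $arr_ref  = \@arr;
my %hash     = (a => 'First', b => 'Second');
my $hash_ref = \%hash;

# Direct access: the $ sigil, because a single element is a scalar.
my $from_array = $arr[0];
my $from_hash  = $hash{a};

# Access through a reference, block form.
my $from_aref_block = ${$arr_ref}[0];
my $from_href_block = ${$hash_ref}{a};

# Access through a reference, arrow form (usually easier to read).
my $from_aref_arrow = $arr_ref->[0];
my $from_href_arrow = $hash_ref->{a};

print "$from_array $from_aref_block $from_aref_arrow\n";
print "$from_hash $from_href_block $from_href_arrow\n";
```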
Special Variables
One of the things unique to Perl is the number of special variables it provides. While this can make Perl code very concise, it also makes it rather cryptic to new developers. While only those with expert Perl knowledge will know most (or all) of the special variables, there are some key ones that every Perl developer, regardless of skill level, should be familiar with.
Q: Using `$_`: Verbally explain the functionality of the following example code snippet:
my @new = map { $_ + 1 } @values;
A: The `map` function loops through each element in the `@values` array, setting `$_` to the current element on each iteration, and returns a list of the results. This is equivalent to the following more common and verbose code:
my @new = ();
foreach (@values) {
    push(@new, $_ + 1);
}
or
my @new = ();
foreach my $value (@values) {
    push(@new, $value + 1);
}
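A runnable sketch of the equivalence, using an assumed sample `@values`, might look like the following. It also illustrates a point worth probing in an interview: inside a loop, `$_` is an alias for the current element rather than a copy, so assigning to it modifies the array itself.

```perl
use strict;
use warnings;

my @values = (1, 2, 3);

# map builds a new list and leaves @values untouched.
my @via_map = map { $_ + 1 } @values;

# The verbose equivalent with a named loop variable.
my @via_foreach;
foreach my $value (@values) {
    push @via_foreach, $value + 1;
}

# Inside a loop, $_ is an alias for the current element,
# so assigning to it changes the array being looped over.
my @doubled = @values;
$_ *= 2 for @doubled;

print "@via_map\n";
print "@via_foreach\n";
print "@doubled\n";
```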
Q: Using `@_`: Within the following routine, explain the value of `@_`:
sub my_subroutine {}
A: `@_` will be set to any parameters that are passed into the subroutine.
So, for example, if the subroutine is called as follows:
my_subroutine(1, 'string', 2);
…then `@_` will be an array containing the elements `(1, 'string', 2)`.
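In practice, most subroutines copy their arguments out of `@_` into named variables on the first line. A minimal sketch, using a hypothetical subroutine named `describe_args` that is not part of the original question:

```perl
use strict;
use warnings;

# Hypothetical subroutine: copies its arguments out of @_ into named variables.
sub describe_args {
    my ($first, @rest) = @_;   # first argument, then everything else
    my $count = @_;            # an array in scalar context yields its length
    return "first=$first, rest=@rest, count=$count";
}

my $summary = describe_args(1, 'string', 2);
print "$summary\n";
```

A candidate who reaches for `my (...) = @_;` without prompting is showing familiarity with the usual idiom.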
Regular Expressions
Perl provides a powerful and easy way to work with regular expressions. Even if developers are not doing text processing, they will no doubt come across situations in Perl where regular expressions are the best fit for the job.
Q: Explain what the following code does, in detail:
$str =~ s/-//g;
A: It removes all hyphens (`-` characters) from the string.
Here we have the variable `$str`, which contains a string.
The `=~` is Perl’s binding operator, which applies the regular expression operation on its right to the string on its left.
The `s` at the start of the right-hand side indicates that we are going to perform a substitution.
The slashes after the `s` demarcate, respectively, the regular expression pattern to match and its replacement. After the final slash come any optional modifiers. Let’s take it apart piece by piece:
In this example, the pattern `-` matches a hyphen.
Nothing appears between the second and third slashes, so the replacement is empty and each matched hyphen is replaced by nothing (i.e., deleted).
The `g` is a modifier telling the regular expression engine to apply the substitution globally across the string; without the `g` modifier, only the first hyphen would be removed.
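The effect of the `g` modifier can be checked with a short sketch. The sample strings and variable names are assumptions; the code relies on the fact that a substitution returns the number of replacements it made:

```perl
use strict;
use warnings;

my $with_g    = '2016-06-01';
my $without_g = '2016-06-01';

# A substitution returns the number of replacements it made.
my $n_global = ($with_g    =~ s/-//g);   # every hyphen
my $n_single = ($without_g =~ s/-//);    # first hyphen only

print "$with_g ($n_global)\n";
print "$without_g ($n_single)\n";
```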
Q: Write a script that takes a list of file names as command-line arguments. Its processing will take these log files of errors and count how many errors occurred on specific days.
For any line that starts with a timestamp in the format of YYYY-MM-DD, increment the counter for that day and print a summary, in ascending date order, like the following:
Example output:
2016-06-01: 3
2016-06-02: 4
2016-06-04: 1
Days that don’t appear in the log file do not need to appear in the output.
A: One concise answer would look like the following:
my %counts = ();
while (<>) {
    if (/^(\d{4}-\d{2}-\d{2})/) {
        $counts{$1}++;
    }
}
for (sort { $a cmp $b } keys %counts) {
    print "$_: $counts{$_}\n";
}
The diamond operator (`<>`) in the `while` loop is another of Perl’s special operators. It loops through the `@ARGV` array, which holds the arguments passed into the script (in this case the file names), opens each file in turn, and reads through its lines.
The `if` statement contains a regular expression that is implicitly matched against the `$_` special variable, which the `while` loop sets to each line as it loops through. In Perl, `$_` can often be left implicit, and this is one such case.
The regular expression itself is doing the following:
The caret `^` is the character used to represent the start of a line, meaning that the match must begin at the start of a line.
When parentheses are used in a regular expression, the part of the string matched by the part of the regular expression inside the parentheses will be “captured” (i.e., stored in a temporary variable) if a match occurs. There can be multiple sets of parentheses in a regular expression. The part of the string matching the first set of parentheses will be stored in the temporary variable `$1`, the second will be stored in `$2`, and so on. In the above example, there is only one set of parentheses, so `$1` will be set to that match.
The match we’re checking for, `\d{4}-\d{2}-\d{2}`, consists of 4 digits followed by a dash, followed by 2 digits, followed by a dash, finally followed by 2 digits.
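To illustrate multiple capture groups, here is a small sketch that splits the same date pattern into three sets of parentheses. The log line is a made-up example, and the variable names are assumptions:

```perl
use strict;
use warnings;

my $line = '2016-06-01 12:00:03 ERROR disk full';   # hypothetical log line

# Three sets of parentheses fill $1, $2, and $3 in order.
my ($year, $month, $day);
if ($line =~ /^(\d{4})-(\d{2})-(\d{2})/) {
    ($year, $month, $day) = ($1, $2, $3);
}

# In list context, a match returns its captured groups directly.
my ($date) = $line =~ /^(\d{4}-\d{2}-\d{2})/;

print "$year / $month / $day\n";
print "$date\n";
```

Copying `$1` and friends into named variables right after the match is a common defensive habit, since the next successful match overwrites them.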
If we find that match, then we increment the value in the `%counts` hash whose key is the date that was matched. We don’t need to initialize the value the first time a key is found because Perl creates the entry automatically and treats its undefined value as 0, so we can simply use the `++` operator to increment the counter for the date.
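A brief sketch of that behavior, with assumed sample keys: the hash starts out empty, and the first `++` on a missing key creates the entry and sets it to 1 without raising a warning.

```perl
use strict;
use warnings;

my %counts;

# The key does not exist yet.
my $existed_before = exists $counts{'2016-06-01'} ? 1 : 0;

$counts{'2016-06-01'}++;   # entry is created; undef counts as 0, so this becomes 1
$counts{'2016-06-01'}++;   # now 2
$counts{'2016-06-02'}++;   # a second key, created the same way

print "$_: $counts{$_}\n" for sort keys %counts;
```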
In the `for` statement, we are once again making use of the `$_` special variable by not explicitly naming the loop variable. Starting on the right side:
- `keys %counts` returns a list of the keys from the hash. These are the dates that we previously encountered.
- The `sort` function takes that list of keys and uses the string comparison operator `cmp` to sort it in ascending order.
- The resulting list is then used in the `for` loop, where each line of output is printed as the date followed by a colon. Then we access the value from the `%counts` hash for the date (the number of times an error was logged), and append that, followed by a new line.
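For trying the logic without log files on disk, the same counting and sorting steps can be wrapped in a subroutine that takes lines directly. This is a sketch only: `count_by_date` and the sample lines are hypothetical, and `sort` is called without a comparison block because its default ordering is the same ascending string comparison.

```perl
use strict;
use warnings;

# Hypothetical helper: counts lines that start with a YYYY-MM-DD date.
sub count_by_date {
    my @lines = @_;
    my %counts;
    for my $line (@lines) {
        $counts{$1}++ if $line =~ /^(\d{4}-\d{2}-\d{2})/;
    }
    return %counts;
}

# Made-up sample input, deliberately out of date order.
my @sample = (
    "2016-06-02 10:00:00 ERROR b\n",
    "2016-06-01 09:00:00 ERROR a\n",
    "not a log line\n",
    "2016-06-02 11:30:00 ERROR c\n",
);

my %counts = count_by_date(@sample);

# Default sort is ascending string order, which suits YYYY-MM-DD dates.
my $report = join '', map { "$_: $counts{$_}\n" } sort keys %counts;
print $report;
```

String sorting works here only because the date format is zero-padded and ordered from largest unit to smallest; a candidate who points that out understands why `cmp` is the right comparison.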
Common Functions
As with most modern programming languages, hashes and arrays are a big part of programming in Perl. Software developers may find data stored in one structure or the other, but may need to retrieve it in a specific format. Perl provides very concise and powerful functions to do this retrieval; functions with which experienced Perl developers must be familiar.
These questions are designed to test just how comfortable the candidate is with common Perl functions. Qualified candidates should have an understanding of common functions like `map` and `grep`. If the candidate solves the question using one function, ask them to solve it using the other function.
A junior-level programmer might write code that creates a separate array and then uses a `for` loop to add the value to a hash.
Q: Given an array, how would you get an array of just the unique elements?
A: The standard technique is as follows:
my %uniq_hash = map { $_ => 1 } @input;
my @uniq = keys %uniq_hash;
Here you are taking the `@input` array, and using `map` to create a hash with its keys set to the values of `@input`. Since a hash only allows for unique keys, there will be no duplicates. To get an array back, you just use the `keys` function to get the keys of the hash in array form. Note that `keys` returns the elements in no particular order, so the original ordering of `@input` is not preserved.
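If the original order matters, a candidate might instead offer the following well-known alternative, which is not the technique shown above: it uses `grep` with a hash of already-seen values. The sample input and the `%seen` name are assumptions.

```perl
use strict;
use warnings;

my @input = qw(b a b c a);

# Keep an element only the first time it is seen.
# $seen{$_}++ is false (undef) on first sight and true afterwards.
my %seen;
my @uniq = grep { !$seen{$_}++ } @input;

print "@uniq\n";
```

This also makes a natural follow-up when asking the candidate to solve the problem with the other function.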
Q: Write some code that prints ‘yes’ if the value `3` is in an array.
A: You could use the same procedure as above and then simply do:
my %uniq_hash = map { $_ => 1 } @input;
if ($uniq_hash{3}) { print 'yes'; }
This just checks the hash you created to see if there is a key set to 3 and, if there is, prints “yes”.
You could achieve the same effect with `grep`, which loops through the `@input` array and tests each value against your expression, as follows:
if (grep $_ eq 3, @input) { print 'yes'; }
While either approach will work, a developer would be better off using the first technique if multiple values need to be checked in order to avoid looping through the array multiple times. For a one-off check, though, this would not be an issue, in which case both ways would normally be acceptable.
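The trade-off can be sketched as follows, using assumed sample data and variable names. The lookup hash is built once and then answers any number of membership checks, whereas each `grep` call walks the whole array again:

```perl
use strict;
use warnings;

my @input = (1, 3, 5, 7);

# Build the lookup hash once...
my %in_input = map { $_ => 1 } @input;

# ...then check as many values as needed without rescanning @input.
my @wanted  = (3, 4, 7);
my @present = grep { exists $in_input{$_} } @wanted;

# One-off check: grep in scalar context returns the number of matches.
my $found_once = grep { $_ == 3 } @input;

print "@present\n";
print "yes\n" if $found_once;
```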
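Q: Given a string such as key1=value1&key2=value2&key3=value3, how would you convert it into a hash?
A: While the answer is a single line, this question tests the candidate’s ability to understand the `split` function and how arrays are converted into hashes in Perl: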
my %hash = split /[=&]/, $str;
The function `split` takes the input string and splits it into an array wherever the match occurs (in this case either an equal sign or ampersand). In this case, the resulting array would be `[key1, value1, key2, value2, key3, value3]`.
Perl converts arrays to hashes by taking the first element as the key and second element as the value, then the third as the key and fourth as the value and so on.
Storing the result of the split function into a hash type, you will get the following:
{
    key1 => value1,
    key2 => value2,
    key3 => value3
}
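A runnable sketch of the two steps, kept separate so the intermediate list is visible. The value of `$str` is an assumption inferred from the keys and values in the answer, and `@pairs` is a name introduced for illustration:

```perl
use strict;
use warnings;

# Assumed input, inferred from the key/value names in the answer.
my $str = 'key1=value1&key2=value2&key3=value3';

# Step 1: split on either '=' or '&' to get a flat list.
my @pairs = split /[=&]/, $str;

# Step 2: assigning a flat list to a hash pairs elements up as key, value.
my %hash = @pairs;

print "$_ => $hash{$_}\n" for sort keys %hash;
```

A good follow-up is to ask what happens when a value itself contains an `=` or `&`; this simple split would then misalign the pairs.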
There’s More Than One Way to Do It
Perl prides itself on providing multiple ways to accomplish the same task, whether in software development, web development, or other programming work, and on platforms as varied as servers and Android devices, on which Perl 5 can be installed. On the web, Perl scripts are often used to generate websites, or parts of them, that sit alongside technologies such as WordPress, HTML, CSS, and Node.js. As such, keep an open mind about a candidate’s answers. Try to understand the reasoning behind their approach.
Some senior candidates may fully grasp the concise ways of programming in Perl but choose to write code in a more verbose way for easier readability. These developers might, for example, actively avoid using the special `$_` variable because it takes time to figure out what it actually refers to, and opt for a named variable instead. The alternative might be a developer with a strong command of the language who writes code that is unreadable to junior-level developers.
By taking the time to understand the thinking behind the answers, you’ll gain a deep insight into the candidate’s thought process and find those who understand the long-term implications of the code that they write.