21 July 2017

Yeah, I know, sorry

I've been working! It has consumed all my runtime and then some more!

The good news is, I can at least use Perl 5 at work. And so I wrote a few little tidbits of information for my fellow developers, so they aren't surprised about bits of the language. It's almost blog-post worthy, so... copy-paste time!




"What is that? Why is that there? What does this do?" - This page is for YOU!
Perl 5 is an old language, and came before a lot of the syntax conventions that other languages have since acquired. It is also a very expressive language, meaning that there are multiple ways to say the same thing, and some might be more obvious than others. This page attempts to collate some of the interesting features of the language that are super useful and make life fun, plus some of the quirks that might trip you up now and then.

Literal hashes are just lists of pairs

As you know, you can construct a hash variable in Perl 5 using parens and the => operator, like so:-
my %days = ( monday => 1, tuesday => 2, wednesday => 3, WEEKEND => 4);
And you're probably also familiar with the 'quote (splitting on whitespace)' operator, qw//. But did you know you can do this?
my %days = (qw/ monday 1 tuesday 2 wednesday 3 WEEKEND 4 /);
It constructs the same hash! You can check with re.pl. But why does this work? Shouldn't we be using => to pair the key with the value?
As it turns out, => is often called the fat comma operator. It works very much like a comma, delimiting lists of things, except that it quotes the bareword on its left-hand side. You could just as easily set up the hash with a plain old list,
my %days = ('monday', 1, 'tuesday', 2, 'wednesday', 3, 'WEEKEND', 4);
and you can also use => in places you're using an actual list of values - although it's nicer if you're using it to express some sort of relationship between the things on the left and right side of it.
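If you'd rather check this from a script than from re.pl, here is a minimal sketch. The flatten helper and the variable names are made up for illustration; the three hash literals are the ones from above.

```perl
use strict;
use warnings;

# Three spellings of the same key/value list.
my %with_arrows = ( monday => 1, tuesday => 2, wednesday => 3, WEEKEND => 4 );
my %with_qw     = ( qw/ monday 1 tuesday 2 wednesday 3 WEEKEND 4 / );
my %with_commas = ( 'monday', 1, 'tuesday', 2, 'wednesday', 3, 'WEEKEND', 4 );

# Hypothetical helper: turn a hash into a sorted string so hashes can be compared.
sub flatten {
    my (%h) = @_;
    return join ',', map { $_ . '=' . $h{$_} } sort keys %h;
}

print flatten(%with_arrows), "\n";
print "all three match\n"
    if flatten(%with_arrows) eq flatten(%with_qw)
    and flatten(%with_qw) eq flatten(%with_commas);
```

Passing a hash to flatten works for the same reason the literals do: the hash is flattened into a plain list of keys and values on the way in.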
One excellent use of this feature is passing named arguments to subroutines:-
sub DoTheThing {
  my ($thing, %do_what) = @_;
  ...
}

DoTheThing($thing, embiggen => 5, embolden => 1, cromulate => "anticlockwise");
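The %do_what hash slurps all the remaining arguments after $thing, and expects an even number of them to denote the pairs of argument name and value. Perl doesn't strictly enforce this: with warnings enabled, an odd number gets you an "Odd number of elements in hash assignment" warning rather than a hard error. Accessing a hash in list context gives you all those key-value pairs back, so you could conceivably use it to keep track of a bunch of default named arguments that you're frequently passing. If the same key shows up twice in the list, the later value wins, so a default can be overridden just by naming it again afterwards:-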
my %default_args = (colour => "Blue", quest => "To find the Holy Grail");
DoTheThing($thing, %default_args, colour => "No wait, reAAAAARGGHH");
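Here's a small runnable sketch of that defaults trick. This DoTheThing is a hypothetical stand-in that just reports the options it ended up with, and the "bridge" and "Yellow" values are made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for DoTheThing: returns the options it received.
sub DoTheThing {
    my ($thing, %do_what) = @_;
    return join '; ', map { $_ . '=' . $do_what{$_} } sort keys %do_what;
}

my %default_args = (colour => "Blue", quest => "To find the Holy Grail");

# colour appears twice in the flattened list; the later pair wins.
my $result = DoTheThing("bridge", %default_args, colour => "Yellow");
print "$result\n";
```

The order matters: put the defaults first and the overrides after them, otherwise the defaults clobber whatever the caller asked for.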


Subs don't declare their arguments in parens

Pretty much every language these days has you declare a function by including a list of arguments in parentheses after the name - most will even prevent you from declaring a function unless you include () even when you don't have any arguments to declare. Perl 5 does not do this. The usual method (lol) of declaring a subroutine looks something like this:-
sub ProcessStatementOfInformation {
  my ($propertyid, $soldid, $dbw) = @_;
This subroutine takes three arguments, and it takes them by pulling them out of the @_ array. You can still call the function with only two arguments, or none, and the missing values will be set to undefined. It's generally up to the subroutine to die or croak if some of the parameters it gets aren't what it expects.
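As a rough sketch of what that looks like in practice (DescribeArgs and its check are invented for illustration, not taken from any real code):

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical sub: unpacks three arguments and insists on the first one.
sub DescribeArgs {
    my ($propertyid, $soldid, $dbw) = @_;
    croak "propertyid is required" unless defined $propertyid;
    return join ',', map { defined $_ ? $_ : 'undef' } ($propertyid, $soldid, $dbw);
}

print DescribeArgs(1, 2, 3), "\n";   # all three supplied
print DescribeArgs(1), "\n";         # the missing two come through as undef

# Calling with nothing trips the croak; eval catches it so the script carries on.
my $ok    = eval { DescribeArgs(); 1 };
my $error = $ok ? '' : $@;
print "died: $error" unless $ok;
```

croak reports the error from the caller's point of view, which is usually more helpful than die when the mistake is in how the sub was called.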
However, you may have seen some subroutines declared with () on the end in some old code where the writer was a bit mad. In most cases this is an error, and likely introduced after too much time working in JavaScript. What does it do, if not declare arguments?
Parentheses after a subroutine name declare the subroutine's prototype.
Prototypes in Perl 5 are old and generally discouraged. They are a short-hand notation for telling the compiler how many arguments a particular function takes and what kind each one is. For our example above, if we wished to explicitly state that our sub took exactly three parameters and they were all scalar values, we would write
sub ProcessStatementOfInformation ($$$) {
  my ($propertyid, $soldid, $dbw) = @_;
and Perl will now emit compile-time errors if we call it with fewer than three or more than three arguments. Generally, prototypes shouldn't be used. They have a few interesting side-effects if you're writing a module that extends Perl's syntax, but for application code they are unnecessary.
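To see the prototype check fire without killing the whole script, here is a sketch that compiles the bad call inside a string eval. AddThree is a made-up sub, and the string eval is only there to capture the error for demonstration.

```perl
use strict;
use warnings;

# Hypothetical sub with a three-scalar prototype.
sub AddThree ($$$) {
    my ($x, $y, $z) = @_;
    return $x + $y + $z;
}

print AddThree(1, 2, 3), "\n";    # matches the prototype

# A two-argument call fails to compile, so compile it at runtime
# with a string eval and keep the error message.
my $value = eval 'AddThree(1, 2)';
my $error = $@;
print "compile error: $error" if $error;
```

The check only applies to calls compiled after the sub has been declared, and only to plain calls without the & sigil.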
In the case of the accidental (), any calls to the function that supply parameters will fail to compile - and they probably do supply them. In some old code where the writer was a bit mad, a prevalent work-around is the ampersand calling convention. The & sigil in Perl represents a code reference, and allows one to do neat things like store a function in a variable, or return a function from some other function, and then call it. In the case of variables, you might write
my $coderef = GenerateFunction(with => "stuff", and => "things");
&$coderef(42);
In the case of some old code where the writer was a bit mad, if there were an accidental () after the sub ProcessStatementOfInformation, Perl would error out wherever it was called – unless the & sigil were used to call it as e.g. &ProcessStatementOfInformation($propid, $soldid, $dbhwrite);. This is saying to Perl "I know the symbol ProcessStatementOfInformation is a function, please call it like one" and in doing so, ignore any compile-time checks about the faulty prototype. The better solution is to remove the () after the sub name.
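A minimal sketch of both behaviours side by side, using an invented CountArgs sub with an accidental empty prototype:

```perl
use strict;
use warnings;

# Hypothetical sub with an accidental empty prototype, i.e. "takes no arguments".
sub CountArgs () {
    return scalar @_;
}

# A plain call with arguments does not compile; the string eval captures the error.
my $plain       = eval 'CountArgs(1, 2, 3)';
my $plain_error = $@;
print "plain call failed: $plain_error" if $plain_error;

# The & form skips the prototype check, so the arguments get through.
my $amp = &CountArgs(1, 2, 3);
print "ampersand call saw $amp arguments\n";
```

One more reason to prefer removing the () over sprinkling & everywhere: written without parentheses, as in &CountArgs; the call passes along the caller's current @_, which is rarely what anyone intended.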

There's more than one way to count the number of items in a list

One quirky bit of syntax you might have seen around is $#, such as
$totalhomeopens=$#HomeOpens+1;
This undoubtedly looks a bit confusing at first, because # is normally for comments. For a given @list, the expression $#list returns the index of the last item in the list. This is why you'll often see 1 being added, as Perl (being a sensible language... looking at you, Lua) indexes array elements from 0. So you'll see loop code doing for ($i=0;$i<=$#options;$i++) and it's possible to access the last element of an array with something like $ARGV[$#ARGV].
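Here is a short sketch of $# in action. The @options contents are made up for illustration.

```perl
use strict;
use warnings;

my @options = qw/alpha beta gamma/;   # illustrative data

my $last_index = $#options;           # index of the last element
my $count      = $#options + 1;       # hence the +1 to get a count

# C-style loop over every index.
my @seen;
for (my $i = 0; $i <= $#options; $i++) {
    push @seen, "$i=$options[$i]";
}

print "last index $last_index, count $count\n";
print join(' ', @seen), "\n";
print "last element: $options[$#options]\n";
```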
However, you can also simply use the name of the array in scalar context to get the total number of items in it.
$ my @apples = qw/Red_Delicious Golden_Delicious Granny_Smith Fuji Gala/;
$ say "I have ", scalar @apples, " apples.";
I have 5 apples.
$ say "If I had one more, I'd have ", @apples + 1, " apples.";
If I had one more, I'd have 6 apples.
$ say "The last apple in my list is a $apples[-1].";
The last apple in my list is a Gala.
Check out the nice trick at the end there - rather than having to figure out the index of the last element ourselves, we can just give Perl a negative value, and it will count back from the end of the list.
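The same ideas as a standalone script, as a sketch. Outside of re.pl you need use feature 'say' (or a use v5.10 or later line) for say to work, and the variable names here are just for illustration.

```perl
use strict;
use warnings;
use feature 'say';

my @apples = qw/Red_Delicious Golden_Delicious Granny_Smith Fuji Gala/;

my $count       = @apples;       # scalar assignment gives the count
my ($first)     = @apples;       # list assignment gives the first element instead
my $second_last = $apples[-2];   # negative indices count back from the end

say "count: $count";
say "first: $first";
say "second to last: $second_last";
```

Watch the parentheses on the left-hand side: my $count and my ($count) look similar but put the array in different contexts.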
