Don’t blame your misunderstanding of precedence

Ovid blames Perl for a newbie-level mistake in a Twitter post.
Marcel goes on to misread the problem as one of context. Ovid posts a fix that uses parentheses to solve the precedence problem.

This is chiefly a problem of code reading skills. Here’s the code:

$ perl -MData::Dumper -e 'sub boo { 1,2,3 } my @x = (boo()||5,8,7); print Dumper \@x'

The output is:

$VAR1 = [
          3,
          8,
          7
        ];

Ovid doesn’t say much about what he thinks should happen, but it shouldn’t be hard for any person who actually knows Perl (and programming), to figure it out. If you understand precedence, you know which order things will occur.

First, you have to break it down into what happens when. Many experienced people often skip this step because they think their experience should allow them to skip the basics. They try to take in complex expressions all at once and figure out what they do. That’s where people get confused. They ignore the few simple rules of code reading.

Perl figures out the right side of the assignment operator first, so you have to figure out this expression, which is in list context because of the assignment to @x, an array:

(boo()||5,8,7)

In list context, the comma operator separates the elements of the list. There are only a few operators lower in precedence, and || isn’t one of them. The list is then going to be the results of these three expressions:

boo()||5
8
7

This is not what some people (maybe Ovid) did not expect because they parsed it as the choice between two lists:

boo()
(5,8,7)

Most of the misunderstanding comes from thinking the comma is just a way to separate items instead of thinking about it as an operator that has precedence like other operators. Change the comma to a different, unfamiliar character, such as ‡, and show it to a programmer who understands precedence and I assert the confusion disappears because the programmer doesn’t insert his misconceptions about the comma into reading the code:

bar() || 5 ‡ 8 ‡ 7

Once you understand the precedence, the last two expressions of the list, 8 and 7, are easy. The first one is easy too. If you looked at it by itself you shouldn’t have a problem with it. You call boo() in scalar context because || is a scalar argument. If it returns a true value, use it. Otherwise, use 5. That’s easy enough too. You might recognize the process better if you saw it with a different scalar operator, such as +:

bar() + 5

The definition for the subroutine is just boo { 1,2,3 }. In scalar context, that is the final element in the series because that’s what the comma operator in scalar context does. Two things are lacking here in most people’s analysis: the comma is an operator and it responds to context. This is almost excusable as a gotcha, but it’s such a well known gotcha that a practicing Perl programmer should know it. This isn’t some obscure corner of the language. It’s the very basics of how the language works.

The perlop documentation is quite clear. The comma operator in scalar context evaluates its left expression and discards it (so, there may be side effects), then evaluates its rightmost element, in this case three, and returns it. There’s even an example in the perlfaq4’s “What’s the difference between a list and an array” that tells you exactly that, and it does that because so many people make this same mistake.

Experienced programmers often charge ahead where they’d do well to read the documentation. It doesn’t matter what you think it should do; it only matters what it does. Intuition is fine when it works out, but it’s not an excuse for a lack of knowledge or education. Intuition is a fool’s game; it only has hope when everyone thinks the same, and nobody does.

Leave a comment

0 Comments.

Leave a Reply

You must be logged in to post a comment.

7ads6x98y