Manipulating Arrays
Now that you've learned the basic rules for building arrays, it's time to learn some tools to help you manipulate those arrays to perform useful tasks.
Stepping Through an Array
In Hour 3, "Controlling the Program's Flow," you learned about making loops with while, for, and other constructs. Many tasks you'll want to perform involve examining each element of an array. This process is called iterating over the array. One way you could do so is to use a for loop, as follows:
@flavors=qw( chocolate vanilla strawberry mint sherbet );
for($index=0; $index<@flavors; $index++) {
    print "My favorite flavor is $flavors[$index] and...";
}
print "many others.\n";
The first line initializes the array with ice cream flavors, using the qw operator for clarity. (If I had included a two-word flavor such as Rocky Road, I would have needed a regular, single-quoted list.) The second line does most of the work. $index is initialized to 0 and incremented by 1 for as long as it remains less than the number of elements in @flavors. Because it is being compared to the scalar $index, @flavors is evaluated in a scalar context, so it evaluates to 5, the number of elements in @flavors.
The preceding example seems like an awful lot of work for just iterating over an array. Usually in Perl, if something seems like an awful lot of work, you can find an easier way to do it. This is no exception. Perl has another loop statement that wasn't mentioned in Hour 3, called foreach. The foreach statement sets an index variable, called an iterator, equal to each element of a list in turn. Consider this example:
foreach $cone (@flavors) {
    print "I'd like a cone of $cone\n";
}
Here, the variable $cone is set to each value in @flavors in turn. The body of the loop runs once for each value, printing the message with that flavor filled in.
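The two loop styles described so far can be compared side by side. The following is only a sketch: the use strict and use warnings lines, the my declarations, and the names $count, $by_index, and $by_element are my own additions for illustration and are not part of the earlier listings.

```perl
use strict;
use warnings;

my @flavors = qw( chocolate vanilla strawberry mint sherbet );

# An array used where a scalar is expected yields its element count.
my $count = @flavors;
print "There are $count flavors.\n";    # There are 5 flavors.

# Index-based loop: $index runs from 0 up to $count - 1.
my $by_index = '';
for (my $index = 0; $index < @flavors; $index++) {
    $by_index = $by_index . "$index=$flavors[$index] ";
}

# foreach loop: no index bookkeeping, one element per pass.
my $by_element = '';
foreach my $cone (@flavors) {
    $by_element = $by_element . "$cone ";
}

print "$by_index\n";
print "$by_element\n";
```

The index-based form is still useful when the position itself matters, as in the first of the two printed lines; otherwise foreach is usually the shorter choice.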
In a foreach loop, the iterator isn't just a variable that is assigned the value of each element in the list; it actually refers to that list element itself. If you modify the iterator, the corresponding element in the list will remain modified after the loop is done. Check out this example:
foreach $flavor (@flavors) {
    print "I'd like a bowl of $flavor ice cream, please.\n";
    $flavor = "$flavor (I've had some)";
}
print "The available flavors are\n";
foreach $flavor (@flavors) {
    print "$flavor\n";
}
In the first loop, the second line prints "I'd like a bowl of chocolate ice cream, please." continuing with vanilla, strawberry, and so on. The third line, however, modifies $flavor, and therefore the corresponding element of @flavors, by appending (I've had some) on the end. After the first loop finishes, the second loop lists the flavors, showing for each one the fact that I've had some.
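The aliasing behavior is easier to see with numbers. This is a small sketch under my own assumptions: the arrays @prices and @sizes and the variable $copy are hypothetical names, and use strict with my declarations is an addition to the style used so far.

```perl
use strict;
use warnings;

my @prices = (2, 3, 5);

# $price is an alias for each element, so assigning to it changes @prices.
foreach my $price (@prices) {
    $price = $price * 2;
}
print "Doubled: @prices\n";     # Doubled: 4 6 10

# Copying the element into a separate variable first leaves the array alone.
my @sizes = (1, 2, 3);
foreach my $size (@sizes) {
    my $copy = $size;
    $copy = $copy * 10;
}
print "Unchanged: @sizes\n";    # Unchanged: 1 2 3
```

If you want to work with the values without touching the array, copying each element first, as in the second loop, is one simple way to do it.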
NOTE
In Perl, the foreach and for loop statements are actually synonyms; they can be used interchangeably. For clarity, throughout this book, you'll find that I use the foreach() loop statement to iterate over arrays and the for() loop statement for the kind of for loops presented in Hour 3, which did not involve arrays. Keep in mind that they are interchangeable.
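To see that the two keywords really are interchangeable, here is a brief sketch that deliberately swaps them. The variable names $list_style and $counting_style are hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;

my @flavors = qw( chocolate vanilla strawberry );

# The keyword "for" driving a list-style loop.
my $list_style = '';
for my $flavor (@flavors) {
    $list_style = $list_style . "$flavor ";
}

# The keyword "foreach" driving a counting loop.
my $counting_style = '';
foreach (my $index = 0; $index < @flavors; $index++) {
    $counting_style = $counting_style . "$flavors[$index] ";
}

print "Same result\n" if $list_style eq $counting_style;    # prints "Same result"
```

Perl decides which kind of loop it is from what appears inside the parentheses, not from the keyword.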
Converting Between Arrays and Scalars
Perl doesn't have one general rule about converting between scalars and arrays. Rather, Perl provides many functions and operators for converting between the two types.
One method to convert a scalar into an array is the split function. The split function takes a pattern and a scalar, uses the pattern to split the scalar apart, and returns a list of the pieces. The first argument is the pattern (here surrounded by slashes), and the second argument is the scalar to split apart:
@words=split(/ /, "The quick brown fox");
After you run this code, @words contains each of the words The, quick, brown, and fox, without the spaces. If you don't specify a second argument, the variable $_ is split. If you don't specify a pattern or a string, whitespace is used to split apart the variable $_. One special pattern, // (the null pattern), splits apart the scalar into individual characters, as shown here:
while(<STDIN>) {
    ($firstchar)=split(//, $_);
    print "The first character was $firstchar\n";
}
The first line reads from the terminal one line at a time, setting $_ equal to that line. The second line splits $_ apart using the null pattern. The split function returns a list of each character from the line in $_. That list is assigned to the list on the left side, and the first element of the list is assigned to $firstchar; the rest are discarded.
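The default behaviors of split described above can be sketched as follows. The sample strings and the array names @colors, @words, and @pieces are my own assumptions for illustration.

```perl
use strict;
use warnings;

# With a pattern but no string, split works on $_.
$_ = "red:green:blue";
my @colors = split(/:/);
print "Colors: @colors\n";             # Colors: red green blue

# With no arguments at all, split breaks $_ apart on whitespace,
# skipping leading whitespace and treating runs of spaces as one separator.
$_ = "  The quick   brown fox";
my @words = split;
my $word_count = @words;
print "Found $word_count words\n";     # Found 4 words

# A single-space pattern is stricter: every space separates two fields,
# so extra spaces produce empty fields.
my @pieces = split(/ /, $_);
my $piece_count = @pieces;
print "Found $piece_count pieces\n";   # Found 8 pieces
```

The difference between the last two calls is worth remembering when your input might contain more than one space between words.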
NOTE
The patterns used by split are actually regular expressions. Regular expressions are a complex pattern-matching language introduced in Hour 6, "Pattern Matching." For now, the examples will use simple patterns such as spaces, colons, commas, and such. After you've learned about regular expressions, I will give examples that use more complex patterns to pull apart scalars with split.
This method of splitting a scalar into a list of scalar variables is common in Perl. When you're splitting apart a scalar in which each piece is a distinct element (such as fields in a record), it's easier to figure out which piece is what when you name each piece as it's split. Observe the following:
@Music=('White Album,Beatles', 'Graceland,Paul Simon',
        'A Boy Named Goo,Goo Goo Dolls');
foreach $record (@Music) {
    ($record_name, $artist)=split(',', $record);
}
When you split directly into a list with named scalars, you can clearly see which fields represent what. The first field is a record name, and the second field is the artist. Had the code split into an array, the distinction between fields might not have been as clear.
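As a sketch of how the named fields might then be used, the following variation builds a short report from each record. The $report variable and the wording of the output are hypothetical additions, as are the my declarations.

```perl
use strict;
use warnings;

my @Music = ('White Album,Beatles', 'Graceland,Paul Simon');

my $report = '';
foreach my $record (@Music) {
    # Each field gets a descriptive name as it is split out.
    my ($record_name, $artist) = split(',', $record);
    $report = $report . "$record_name is by $artist\n";
}
print $report;
# White Album is by Beatles
# Graceland is by Paul Simon
```

Compare this with the alternative of splitting into an array and writing $fields[0] and $fields[1]; the named version documents itself.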
To create scalars out of arrays (the reverse of split), you can use the Perl join function. join takes a string and a list, joins the elements of the list together using the string as a separator, and then returns the resulting string. Consider this example:
$numbers=join(',', (1..10));
This example assigns the string 1,2,3,4,5,6,7,8,9,10 to $numbers.
In Perl the output (return value) of one function can be used as an input value (argument) in another function. You can use split and join to pull a string apart and put it back together all at the same time, as seen here:
$message="Elvis was here";
print "The string \"$message\" consists of: ",
    join('-', split(//, $message));
In this example, the $message is split into a list by split. That list is used by the join function and put back together with dashes. The result is the following message:
The string "Elvis was here" consists of: E-l-v-i-s- -w-a-s- -h-e-r-e
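A common practical use of this pairing is changing the separator in a record. Here is a minimal sketch, assuming a comma-separated record like the ones used earlier; the variable names and the choice of a colon as the new separator are mine.

```perl
use strict;
use warnings;

my $record = "Graceland,Paul Simon";

# Pull the record apart on commas, then glue it back together with colons.
my $colon_record = join(':', split(/,/, $record));
print "$colon_record\n";    # Graceland:Paul Simon

# Going the other way recovers the original string.
my $round_trip = join(',', split(/:/, $colon_record));
print "Round trip worked\n" if $round_trip eq $record;    # prints "Round trip worked"
```

This only round-trips cleanly when the fields themselves never contain either separator character.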
Reordering Your Array
When you're building arrays, often you might want them to come out in a different order than you built them. For example, if your Perl program reads a list of customers in from a file, printing that customer list in alphabetical order would be reasonable. For sorting data, Perl provides the sort function. The sort function takes as its argument a list and sorts it in (roughly speaking) alphabetical order; the function then returns a new list in sorted order. The original array remains untouched, as you can see in this example:
@Chiefs=qw(Clinton Bush Reagan Carter Ford Nixon);
print join(' ', sort @Chiefs), "\n";
print join(' ', @Chiefs), "\n";
This example prints the sorted list of presidents (Bush Carter Clinton Ford Nixon Reagan) and then prints the list again in its original order.
Be forewarned that the default sort order is ASCII order. This means that all words that start with uppercase characters sort before words that begin with lowercase letters. Numbers sorted in ASCII order do not come out the way you would expect, because they are compared character by character as strings rather than by value. For example, 11 sorts after 100. In cases like this, you need to sort by something other than the default order.
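Both surprises are easy to reproduce. In this sketch the sample data and variable names are hypothetical, chosen only to trigger the behavior just described.

```perl
use strict;
use warnings;

# Uppercase letters come before lowercase letters in ASCII.
my @words = qw(banana Cherry apple);
my @sorted_words = sort @words;
print "@sorted_words\n";      # Cherry apple banana

# Numbers are compared as strings, one character at a time.
my @numbers = (100, 11, 9, 2);
my @sorted_numbers = sort @numbers;
print "@sorted_numbers\n";    # 100 11 2 9
```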
The sort function allows you to sort in whatever order you want by supplying a block of code (or a subroutine name, discussed in Hour 8, "Functions") ahead of the list to be sorted. Inside the block (or subroutine), two variables, $a and $b, are set to two elements of the list. The block's task is to return -1, 0, or 1 depending on whether $a is less than $b, equal to $b, or greater than $b, respectively. The following is an example of the hard way to do a numeric sort, assuming that @numbers is full of numeric values:
@sorted=sort { return(1) if ($a>$b);
               return(0) if ($a==$b);
               return(-1) if ($a<$b); } @numbers;
The preceding example certainly sorts @numbers numerically. But the code looks far too complicated for such a common task. As you might suspect for anything this cumbersome, Perl has a shortcut: the "spaceship" operator, <=>.
The spaceship operator gets its name because it somewhat resembles a flying saucer, seen from the side. It returns -1 if its left operand is less than the right, 0 if the two operands are equal, and 1 if the left operand is greater than the right:
@sorted=sort { $a<=>$b; } @numbers;
This code is much cleaner, easier to look at, and more straightforward. You should use the spaceship operator only to compare numeric values.
To compare alphabetic strings, use the cmp operator, which works exactly the same way. You can put together more complex sorting arrangements by simply making a more sophisticated sort routine. Section 4 of the Perl Frequently Asked Questions (FAQ) has some more sophisticated examples of this if you need them.
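A few variations on the sort block, offered as a sketch rather than a complete survey. The sample data is hypothetical, and the use of the built-in lc function to ignore case is my own choice of technique.

```perl
use strict;
use warnings;

my @names = qw(banana Cherry apple);

# cmp spelled out explicitly gives the same result as the default sort.
my @alpha = sort { $a cmp $b } @names;
print "@alpha\n";         # Cherry apple banana

# Lowercasing both sides before comparing ignores case.
my @nocase = sort { lc($a) cmp lc($b) } @names;
print "@nocase\n";        # apple banana Cherry

# Swapping $a and $b reverses the order: largest number first.
my @numbers = (2, 100, 9, 11);
my @descending = sort { $b <=> $a } @numbers;
print "@descending\n";    # 100 11 9 2
```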
The final function for this hour is an easy function, reverse. The reverse function, when given a scalar value in a scalar context, reverses the string's characters and returns the reversed string. The call reverse("Perl") in a scalar context, for example, returns lreP. When given a list in a list context, reverse returns the elements of the list in reverse order, as in this example:
@lines=qw(I do not like green eggs and ham);
print join(' ', reverse @lines);
This snippet prints ham and eggs green like not do I. To continue this playfulness and really show off the function-stacking capability, you can add more nonsense to the mixture:
print join(' ', reverse sort @lines);
The sort is run first, producing the Yoda-esque list (I,and,do,eggs,green,ham,like,not). That list is reversed and passed to join for joining together with a space. The result is not like ham green eggs do and I. I couldn't agree more.
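Because reverse behaves differently depending on context, it is worth seeing the two cases next to each other. This sketch uses hypothetical variable names, and it relies on the built-in scalar function to force a scalar context inside print.

```perl
use strict;
use warnings;

# Assigning to a scalar gives reverse a scalar context: characters are reversed.
my $backward = reverse("Perl");
print "$backward\n";                    # lreP

# Assigning to an array gives a list context: a one-element list is
# "reversed", which leaves it unchanged.
my @list = reverse("Perl");
print "$list[0]\n";                     # Perl

# print supplies a list context, so the string comes out unreversed
# unless scalar() is used to ask for the other behavior.
print reverse("Perl"), "\n";            # Perl
print scalar(reverse("Perl")), "\n";    # lreP
```

The third print is a classic source of confusion: if reverse seems to do nothing, check which context it is being called in.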