Stacking Perl's Building Blocks: Lists and Arrays
- Putting Things into Lists and Arrays
- Getting Elements Out of an Array
- Manipulating Arrays
- Exercise: Playing a Little Game
- Summary
- Q&A
- Workshop
Scalars are Perl's singular nouns. They can represent any one thing: a word, a record, a document, a line of text, or a character. Often, though, you need to talk about collections of things: many words, a few records, two documents, fifty lines of text, or a dozen characters.
When you need to talk about many things in Perl, you use list data. You can represent list data in three different ways: by using lists, arrays, and hashes.
Lists are the simplest representation of list data. A list is simply a group of scalars. Sometimes they're written with a set of parentheses encasing the scalars, which are separated by commas. For example, (2, 5, $a, "Bob") is a list that contains two numbers, a scalar variable $a, and the string "Bob". Each scalar in a list is called a list element. In keeping with the philosophy of Least Surprise (see Hour 2, "Perl's Building Blocks: Numbers and Strings"), Perl's lists can contain as many scalar elements as you like. Because scalars can also be arbitrarily large, a list can hold quite a lot of data.
To store list data so that you can refer to it throughout your program, you need an array variable. Array variables are represented in Perl with an "at" sign (@) as the type identifier followed by a valid variable name (as discussed in Hour 2, "Perl's Building Blocks: Numbers and Strings"). For example, @foo is a valid array variable in Perl. You can have the same name for an array variable as a scalar variable; for example, $names and @names refer to different things: $names to a scalar variable, and @names to an array. The two variables have nothing to do with each other.
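The independence of a scalar and an array that share a name can be checked with a short sketch. The values stored here are assumptions chosen only for illustration.

```perl
use strict;
use warnings;

# $names and @names are two unrelated variables that merely share a name.
my $names = 'a single string';             # scalar variable
my @names = ('Greg', 'Peter', 'Bobby');    # array variable

$names = 'changed';    # assigning to the scalar leaves the array untouched

print "scalar: $names\n";    # scalar: changed
print "array: @names\n";     # array: Greg Peter Bobby
```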
Individual items in an array are called array elements. Individual array elements are referred to by their position within the array, called an index. That is, we can refer to the third array element of the array @foo, the fifth array element of the array @names, and so on.
The other list type, a hash, is similar to an array. Hashes will be discussed further in Hour 7, "Hashes."
In this hour you will learn
- How to fill and empty arrays
- How to examine arrays element by element
- How to sort and print arrays
- How to split scalars into arrays and join arrays back into scalars
Putting Things into Lists and Arrays
Putting things into a literal list is easy. As you just saw, the syntax for a literal list is a set of parentheses enclosing scalar values. The following is an example:
(5, 'apple', $x, 3.14159)
This example creates a four-element list containing the number 5, the string 'apple', whatever happens to be in the scalar variable $x, and pi.
If the list contains only simple strings, and putting single quotation marks around each string gets to be too much for you, Perl provides a shortcut: the qw operator. An example of qw follows:
qw( apples oranges 45.6 $x )
This example creates a four-element list. Each element of the list is separated from the others by whitespace (spaces, tabs, or newlines). If you have list elements that have embedded whitespace, you cannot use the qw operator. This code works just as though you had written the following:
('apples', 'oranges', '45.6', '$x')
Notice that the $x is encased in single quotation marks. The qw operator does not do variable interpolation on elements that look like variables; they are treated as though you wanted them that way literally. So '$x' is not converted to whatever the value of the scalar variable $x is; it's left alone as a string containing a dollar sign and the letter x.
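The difference between qw and an ordinary literal list can be seen side by side in the following sketch. The value given to $x is hypothetical and serves only to make the contrast visible.

```perl
use strict;
use warnings;

my $x = 'pears';    # hypothetical value, chosen only for this illustration

my @quoted   = qw( apples oranges 45.6 $x );        # $x stays a literal string
my @explicit = ('apples', 'oranges', '45.6', $x);   # $x is replaced by its value

print "@quoted\n";      # apples oranges 45.6 $x
print "@explicit\n";    # apples oranges 45.6 pears
```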
Perl also has a useful operator that works in literal lists; it's called the range operator. The range operator is designated by a pair of periods (..). The following is an example of this operator:
(1..10)
The range operator takes the left operand (the 1) and the right operand (the 10) and constructs a list consisting of all the numbers between 1 and 10, inclusive. If you need several ranges in a list, you can simply use multiple operators:
(1..10, 20..30);
The preceding example creates a list of 21 elements: 1 through 10 and 20 through 30. Giving the range operator a right operand less than the left, such as (10..1), produces an empty list.
The range operator works on strings as well as numbers. The range ('a'..'z') generates a list of all 26 lowercase letters. The range ('aa'..'zz') generates a much larger list of 676 letter pairs starting with aa, ab, ac, ad and ending with zx, zy, zz.
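The element counts quoted above can be confirmed with a small sketch. The array names are assumptions, and scalar(@array) is used here only as a convenient way to count elements.

```perl
use strict;
use warnings;

my @two_ranges = (1..10, 20..30);    # 10 numbers followed by 11 numbers
my @backwards  = (10..1);            # right operand is smaller: empty list
my @letters    = ('a'..'z');         # string endpoints are quoted under "use strict"
my @pairs      = ('aa'..'zz');       # every two-letter combination

print "two ranges: ", scalar(@two_ranges), " elements\n";    # 21
print "backwards:  ", scalar(@backwards),  " elements\n";    # 0
print "letters:    ", scalar(@letters),    " elements\n";    # 26
print "pairs:      ", scalar(@pairs),      " elements\n";    # 676
```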
Arrays
Literal lists are usually used to initialize some other structure: an array or a hash. To create an array in Perl, you can simply put something into it. With Perl, unlike other languages, you don't have to tell it ahead of time that you're creating an array or how large the array is going to be. To create a new array and populate it with a list of items, you could do the following:
@boys=qw( Greg Peter Bobby );
This example, called an array assignment, uses the array assignment operator, the equals sign, just as in a scalar assignment. After that code runs, the array @boys contains three elements: Greg, Peter, and Bobby. Notice also that the code uses the qw operator; using this operator saves you from having to type six quotation marks and two commas.
Array assignments can also involve other arrays or even empty lists, as shown in the following examples:
@copy=@original;
@clean=();
Here, all the elements of @original are copied into a new array called @copy. If @copy already had elements before the assignment, they are now lost. After the second statement is executed, @clean is empty. Assigning an empty list (or an empty array) to an array variable removes all the elements from the array.
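As a minimal sketch of both behaviors, the following code overwrites an array that already holds elements and then empties the source array. The starting contents are assumptions made for illustration.

```perl
use strict;
use warnings;

my @original = qw( Greg Peter Bobby );
my @copy     = qw( old contents );    # these two elements are about to be lost

@copy     = @original;    # @copy now holds its own copy of the three names
@original = ();           # emptying the original does not affect the copy

print "copy: @copy\n";                                       # copy: Greg Peter Bobby
print "original has ", scalar(@original), " elements\n";    # original has 0 elements
```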
If a literal list contains other lists, arrays, or hashes, these lists are all flattened into one large list. Observe this snippet of code:
@boys=qw( Greg Peter Bobby );
@girls=qw( Marcia Jan Cindy );
@kids=(@girls, @boys);
@family=(@kids, ('Mike', 'Carol'), 'Alice');
The list (@girls, @boys) is flattened by Perl to a simple list containing first all the girls' names and then all the boys' names before the values are assigned to @kids. On the next line, the array @kids and the list ('Mike', 'Carol') are flattened, together with 'Alice', into one long list; then that list is assigned to @family. The original structures of @boys, @girls, @kids, and the list ('Mike', 'Carol') are not preserved in @family, only the individual elements from Marcia through Alice. In other words, the preceding snippet for building @family is equivalent to this assignment:
@family=qw( Marcia Jan Cindy Greg Peter Bobby Mike Carol Alice );
The left side of an array assignment can be a list if it contains only variable names. The array assignment initializes the variables on that list. Consider this example:
($a, $b, $c)=qw(apples oranges bananas);
Here, $a is initialized to 'apples', $b to 'oranges', and $c to 'bananas'.
If the list on the left contains an array, that array receives all the remaining values from the right side, no matter where it is in the list, because an array can contain an indefinite number of elements. Consider an assignment whose left side is the list ($a, @fruit, $c) and whose right side is a list of fruits that begins with 'peaches'.
In this example, $a is set to 'peaches'. The remaining fruits in the list on the right are assigned to @fruit on the left. No elements are left for $c to receive a value (because the array on the left side of an assignment absorbs all the remaining values from the right), so $c is set to undef.
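An assignment of this shape might look like the following sketch. Only 'peaches' comes from the description above; the other fruit names are assumed placeholders. The scalars are named $first and $last rather than $a and $c because $a and $b are reserved for Perl's sort function and are best not declared with my.

```perl
use strict;
use warnings;

# Only 'peaches' is taken from the text; the remaining fruits are assumed.
my ($first, @fruit, $last) = qw( peaches mangoes apples bananas );

print "first: $first\n";    # first: peaches
print "fruit: @fruit\n";    # fruit: mangoes apples bananas
print "last:  ", (defined $last ? $last : 'undef'), "\n";    # last:  undef
```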
It's also important to note that if the left side contains more variables than the right side has elements, the leftover variables receive the value undef. If the right side has more elements than the left side has variables, the extra elements on the right are simply ignored. The following figure shows another example to help you understand that concept.
In the first line, $t, $u, and $v all receive a value from the right side. The extra right-side element ('quail') is simply not used for this expression. In the second line, $a, $b, and $c all receive a value from the right. $d, however, has nothing to get from the right ($c takes the last value, 'gopher'), so $d is set to undef.
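The two cases can be sketched as follows. Only 'quail' and 'gopher' appear in the description above; the other values are assumed placeholders. The second line uses $w, $x, $y, and $z in place of $a through $d, because $a and $b are reserved for sort.

```perl
use strict;
use warnings;

# More values than variables: the extra element 'quail' is ignored.
my ($t, $u, $v) = qw( partridge robin cardinal quail );

# More variables than values: $z has nothing to receive and stays undef.
my ($w, $x, $y, $z) = qw( squirrel woodchuck gopher );

print "t=$t u=$u v=$v\n";    # t=partridge u=robin v=cardinal
print "w=$w x=$x y=$y\n";    # w=squirrel x=woodchuck y=gopher
print "z is ", (defined $z ? 'defined' : 'undef'), "\n";    # z is undef
```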