Working with Variables in Shell Scripts
In many ways, shell programming is similar to woodworking. A woodworking project requires a design for the project and its elements along with the right tools. In shell programming, the design is provided by the programmer and the tools are utilities or commands provided by UNIX and the shell. There are simple commands such as ls and cd. There are also commands such as awk and sed, which are the power tools in UNIX.
The simple commands are easy to learn. You probably know how to use most of them. The power tools take longer to learn, but after you have mastered them, you can tackle almost any problem. This book covers both the simple tools and the power tools, with the main focus on the most powerful tool in UNIX, the shell.
The four chapters you will be reading are in the later part of the book. They provide you with a broad overview of the power of the shell and how its features can be harnessed to solve complex problems. The first two chapters show you how to use variables and functions, which can greatly simplify your scripts and make them easier to maintain and extend. The third and fourth chapters cover debugging and shell programming FAQs. They will help you to find and fix problems in your scripts.
—Sriranga Veeraraghavan
Variables are words that hold a value. The value can be any text string. The shell enables you to create, assign, and delete variables. Although the shell manages some variables, it is mostly up to the programmer to manage variables in shell scripts. By using variables, you can make your scripts flexible and maintainable.
In this chapter, we will examine the following topics:
- Creating variables
- Accessing variables
- Array variables
- Deleting variables
- Environment variables
Working with Variables
Two types of variables can be used in shell programming:
Scalar variables
Array variables
Scalar variables can hold only one value at a time. Array variables can hold multiple values. This section explores the use of both types of variables.
Scalar Variables
Scalar variables are defined as follows:
name=value
Here name is the name of the variable, and value is the value that the variable should hold. For example,
FRUIT=peach
defines the variable FRUIT and assigns it the value peach.
NOTE
Scalar variables are often referred to as name-value pairs, because a variable's name and its value can be thought of as a pair.
Variable Names
The name of a variable can only contain letters (a to z or A to Z), numbers (0 to 9), and the underscore character (_). Furthermore, a variable's name can only start with a letter or an underscore. The following are examples of valid variable names:
_FRUIT
FRUIT_BASKET
TRUST_NO_1
TWO_TIMES_2
but
2_TIMES_2_EQUALS_4
is not a valid variable name. To make this a valid name, we would need to add an underscore at the beginning of its name:
_2_TIMES_2_EQUALS_4
Variable names that start with numbers, such as 1, 2, or 11, are reserved for use by the shell. You can use the value stored in these variables, but you cannot set the value yourself.
The reason you cannot use other characters such as !, *, or - is that these characters have a special meaning for the shell. If you try to create a variable name with one of these special characters, the shell no longer recognizes the line as an assignment. For example, the variable names
FRUIT-BASKET
_2*2
TRUST_NO_1!
are invalid names. The error message generated for the first variable name will be similar to the following:
$ FRUIT-BASKET=apple
/bin/sh: FRUIT-BASKET=apple: not found.
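The naming rules above can be checked mechanically. The following is a minimal sketch, assuming a POSIX-compatible shell; the function name is_valid_name is hypothetical and is not part of any shell.

```shell
# is_valid_name (hypothetical helper): succeeds only if its argument
# follows the naming rules described above.
is_valid_name() {
    case "$1" in
        '')                return 1 ;;  # empty string is not a name
        [!A-Za-z_]*)       return 1 ;;  # must start with a letter or underscore
        *[!A-Za-z0-9_]*)   return 1 ;;  # only letters, numbers, and underscores
        *)                 return 0 ;;
    esac
}

for name in FRUIT_BASKET _2_TIMES_2 2_TIMES_2_EQUALS_4 FRUIT-BASKET; do
    if is_valid_name "$name"; then
        echo "$name: valid"
    else
        echo "$name: invalid"
    fi
done
```

The case patterns reject a name as soon as one rule is broken, so the final branch is reached only by names that satisfy all of them.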
Variable Values
You can store or assign any value you want in a variable. For example,
FRUIT=peach
FRUIT=2apples
FRUIT=apple+pear+kiwi
A common error with variables is assigning values that contain spaces. For example, the following assignment
$ FRUIT=apple orange plum
results in this error message:
sh: orange: not found.
Values that have spaces in them need to be quoted. For example, both of the following are valid assignments:
$ FRUIT="apple orange plum"
$ FRUIT='apple orange plum'
The difference between these two quoting schemes is covered in Chapter 10, "Quoting."
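As a small illustration of the quoting rule, the sketch below stores the same three-word value with both quoting styles and compares the results. The names FRUIT_DQ and FRUIT_SQ are chosen only for this example. Without the quotes, the shell reads FRUIT=apple as an assignment that applies to a single command and then tries to run a command named orange, which explains the error message shown earlier.

```shell
# Both quoting styles keep the spaces as part of the value.
FRUIT_DQ="apple orange plum"
FRUIT_SQ='apple orange plum'

if [ "$FRUIT_DQ" = "$FRUIT_SQ" ]; then
    echo "both assignments hold: $FRUIT_DQ"
fi
```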
Accessing Values
You can access the value stored in a variable by prefixing its name with the dollar sign ($). When the shell sees a $, it performs the following actions:
Reads the next word to determine the name of the variable.
Retrieves the value for the variable. If a value isn't found, the shell uses the empty string "" as the value.
Replaces the $ and the name of the variable with the value of the variable.
This process, known as variable substitution, is covered in greater detail in Chapter 9, "Substitution." The following example demonstrates this process:
$ FRUIT=peach
$ echo $FRUIT
peach
In this example, the shell first determines that the variable FRUIT has been referenced. Next it looks up the value for FRUIT. Finally the string $FRUIT is replaced with peach, the value of FRUIT, which is what the echo command prints.
If you do not use the dollar sign ($), variable substitution is not performed and the name of the variable is used directly. For example,
$ echo FRUIT
FRUIT
simply prints out FRUIT, not the value of the variable FRUIT.
The dollar sign ($) is used only when accessing a variable's value. It should not be used to define a variable or assign a value to a variable. For example, the assignment
$ $FRUIT=apple
generates the following error message
sh: peach=apple: not found
assuming that the value of FRUIT was peach. If the variable FRUIT did not have a value, the error would have been
sh: =apple: not found
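The three cases discussed in this section (a name with $, a name without $, and a variable that has no value) can be compared side by side. This is a sketch only; NO_SUCH_FRUIT is a hypothetical name that is unset first so that it is guaranteed to be undefined.

```shell
FRUIT=peach
unset NO_SUCH_FRUIT    # hypothetical name, made undefined for the demonstration

WITH_DOLLAR="I like $FRUIT"           # substitution is performed
WITHOUT_DOLLAR="I like FRUIT"         # no $, so the name is used as-is
UNDEFINED="I like [$NO_SUCH_FRUIT]"   # undefined, so the empty string is used

echo "$WITH_DOLLAR"
echo "$WITHOUT_DOLLAR"
echo "$UNDEFINED"
```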
Array Variables
Arrays are a method for grouping a set of variables together using a single name. Instead of creating a new name for each variable you need, you can use a single array variable to store all the variables.
To understand how arrays work, consider the following example. Say that we are trying to represent the chapters in this book using a set of scalar variables. We could choose the following variable names to represent some of the chapters:
CH01 CH02 CH15 CH07
Each of these variable names has a specific format: the letters CH followed by the chapter number. This format serves as a way of grouping these variables together. An array variable formalizes this grouping by using an array name in conjunction with a number known as an index. The index is used to locate entries or elements in the array.
NOTE
Arrays are not available in the Bourne shell. Arrays first appeared in the Korn Shell, ksh, and were adopted by the Z Shell, zsh. Recent versions (2.0 and newer) of the Bourne Again Shell, bash, include support for arrays, but older versions do not. Several Linux distributions still ship with the older version of bash.
If you are using bash, the following command allows you to determine its version:
$ echo $BASH_VERSION
If the output starts with the string '1.' as follows:
1.14.7(1)
the version of bash you are using does not support arrays. The examples in this section will not work with 1.x versions of bash.
If the output starts with the string '2.' as follows:
2.03.0(1)-release
the version of bash you are using supports arrays. The examples in this section will work with 2.0 and newer versions of bash.
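A script can make the same decision itself. The following sketch assumes that the check on the leading '1.' described above is sufficient; the function name supports_arrays is hypothetical, and it takes the version string as an argument so that it can be tried with the sample values shown in this note.

```shell
# supports_arrays (hypothetical helper): inspects a bash version string.
# Returns 0 if arrays should be available, 1 for a 1.x version,
# and 2 if the string is empty (for example, when the shell is not bash).
supports_arrays() {
    case "$1" in
        '')   return 2 ;;
        1.*)  return 1 ;;
        *)    return 0 ;;
    esac
}

if supports_arrays "$BASH_VERSION"; then
    echo "arrays are available"
else
    echo "arrays are unavailable, or this shell is not bash"
fi
```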
Creating Array Variables
The simplest method of creating an array variable is to assign a value to one of its indices. This is expressed as follows:
name[index]=value
Here name is the name of the array, index is the index of the item in the array that you want to set, and value is the value you want to set for that item. In ksh, index must be an integer between 0 and 1,023. No such restriction is present in bash or zsh. The only restriction is that index must be an integer. It cannot be a floating point or decimal number, such as 10.3, or a string, such as apricot.
As an example, the following commands
$ FRUIT[0]=apple
$ FRUIT[1]=banana
$ FRUIT[2]=orange
set the values of the first three items in the array named FRUIT. You could do the same thing with scalar variables as follows:
$ FRUIT_0=apple
$ FRUIT_1=banana
$ FRUIT_2=orange
Although this works fine for small numbers of items, the array notation is much more efficient for large numbers of items. If you have to write a script using the Bourne shell only, you can use this method for simulating arrays.
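One way to walk through such simulated array items is to build each variable name at run time with eval. This is a sketch under stated assumptions, not the only approach: it uses the $(( )) arithmetic of POSIX shells, which a strict Bourne shell lacks (expr could be used there instead), and the names i, value, and ALL_FRUIT are chosen only for this example.

```shell
FRUIT_0=apple
FRUIT_1=banana
FRUIT_2=orange

i=0
ALL_FRUIT=
while [ "$i" -le 2 ]; do
    # After the first round of expansion, eval sees: value=$FRUIT_0 (then _1, _2)
    eval "value=\$FRUIT_$i"
    # Append the value, adding a separating space only after the first item
    ALL_FRUIT="${ALL_FRUIT:+$ALL_FRUIT }$value"
    i=$((i + 1))
done

echo "$ALL_FRUIT"
```

Because eval executes whatever text it is given, the index should only ever come from the script itself, never from untrusted input.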
In the previous example, the array indices were set in sequence. This is not necessary. For example, the following command sets the value of the item at index 10 in the FRUIT array:
$ FRUIT[10]=plum
The shell does not create a bunch of blank array items to fill in the space between index 2 and index 10; it just keeps track of those array indices that contain values.
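The sparse behavior can be observed by counting the items that are set. The sketch below assumes bash 2.0 or newer (ksh behaves the same way) and uses the ${#name[@]} form, which expands to the number of items in an array; that form is not covered elsewhere in this section.

```shell
unset FRUIT            # start from a clean slate
FRUIT[0]=apple
FRUIT[1]=banana
FRUIT[2]=orange
FRUIT[10]=plum

# Only the four assigned items are counted, not eleven.
COUNT=${#FRUIT[@]}
echo "FRUIT holds $COUNT items"
echo "item 10 is ${FRUIT[10]}, item 5 is [${FRUIT[5]}]"
```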
If an array variable with the same name as a scalar variable is defined, the value of the scalar variable becomes the value of the element of the array at index 0. For example, if the following commands are executed
$ FRUIT=apple
$ FRUIT[1]=peach
the zeroth element of FRUIT has the value apple. At this point, any accesses to the scalar variable FRUIT are treated as an access to the array item FRUIT[0].
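The conversion from scalar to array can be verified directly. This sketch assumes bash 2.0 or newer; zsh numbers its array items differently by default, so the result there may not match. FIRST and SCALAR are names chosen only for this example.

```shell
unset FRUIT
FRUIT=apple        # scalar assignment
FRUIT[1]=peach     # FRUIT is now treated as an array

FIRST=${FRUIT[0]}  # the old scalar value, now at index 0
SCALAR=$FRUIT      # plain access is treated as access to FRUIT[0]

echo "FRUIT[0]=$FIRST FRUIT=$SCALAR FRUIT[1]=${FRUIT[1]}"
```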
The second form of array initialization can be used to set multiple elements at once. The syntax for this form of initialization differs between ksh and bash. In ksh, the syntax is as follows:
set -A name value1 value2 ... valueN
In bash, the syntax is
name=(value1 ... valueN)
Either style can be used in zsh. Regardless of the style, name is the name of the array, and value1 to valueN are the values of the items to be set. When setting multiple elements at once, consecutive array indices, beginning at 0, are used.
For example, the ksh command
$ set -A band derri terry mike gene
or the bash command
$ band=(derri terry mike gene)
is equivalent to the following commands:
$ band[0]=derri
$ band[1]=terry
$ band[2]=mike
$ band[3]=gene
TIP
When setting multiple array elements in bash, you can place an array index before the value:
myarray=([0]=derri [3]=gene [2]=mike [1]=terry)
The array indices don't have to be in order.
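To confirm that the two bash forms produce the same array, the sketch below builds one array with each form and compares them item by item. It assumes bash 2.0 or newer; the variable same is chosen only for this example.

```shell
band=(derri terry mike gene)
myarray=([0]=derri [3]=gene [2]=mike [1]=terry)

same=yes
for i in 0 1 2 3; do
    # Each index must hold the same value in both arrays
    if [ "${band[$i]}" != "${myarray[$i]}" ]; then
        same=no
    fi
done
echo "arrays identical: $same"
```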
Accessing Array Values
An array variable can be accessed as follows:
${name[index]}
Here name is the name of the array, and index is the index of the desired element. For example, if the array FRUIT was initialized as in previous examples, the command
$ echo ${FRUIT[2]}
produces the following output:
orange
You can access all the items in an array in one of the following methods:
${name[*]}
${name[@]}
Here name is the name of the array you are interested in. If the FRUIT array is initialized as in previous examples, the command
$ echo ${FRUIT[*]}
produces the following output:
apple banana orange
If the values of any array items contain spaces, the first form will not preserve them; you will need to use the second form enclosed in double quotes, "${name[@]}". Inside double quotes, the second form expands each array entry to a separate word, so embedded spaces are preserved. For example, define the following array item:
FRUIT[3]="passion fruit"
Assuming that FRUIT is defined as in previous examples, accessing the entire array using the following command
$ echo ${FRUIT[*]}
results in five items, not four:
apple banana orange passion fruit
Commands accessing FRUIT using this form of array access get five values, with passion and fruit treated as separate items. To get only four items, you have to use the following form:
$ echo "${FRUIT[@]}"
The output from this command looks the same as the output of the previous command:
apple banana orange passion fruit
but commands will see only four items because the shell treats each array entry, including passion fruit, as a single quoted item.
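Because echo prints the same text either way, the difference is easier to see by counting the arguments a command receives. The following sketch assumes bash 2.0 or newer and the default value of IFS; count_args is a hypothetical helper. Note that the @ form keeps each item intact only when the expansion is enclosed in double quotes.

```shell
# count_args (hypothetical helper): prints how many arguments it received.
count_args() {
    echo "$#"
}

FRUIT=(apple banana orange "passion fruit")

UNQUOTED=$(count_args ${FRUIT[*]})       # passion and fruit are split apart
QUOTED=$(count_args "${FRUIT[@]}")       # one argument per array item
STAR_QUOTED=$(count_args "${FRUIT[*]}")  # all items joined into one argument

echo "unquoted *: $UNQUOTED, quoted @: $QUOTED, quoted *: $STAR_QUOTED"
```

In a loop, the quoted @ form is therefore the usual choice: for item in "${FRUIT[@]}" visits passion fruit as one item.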
Read-Only Variables
A read-only variable is a variable whose value cannot be changed after it is defined. Once a variable is specified as read-only, there is no way to get rid of it or to modify its value; it and its value persist until the shell exits.
Variables can be marked read-only using the readonly command. Consider the following set of commands:
$ FRUIT=kiwi
$ readonly FRUIT
$ echo $FRUIT
kiwi
$ FRUIT=cantaloupe
The last command results in an error message similar to the following:
/bin/sh: FRUIT: This variable is read only.
As you can see, we can read the value of the variable FRUIT, but we cannot overwrite the value stored in it.
This feature is often used in scripts to make sure that critical variables are not overwritten accidentally.
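In a script, a failed assignment to a read-only variable can stop the script in some shells, so the sketch below makes the attempt inside a subshell, where a failure cannot affect the rest of the script. It assumes a POSIX-compatible shell; FAVORITE_FRUIT and RESULT are names chosen only for this example.

```shell
FAVORITE_FRUIT=kiwi
readonly FAVORITE_FRUIT

# The parentheses run the assignment in a subshell; the error message
# is discarded and only the exit status is examined.
if ( FAVORITE_FRUIT=cantaloupe ) 2>/dev/null; then
    RESULT=changed
else
    RESULT=rejected
fi

echo "assignment was $RESULT; value is still $FAVORITE_FRUIT"
```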
In ksh, bash, and zsh, readonly can be used to mark both array and scalar variables as read-only:
$ FRUITBASKET=(apple orange pear)
$ readonly FRUITBASKET
$ echo ${FRUITBASKET[1]}
orange
$ FRUITBASKET[1]=kiwi
The last command results in an error message similar to the following:
sh: FRUITBASKET[1]: is read only
This example used the bash-style array assignment; if you are using ksh, you will need to change the first command to the following:
$ set -A FRUITBASKET apple orange pear
Unsetting Variables
Unsetting a variable tells the shell to remove the variable from the list of variables that it tracks. This is like asking the shell to forget a piece of information because it is no longer required.
Both scalar and array variables can be unset using the unset command:
unset name
Here name is the name of the variable to unset. For example, the following command unsets the variable FRUIT:
unset FRUIT
The unset command cannot be used to unset variables that have been marked read-only via readonly. There is no way to unset a read-only variable; it persists until the shell exits.
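The effect of unset can be checked with the ${name+word} form of substitution, which expands to word only when the variable is set. This is a minimal sketch assuming a POSIX-compatible shell; SNACK, BEFORE, and AFTER are names chosen only for this example.

```shell
SNACK=peach
BEFORE=${SNACK+set}    # SNACK is defined, so this expands to "set"

unset SNACK
AFTER=${SNACK+set}     # SNACK is gone, so this expands to nothing

echo "before unset: [$BEFORE], after unset: [$AFTER], value: [$SNACK]"
```

This test distinguishes a variable that has been unset from one that merely holds an empty string, which plain $SNACK cannot do.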