Understanding Scope, Recursion, Return Codes, and Data Sharing
Now that you have a basic understanding of the use and operation of functions in shell scripts, let's look at more advanced topics such as scope, recursion, return codes, and data sharing.
Scope
The term scope refers to the region within a program where a variable's value can be accessed. There are two types of scope:
Global scope: If a variable has global scope, its value can be accessed from anywhere within a script. Variables with global scope are referred to as global variables.
Local scope: If a variable has local scope, its value can only be accessed within the function in which it is declared. Variables with local scope are referred to as local variables.
By default all variables, except for the special variables associated with function arguments, have global scope. In ksh, bash, and zsh, variables with local scope can be declared using the typeset command. The typeset command is discussed later in this chapter. This command is not supported in the Bourne shell, so it is not possible to have programmer-defined local variables in scripts that rely strictly on the Bourne shell.
Global Variables
The following script illustrates the behavior of global variables:
#!/bin/sh
pearFunc () {
    pear=2;                                   # set $pear
    echo "In pearFunc(): pear is $pear"       # print out its value
}
pearFunc                                      # call pearFunc
echo "Outside of pearFunc(): pear is $pear"   # print out $pear
First the script defines a function, pearFunc, that sets the value of the global variable $pear (all variables are global by default) and outputs that value. Then the script executes pearFunc. Finally, the script prints the value of $pear outside of the function. The output is
In pearFunc(): pear is 2
Outside of pearFunc(): pear is 2
As you can see from the output, the value assigned to the variable $pear in the function pearFunc is accessible outside of pearFunc.
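Because an assignment inside a function writes to the same global variable that the rest of the script uses, a function can also change a value that was set before it was called. The following minimal sketch illustrates this; the names count and bumpCount are hypothetical and are not part of the original example:

```shell
#!/bin/sh
count=10                          # global variable set by the main script

bumpCount () {
    count=`expr $count + 1`       # modifies the same global variable
}

bumpCount                         # call bumpCount
echo "count is now $count"        # prints: count is now 11
```

This side effect is convenient for passing results back to the caller, but it also means that a function can silently overwrite a variable the main script still needs.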
A common use for global variables is to communicate information from a function to the main script, as illustrated in the following script:
#!/bin/sh
readPass () {
    PASS=""                      # clear password
    echo -n "Enter Password: "   # print the prompt
    stty -echo                   # turn off terminal echo to prevent peeping!
    read PASS                    # read the password
    stty echo                    # restore terminal echo
    echo                         # print out a new line to make output nice
}
readPass
echo "Password is $PASS"
This script uses the readPass function to read in a password from the user. The readPass function reads the password and stores it in the global variable PASS. The script then accesses the password using the variable PASS.
The readPass function is quite simple. It starts by clearing PASS. Then it issues a prompt for the password and deactivates terminal echo using the stty -echo command. Terminal echo is deactivated so that nobody other than the user can see the password as it is typed. Next, you read the password and store its value in PASS by using the read command. Finally, you restore terminal echo using the stty echo command and echo a new line.
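The readPass function assumes that standard input is a terminal. As a hedged sketch, the hypothetical variant below (readPassSafe is not part of the original script) changes the terminal settings only when the test [ -t 0 ] reports that input comes from a terminal, so the same function can also be fed from a file or a here document. It still communicates its result through the global variable PASS:

```shell
#!/bin/sh
# readPassSafe is a hypothetical variant of readPass, shown for illustration only.
readPassSafe () {
    PASS=""                               # clear password
    printf "Enter Password: " >&2         # print the prompt on standard error
    if [ -t 0 ] ; then                    # is standard input a terminal?
        stty -echo                        # turn off terminal echo
        read PASS                         # read the password
        stty echo                         # restore terminal echo
        echo >&2                          # finish the prompt line
    else
        read PASS                         # input is redirected; nothing to hide
    fi
}

# Demonstration with redirected input, so no terminal is required
readPassSafe <<EOF
secret
EOF
echo "Password is $PASS"
```

Writing the prompt to standard error is a design choice made here so that the prompt does not mix with any output the script writes to standard output.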
Local Variables
Local variables are defined using the typeset command:
typeset var1[=val1] ... varN[=valN]
Here, var1 ... varN are variable names and val1 ... valN are values to assign to the variables. The values are optional as the following example illustrates:
typeset fruit1 fruit2=banana
This command declares two local variables, fruit1 and fruit2, and assigns the value banana to the variable fruit2.
The following script illustrates the behavior of local variables:
#!/bin/sh
pearFunc () {
    typeset pear=2;                           # set $pear
    echo "In pearFunc(): pear is $pear"       # print out its value
}
pearFunc                                      # call pearFunc
echo "Outside of pearFunc(): pear is $pear"   # print out $pear
First, the script defines a function, pearFunc, which sets the value of a local variable $pear and outputs that value. Then the script executes pearFunc. Finally, the script prints the value of $pear outside of the function. The output is
In pearFunc(): pear is 2
Outside of pearFunc(): pear is
From the output, you can see that when the value of $pear is accessed within the pearFunc it has the value 2, but when the value of $pear is accessed outside the function, it has no value.
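A local variable can also share its name with a global variable without disturbing it. The sketch below assumes the script runs under bash; the initial assignment pear=1 is an addition made for illustration and is not part of the original script:

```shell
# Assumption: run under bash, where typeset creates a function-local variable.
pear=1                                        # global $pear

pearFunc () {
    typeset pear=2                            # local $pear hides the global one
    echo "In pearFunc(): pear is $pear"
}

pearFunc                                      # prints: In pearFunc(): pear is 2
echo "Outside of pearFunc(): pear is $pear"   # prints: Outside of pearFunc(): pear is 1
```

Inside the function the local value is visible; once the function returns, the global value is unchanged.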
Recursion
In the previous section, you learned about the concept of function chaining, where one function calls another function. Recursion is a special instance of function chaining in which a function calls itself. The following example illustrates the use of recursion:
reverse() {
    if [ $# -gt 0 ] ; then
        typeset arg="$1"
        shift
        reverse "$@"
        echo "$arg "
    fi
}
reverse "$@"
This script prints its arguments in reverse order. It does so by calling the function reverse with $@ as the arguments. The reverse function is really simple; it determines whether there are any arguments. If there are no arguments, the function does nothing. Otherwise, it saves the first argument, removes it from the argument list using shift and calls itself. Once this call returns, the function just prints the argument it saved.
If you name the script reverse.sh and execute it with the arguments a b c, as follows:
/bin/sh reverse.sh a b c
The output is
c
b
a
NOTE
In the previous example, you executed the script using /bin/sh. This will not work on Solaris and SunOS systems. On those systems, you need to execute the script using /bin/ksh rather than /bin/sh as /bin/sh on Solaris does not support the typeset command.
The execution of this script proceeds as follows:
The script executes reverse "$@" (effectively it calls reverse a b c).
The function reverse determines whether $# (the number of arguments) is greater than 0. In this case, $# will be equal to 3 (a b c).
Because $# is greater than 0, reverse saves the first argument, $1 (in this case a) in the local variable $arg, and then calls shift to remove it from $@. Now, $@ holds two arguments, b c.
The function reverse calls itself with the shortened $@.
The function reverse determines whether $# (the number of arguments) is greater than 0. In this case, $# will be equal to 2 (b c).
Because $# is greater than 0, reverse saves the first argument, $1 (in this case b) in the local variable $arg, and then calls shift to remove it from $@. Now, $@ holds just one argument, c.
The function reverse calls itself with the shortened $@.
The function reverse determines whether $# (the number of arguments) is greater than 0. In this case, $# will be equal to 1.
Because $# is greater than 0, reverse saves the first argument, $1 (in this case c) in the local variable $arg, and then calls shift to remove it from $@. Now, $@ holds no arguments.
The function reverse calls itself with the shortened $@.
The function reverse determines whether $# (the number of arguments) is greater than 0. Because there are no arguments in $@, this check fails and the function returns.
After the call to reverse returns, you output the value of the local variable $arg, in this case c, and return.
After the call to reverse returns, you output the value of the local variable $arg, in this case b, and return.
After the call to reverse returns, you output the value of the local variable $arg, in this case a, and return.
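For shells that lack typeset, such as the Solaris /bin/sh mentioned in the note above, one possible workaround is to avoid the local variable altogether. The sketch below is an alternative to the original function, not a restatement of it: it performs the shift and the recursive call inside a subshell, so the current call's $1 is still intact when the subshell finishes. It should behave the same way in any POSIX-style shell:

```shell
# Sketch: avoids typeset by doing the shift and the recursive call in a subshell.
reverse () {
    if [ $# -gt 0 ] ; then
        ( shift ; reverse "$@" )   # the subshell's shift does not affect this call's $1
        echo "$1"                  # print the first argument after all the others
    fi
}

reverse a b c                      # prints c, b, and a on separate lines
```

The price of this approach is one extra subshell per argument, which is acceptable for short argument lists but slower than the typeset version for long ones.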
Divide and Conquer
Recursion is normally used to solve problems using a technique known as divide and conquer. Basically, divide and conquer means that a problem is divided into smaller and smaller instances until an instance that is small enough to solve directly is found. Each instance that is too big to solve directly is solved recursively, and the solutions are combined to produce a solution to the original problem.
You used divide and conquer in the previous example; the function reverse kept calling itself with smaller and smaller parts of the argument list $@ until all the arguments were exhausted, and then it just printed each argument.
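The same pattern, in which a small instance is solved directly and larger ones combine the result of a recursive call, can be sketched with a numeric example. The factorial function below is hypothetical and assumes a small non-negative integer argument. Each recursive call runs inside a command substitution, so the helper variables prev and sub do not need to be local:

```shell
# factorial is a hypothetical example; it expects a small non-negative integer.
factorial () {
    if [ "$1" -le 1 ] ; then
        echo 1                            # small enough to solve directly
    else
        prev=`expr "$1" - 1`              # the smaller instance
        sub=`factorial "$prev"`           # solve it recursively (in a subshell)
        expr "$1" \* "$sub"               # combine: n * (n-1)!
    fi
}

factorial 5                               # prints 120
```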
Return Codes
When a shell script completes, it can use the exit command to return its exit status via an exit code. The function analogue of exit is the return command, which allows a function to return an exit status. The exit status of a function is called its return code. The convention for return codes is the same as for exit codes: 0 indicates success and any nonzero value indicates failure.
The syntax of the return command is
return rc
Here rc is the return code. The following function illustrates the use of return:
isInteractive () {
    case $- in              # $- holds the invocation options
        *i*) return 0;;     # if $- contains i, the shell is interactive
    esac
    return 1
}
You can use this function to detect whether a particular shell is interactive as follows:
if isInteractive ; then
    echo "Interactive shell"
else
    echo "Non-interactive shell"
fi
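A return code is not limited to 0 and 1, and the caller can inspect it through $? immediately after the call. The following sketch uses a hypothetical function, checkDir, with an invented convention of three return codes:

```shell
# checkDir is a hypothetical function:
#   0 = the argument is a directory, 1 = it is not, 2 = no argument was given
checkDir () {
    if [ -z "$1" ] ; then
        return 2
    fi
    if [ -d "$1" ] ; then
        return 0
    fi
    return 1
}

checkDir /
echo "checkDir / returned $?"                     # prints: checkDir / returned 0
checkDir || echo "checkDir failed with code $?"   # prints: checkDir failed with code 2
```

Note that $? is overwritten by every command, so it has to be read or saved right after the function call.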
Data Sharing
The functions you have seen thus far are mostly independent, but in most shell scripts functions either depend on or share data with other functions. In this section, you will look at an example in which three functions work together and share data.
Moving Around the File System
The C shell, csh, introduced three commands for quickly moving around in the UNIX directory tree:
popd
pushd
dirs
These commands maintain a stack of directories internally and enable the user to add and remove directories from the stack and list the contents of the stack.
Understanding Stacks
For those readers who are not familiar with the programming concept of a stack, you can think of it as a stack of plates: you can add or remove a plate only at the top of the stack. You can access only the top plate, not any of the middle plates in the stack. A stack in programming terms is similar. You can add or remove an item only at the top of the stack.
The pushd, popd, and dirs commands are not available in the Bourne shell or ksh. Newer versions of bash and zsh include them. In this section, you will implement each of these commands as shell functions so that they can be used with any Bourne-like shell.
In csh, the directory stack used by these commands is maintained within the shell; in this implementation you will maintain the stack in a global, exported environment variable called _DIR_STACK. The entries in _DIR_STACK are separated by colons, :, just like entries in PATH or MANPATH. This allows you to handle almost any directory name (any name that does not itself contain a colon).
Implementing dirs
First let's look at the simplest of the three functions, dirs. This function just lists the entries in the directory stack:
dirs() {
    OLDIFS="$IFS"           # save IFS (internal field separator)
    IFS=:                   # set IFS to :, so that we can process
                            # each entry in _DIR_STACK easily
    for i in $_DIR_STACK    # print out each entry in _DIR_STACK
    do
        echo "$i \c"
    done
    echo                    # print out new line (makes output pretty)
    IFS="$OLDIFS"           # restore IFS
}
First, you save the current value of IFS in OLDIFS and then you set IFS to :. Because IFS is the Internal Field Separator for the shell, modifying it allows you to use the for loop to cycle through the individual entries in _DIR_STACK. When you are finished with all the entries, you restore the value of IFS.
NOTE
The shell uses the value of the variable IFS to split up a string into separate words. The default setting for IFS is the space, tab, and newline characters. This enables the shell to determine the number of words that are in most strings. Normally, the shell uses the default value of IFS to determine how many options are supplied to a command, script, or shell function, along with how many items are specified to a for loop.
In the previous example, you manipulated the value of IFS in order to simplify the processing of the entries in _DIR_STACK.
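The effect of changing IFS can be seen in isolation with a small sketch. The function countEntries is hypothetical; it re-splits its first argument on colons and reports how many entries result. It assumes a Bourne-style shell such as sh, ksh, or bash, in which unquoted variables are split according to IFS:

```shell
# countEntries is a hypothetical helper: it prints how many colon-separated
# entries its first argument contains.
countEntries () {
    OLDIFS="$IFS"          # save IFS
    IFS=:                  # split on colons instead of white space
    set -- $1              # re-split the argument into $1, $2, ...
    IFS="$OLDIFS"          # restore IFS
    echo $#
}

countEntries "/usr/bin:/tmp:/home/my files"    # prints 3
```

Because IFS contains only the colon while the string is split, the space in the last entry does not start a new word.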
Implementing pushd
The pushd function is a bit more complicated than the dirs function. In addition to listing the directories in the stack, it must also change to a requested directory and then add that directory to the top of the stack. The requested directory is the first argument to the function. If an argument is not specified, the current directory (.) is used.
This example implements pushd as follows:
pushd() {
    # set REQ to the first argument (if given, otherwise use .)
    REQ="${1:-.}"

    # if $REQ is not a directory, print an error and return
    if [ ! -d "$REQ" ] ; then
        echo "ERROR: $REQ is not a directory." 1>&2
        return 1
    fi

    # if we can cd to $REQ, update _DIR_STACK and print it out
    # otherwise print an error and return
    if cd "$REQ" > /dev/null 2>&1 ; then
        _DIR_STACK="`pwd`:$_DIR_STACK" ; export _DIR_STACK ;
        dirs
    else
        echo "ERROR: Cannot change to directory $REQ." >&2
        return 1
    fi
    unset REQ
}
This function starts by determining the directory to push onto the stack. It uses the default value substitution form of variable substitution, covered in Chapter 9, "Substitution," to obtain this value. Then, the function determines whether the requested directory is really a directory. If it is not a directory, you print an error and return 1 to indicate failure. Otherwise, you change to that directory and then update the directory stack with the full path of the new directory. You have to use the full path rather than the value in $REQ, because the value stored in $REQ might be a relative path. After the directory stack has been updated, you call dirs to output the directories stored on the stack.
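The two ideas used here, the ${1:-.} default and the use of pwd to turn a possibly relative path into a full one, can be tried on their own. The helper resolveDir below is hypothetical and is not part of the pushd implementation; it runs cd in a subshell so that the caller's current directory is left alone:

```shell
# resolveDir is a hypothetical helper: it prints the full path of the
# directory named by its first argument (default: the current directory).
resolveDir () {
    ( cd "${1:-.}" > /dev/null 2>&1 && pwd )   # subshell: caller's directory is unchanged
}

cd /
resolveDir usr          # relative path; prints /usr
resolveDir              # no argument, so . is used; prints /
```

If the directory does not exist, cd fails, nothing is printed, and resolveDir returns a nonzero return code.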
Implementing popd
The popd() function is much more complicated than the other two functions. Let's look at the operations it performs:
Removes the first entry from the directory stack
Updates the directory stack to reflect the removal
Changes to the directory indicated by the entry that was removed from the stack
Displays the full path of the current directory
To simplify the first and second operations, you can implement a helper function for popd() called _popd_helper(). This function performs all the work; popd() is simply a wrapper around it. Often you need to write functions in this manner: one function that provides a simple interface and another that performs the actual work.
Implementing _popd_helper
Let's first look at the function _popd_helper to see how the directory stack is manipulated:
_popd_helper() {
    # set the directory to pop to the first argument, if
    # this directory is empty, issue an error and return 1
    # otherwise get rid of POPD from the arguments
    POPD="$1"
    if [ -z "$POPD" ] ; then
        echo "ERROR: The directory stack is empty." >&2
        return 1
    fi
    shift

    # if any more arguments remain, reinitialize the directory
    # stack, and then update it with the remaining items,
    # otherwise set the directory stack to null
    if [ -n "$1" ] ; then
        _DIR_STACK="$1" ;
        shift ;
        for i in $@ ; do _DIR_STACK="$_DIR_STACK:$i" ; done
    else
        _DIR_STACK=
    fi

    # if POPD is a directory cd to it, otherwise issue
    # an error message
    if [ -d "$POPD" ] ; then
        cd "$POPD" > /dev/null 2>&1
        if [ $? -ne 0 ] ; then
            echo "ERROR: Could not cd to $POPD." >&2
        fi
        pwd
    else
        echo "ERROR: $POPD is not a directory." >&2
    fi
    export _DIR_STACK
    unset POPD
}
This function expects each of the directories in the directory stack to be given to it as arguments, so the first thing that it checks is whether $1, the first argument, has any value. You do this by setting $POPD equal to $1 and then checking if $POPD has a value. If the directory stack is empty, you issue an error message and return; otherwise, you shorten the stack using shift. At this point, you have taken care of the first operation.
Next, you determine whether the directory stack became empty after you removed an entry from it. Because the individual items in the stack are the arguments to this function, you need to check whether $1, the new first argument, has a value. If it does, you reinitialize the directory stack with this value and proceed to add all the remaining values back onto the stack; otherwise, you set the value of the directory stack to null. At this point, you have taken care of the second operation.
The final if statement takes care of the third and fourth operations. Here, you determine whether the path stored in $POPD is a directory. This check is required because the path might have been removed from the system after it was added to the directory stack. If the path is a directory, you try to cd to it and print an error message if the change fails; you then print the full path of the current directory using pwd. If the path is not a directory, you print an error message instead.
The Wrapper Function
Now that you know how the helper function works, you can write an appropriate wrapper function to translate the value of _DIR_STACK into separate arguments. This is fairly easy, thanks to IFS.
The popd() function is
popd() {
    OLDIFS="$IFS"
    IFS=:
    _popd_helper $_DIR_STACK
    IFS="$OLDIFS"
}
In this function, you first save the old value of IFS. Then you set IFS to : and call _popd_helper with the directory stack specified as arguments. After _popd_helper returns, you restore the value of IFS.
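To see how the functions cooperate through _DIR_STACK, the following sketch runs a short session. It repeats the definitions from this section in compact form, with one deliberate change: dirs uses printf instead of echo with \c, on the assumption that printf is available, because not every version of echo understands \c. The directories / and /usr are used only because they exist on practically every UNIX system:

```shell
dirs () {
    OLDIFS="$IFS" ; IFS=:
    for i in $_DIR_STACK ; do printf '%s ' "$i" ; done
    echo
    IFS="$OLDIFS"
}

pushd () {
    REQ="${1:-.}"
    if [ ! -d "$REQ" ] ; then
        echo "ERROR: $REQ is not a directory." >&2
        return 1
    fi
    if cd "$REQ" > /dev/null 2>&1 ; then
        _DIR_STACK="`pwd`:$_DIR_STACK" ; export _DIR_STACK ; dirs
    else
        echo "ERROR: Cannot change to directory $REQ." >&2
        return 1
    fi
    unset REQ
}

_popd_helper () {
    POPD="$1"
    if [ -z "$POPD" ] ; then
        echo "ERROR: The directory stack is empty." >&2
        return 1
    fi
    shift
    if [ -n "$1" ] ; then
        _DIR_STACK="$1" ; shift
        for i in $@ ; do _DIR_STACK="$_DIR_STACK:$i" ; done
    else
        _DIR_STACK=
    fi
    if [ -d "$POPD" ] ; then
        cd "$POPD" > /dev/null 2>&1
        if [ $? -ne 0 ] ; then echo "ERROR: Could not cd to $POPD." >&2 ; fi
        pwd
    else
        echo "ERROR: $POPD is not a directory." >&2
    fi
    export _DIR_STACK
    unset POPD
}

popd () {
    OLDIFS="$IFS" ; IFS=:
    _popd_helper $_DIR_STACK
    IFS="$OLDIFS"
}

_DIR_STACK=        # start with an empty stack
pushd /usr         # prints: /usr
pushd /            # prints: / /usr
popd               # removes / from the stack and prints: /
popd               # removes /usr from the stack and prints: /usr
```

After the second popd the stack is empty again, so a further call to popd would print the "directory stack is empty" error. In bash, which already has built-in versions of these commands, functions with the same names take precedence over the built-ins.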