Shell Tracing
There are many instances when syntax checking will give your script a clean bill of health, even though bugs are still lurking within it. Running syntax checking on a shell script is similar to running a spelling checker on a text document: it might find most of the misspellings, but it can't fix problems like read spelled red. In order to find and fix these types of errors in a text document, you need to proofread it. Shell tracing is proofreading your shell script.
In shell tracing mode each command is printed in the exact form in which it is executed. For this reason, shell tracing mode is often referred to as execution tracing mode. Shell tracing is enabled by the -x option (x as in execution). The following command enables tracing for an entire script:
$ /bin/sh -x script arg1 arg2 ... argN
Tracing can also be enabled using the set command:
set -x
To get an idea of what the output of shell tracing looks like, try the following command:
$ set -x ; ls *.sh ; set +x
The output will be similar to the following:
+ ls buggy.sh buggy1.sh buggy2.sh buggy3.sh buggy4.sh
buggy.sh buggy1.sh buggy2.sh buggy3.sh buggy4.sh
+ set +x
In the output, the lines preceded by the plus (+) character are the commands that the shell executes. The other lines are output from those commands. As you can see from the output, the shell prints the exact ls command it executes. This is extremely useful in debugging because it enables you to determine whether all the substitutions were performed correctly.
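Because the shell writes the trace to standard error rather than standard output, the trace can be captured or redirected separately from a command's normal output. The following is a minimal sketch of that idea and is not part of the scripts discussed in this chapter; the variable names BACKUP_DIR and TRACE and the path used are hypothetical, chosen only to show a substitution being expanded in the trace:

```shell
#!/bin/sh
# BACKUP_DIR and TRACE are hypothetical names used only for illustration.
BACKUP_DIR="backup"

# Run one traced command inside a group and merge the group's standard
# error (where the shell writes the trace) into the captured output.
# The : command does nothing, so only the trace line is produced.
TRACE=$( { set -x ; : "$BACKUP_DIR/docs" ; } 2>&1 )

# The captured trace shows the command after variable substitution.
printf '%s\n' "$TRACE"
```

The captured line shows backup/docs rather than $BACKUP_DIR/docs, which is the kind of confirmation that tracing provides. The exact prefix printed before the command can vary between shells.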
Finding Syntax Bugs Using Shell Tracing
In the preceding example, you used the script buggy2.sh. One of the problems with this script is that it deleted the old backup before asking whether you wanted to make a new backup. To solve this problem, the script is rewritten as follows:
#!/bin/sh

Failed() {
    if [ $1 -ne 0 ] ; then
        echo "Failed. Exiting." ; exit 1 ;
    fi
    echo "Done."
}

YesNo() {
    echo "$1 (y/n)? \c"
    read RESPONSE
    case $RESPONSE in
        [yY]|[Yy][Ee][Ss]) RESPONSE=y ;;
        [nN]|[Nn][Oo]) RESPONSE=n ;;
    esac
}

YesNo "Make backup"
if [ $RESPONSE = "y" ] ; then
    echo "Deleting old backups, please wait... \c"
    rm -fr backup > /dev/null 2>&1
    Failed $?
    echo "Making new backups, please wait... \c"
    cp -r docs backup
    Failed
fi
There are at least three syntax bugs in this script and at least one logical oversight. See if you can find them.
Assuming that the script is called buggy3.sh, first check its syntax as follows:
$ /bin/sh -n ./buggy3.sh
Because there is no output, you can execute it:
$ /bin/sh ./buggy3.sh
The script first prompts you as follows:
Make backup (y/n)?
Answering y to this prompt produces output similar to the following:
Deleting old backups, please wait... Done.
Making new backups, please wait... buggy3.sh: test: argument expected
Now you know there is a problem with the script, but the error message doesn't tell you where it is, so you need to track it down manually. From the output you know that the old backup was deleted successfully; therefore, the error is probably in the following part of the script:
echo "Making new backups, please wait... \c"
cp -r docs backup
Failed
Let's just enable shell tracing for this section:
set -x
echo "Making new backups, please wait... \c"
cp -r docs backup
Failed
set +x
The output changes as follows (assuming you answer y to the question):
Make backup (y/n)? y
Deleting old backups, please wait... Done.
+ echo Making new backups, please wait... \c
Making new backups, please wait... + cp -r docs backup
+ Failed
+ [ -ne 0 ]
buggy3.sh: test: argument expected
From this output you can see that the problem occurred in the following statement:
[ -ne 0 ]
From Chapter 11, "Flow Control," you know that the form of a numerical test command is
[ num1 operator num2 ]
Here it looks like num1 does not exist. Also from the trace you can tell that this error occurred after executing the Failed function:
Failed() {
    if [ $1 -ne 0 ] ; then
        echo "Failed. Exiting." ; exit 1 ;
    fi
    echo "Done."
}
There is only one numerical test in this function: the test that checks whether $1, the first argument to the function, is not equal to 0. The problem should be obvious now. When Failed was invoked, you forgot to give it an argument:
echo "Making new backups, please wait... \c"
cp -r docs backup
Failed
Therefore, the numeric test failed. There are two possible fixes for this bug. The first is to fix the code that calls the function:
echo "Making new backups, please wait... \c"
cp -r docs backup
Failed $?
The second is to fix the function itself by quoting the first argument, "$1":
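Failed() {
    if [ "$1" -ne 0 ] ; then
        echo "Failed. Exiting." ; exit 1 ;
    fi
    echo "Done."
}
When "$1" is quoted, the shell passes the null (empty) string to the test if the function is called without any arguments. The test then always receives both num1 and num2, so the "argument expected" error no longer occurs. Note that some shells, such as bash and dash, still complain that an empty string is not a valid integer, so quoting alone is not a complete fix in those shells.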
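A slightly more defensive variant of the function is sketched below. It rests on an assumption made for this example rather than on anything in buggy3.sh: that a missing exit status should count as a failure. The ${1:-1} expansion substitutes 1 when no argument is given, so the test always compares two integers:

```shell
#!/bin/sh
# Sketch only: treats a missing or empty status argument as a failure.
Failed() {
    # ${1:-1} expands to $1 if it is set and non-empty, otherwise to 1.
    if [ "${1:-1}" -ne 0 ] ; then
        echo "Failed. Exiting."
        exit 1
    fi
    echo "Done."
}

# Each call runs in a subshell so that exit does not end the demonstration.
( Failed 0 ) && echo "exit status: $?"
( Failed 1 ) || echo "exit status: $?"
( Failed )   || echo "exit status: $?"
```

Defaulting to a failure is a design choice: it makes a forgotten $? at the call site visible immediately instead of silently reporting success.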
The best idea is to perform both fixes. After these fixes are applied, the shell tracing output is similar to the following:
Make backup (y/n)? y
Deleting old backups, please wait... Done.
+ echo Making new backups, please wait... \c
Making new backups, please wait... + cp -r docs backup
+ Failed 0
+ [ 0 -ne 0 ]
+ echo Done.
Done.
+ set +x
Finding Logical Bugs Using Shell Tracing
As mentioned before, there is at least one logical bug in this script. With the help of shell tracing, you can locate and fix this bug.
Consider the prompt produced by this script:
Make backup (y/n)?
If you do not type a response but simply press Enter or Return, the script reports an error similar to the following:
./buggy3.sh: [: =: unary operator expected
To determine where this error occurs, it is probably best to run the entire script in shell tracing mode:
$ /bin/sh -x ./buggy3.sh
The output is similar to the following:
+ YesNo Make backup
+ echo Make backup (y/n)? \c
+ /bin/echo Make backup (y/n)? \c
Make backup (y/n)? + read RESPONSE

+ [ = y ]
./buggy3.sh: [: =: unary operator expected
The blank line is the result of pressing Enter or Return without typing a response to the prompt. The next line that the shell executes is the source of the error message:
[ = y ]
This test is part of the if statement:
if [ $RESPONSE = "y" ] ; then
Although this problem can be fixed by just quoting $RESPONSE,
if [ "$RESPONSE" = "y" ] ; then
the better fix is to determine why it is not set and change that code so that it always sets $RESPONSE. Looking at the script, you find that this variable is set by the function YesNo:
YesNo() {
    echo "$1 (y/n)? \c"
    read RESPONSE
    case $RESPONSE in
        [yY]|[Yy][Ee][Ss]) RESPONSE=y ;;
        [nN]|[Nn][Oo]) RESPONSE=n ;;
    esac
}
There are two problems here. The first one is that the read command
read RESPONSE
will not set a value for $RESPONSE if the user just presses Enter or Return. Because you can't change the read command, you need to find a different way to solve the problem. Basically you have a logical problem: the case statement needs to validate the user input, which it is currently not doing. A simple fix for the problem is to change YesNo as follows:
YesNo() {
    echo "$1 (y/n)? \c"
    read RESPONSE
    case "$RESPONSE" in
        [yY]|[Yy][Ee][Ss]) RESPONSE=y ;;
        *) RESPONSE=n ;;
    esac
}
Now you treat all responses other than "yes" as negative responses. This includes null responses generated when the user simply presses Enter or Return.
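The same validation can be sketched with printf for the prompt, because the \c escape is not honored by every shell's built-in echo. This is an illustrative variant rather than the listing from buggy3.sh; the demonstration at the end feeds an empty line on standard input to stand in for the user pressing Enter:

```shell
#!/bin/sh
YesNo() {
    # printf prints the prompt without a trailing newline.
    printf '%s (y/n)? ' "$1"
    read RESPONSE
    case "$RESPONSE" in
        [yY]|[Yy][Ee][Ss]) RESPONSE=y ;;
        *) RESPONSE=n ;;   # anything else, including an empty reply
    esac
}

# Simulate pressing Enter with no reply. The braces keep YesNo and the
# echo commands in the same part of the pipeline, so RESPONSE is visible.
printf '\n' | { YesNo "Make backup" ; echo ; echo "RESPONSE=$RESPONSE" ; }
```

With an empty reply the last line printed is RESPONSE=n, so the later test on $RESPONSE always has a value to compare.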
Using Debugging Hooks
In the previous examples, you were able to deduce the location of a bug using shell tracing. In order to enable tracing for a particular part of the script, you have to edit the script and insert the debug command:
set -x
For larger scripts, a better practice is to embed debugging hooks. Debugging hooks are functions that enable shell tracing in critical code sections. Debugging hooks are normally activated in one of two ways:
- The script is run with a command-line option (commonly -d or -x).
- The script is run with an environment variable set to true (commonly DEBUG=true or TRACE=true).
The following function enables you to activate and deactivate debugging by setting $DEBUG to true:
Debug() {
    if [ "$DEBUG" = "true" ] ; then
        if [ "$1" = "on" -o "$1" = "ON" ] ; then
            set -x
        else
            set +x
        fi
    fi
}
To activate debugging, you can use the following:
Debug on
To deactivate debugging, you can use either of the following:
Debug
Debug off
Actually, passing any argument to this function other than on or ON deactivates debugging.
NOTE
The normal practice, with regard to debugging, is to activate it only when necessary. By default, debugging should be off.
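As a quick check of that default, the following sketch runs the same command between the hooks twice, once with DEBUG unset and once with DEBUG=true, and captures whatever the shell writes to standard error. The helper name RunTraced and the variable names OFF_TRACE and ON_TRACE are hypothetical, introduced only for this illustration:

```shell
#!/bin/sh
Debug() {
    if [ "$DEBUG" = "true" ] ; then
        if [ "$1" = "on" -o "$1" = "ON" ] ; then
            set -x
        else
            set +x
        fi
    fi
}

# Hypothetical helper: brackets one harmless command with the hooks.
RunTraced() {
    Debug on
    : "critical section"
    Debug off
}

# Capture standard error only; the : command itself prints nothing.
OFF_TRACE=$( unset DEBUG ; RunTraced 2>&1 )
ON_TRACE=$( DEBUG=true ; RunTraced 2>&1 )

[ -z "$OFF_TRACE" ] && echo "DEBUG unset: no trace output"
[ -n "$ON_TRACE" ] && echo "DEBUG=true: trace output captured"
```

Both messages are printed when the hooks behave as described: nothing is traced unless DEBUG is set to true.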
To demonstrate the use of this function, you can modify the functions in the script buggy3.sh to have debugging automatically enabled if the variable DEBUG is set. The modified version of buggy3.sh is as follows:
#!/bin/sh

Debug() {
    if [ "$DEBUG" = "true" ] ; then
        if [ "$1" = "on" -o "$1" = "ON" ] ; then
            set -x
        else
            set +x
        fi
    fi
}

Failed() {
    Debug on
    if [ "$1" -ne 0 ] ; then
        echo "Failed. Exiting." ; exit 1 ;
    fi
    echo "Done."
    Debug off
}

YesNo() {
    Debug on
    echo "$1 (y/n)? \c"
    read RESPONSE
    case "$RESPONSE" in
        [yY]|[Yy][Ee][Ss]) RESPONSE=y ;;
        *) RESPONSE=n ;;
    esac
    Debug off
}

YesNo "Make backup"
if [ "$RESPONSE" = "y" ] ; then
    echo "Deleting old backups, please wait... \c"
    rm -r backup > /dev/null 2>&1
    Failed $?
    echo "Making new backups, please wait... \c"
    cp -r docs backup
    Failed $?
fi
There is no change in the output if the script is executed in either of the following ways:
$ /bin/sh ./buggy3.sh
$ ./buggy3.sh
The output includes shell tracing if the same script is executed in either of the following ways:
$ DEBUG=true /bin/sh ./buggy3.sh
$ DEBUG=true ./buggy3.sh
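The other activation method listed earlier, a command-line option, can be sketched with the standard getopts built-in. This is an assumed design, not code from buggy3.sh: the function name ParseDebugOption and the variable name OPTION are hypothetical, and -d is used simply because it is one of the common choices named above.

```shell
#!/bin/sh
# Hypothetical helper: sets DEBUG=true when -d is among its arguments.
ParseDebugOption() {
    OPTIND=1                      # restart option parsing on every call
    while getopts d OPTION ; do
        case "$OPTION" in
            d) DEBUG=true ;;
            *) echo "Usage: $0 [-d]" >&2 ; return 2 ;;
        esac
    done
}

# In a real script the call would be: ParseDebugOption "$@"
ParseDebugOption -d
echo "DEBUG=$DEBUG"
```

Because the function only ever sets DEBUG to true and never clears it, running the script with DEBUG=true in the environment continues to work as before.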