File and Directory Questions
This section looks at some questions about files and directories. These questions include issues with specific commands and examples that illustrate the use of commands to solve particular problems.
How do I determine the absolute pathname of a directory?
Shell scripts that work with directories often need to determine the absolute pathname of a directory to perform the correct operations on these directories.
You can determine the absolute pathname of a directory by using the cd and pwd commands as follows:
ABSPATH=`(cd dir 2> /dev/null && pwd ;)`
Here dir is the name of a directory. This command changes directories to the specified directory, dir, and then displays the full pathname of the directory using the pwd command. Then you assign the output of pwd, which is the full path to dir, to the variable ABSPATH. Because the cd command changes the working directory of the current shell, you execute it in a sub-shell. Thus the working directory of the shell script is unchanged.
The following function also provides this functionality:
abspath () { [ -n "$1" ] && ( cd "$1" 2> /dev/null && pwd ; ) }
Here, you determine whether the first argument is given and if it is, you cd to that directory and print its absolute path.
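The following is a minimal sketch of how a script might use this technique and still report failures, assuming a POSIX-style shell. The function name abspath_or_warn and the scratch directory are hypothetical and are used only for illustration:

```shell
# Hypothetical helper: print the absolute path of a directory,
# or complain on standard error and return a non-zero status.
abspath_or_warn () {
    if [ -z "$1" ] ; then
        echo "usage: abspath_or_warn dir" >&2
        return 1
    fi
    # The parentheses run cd in a sub-shell, so the caller's
    # working directory is left unchanged.
    ( cd "$1" 2> /dev/null && pwd ) || {
        echo "cannot resolve directory: $1" >&2
        return 1
    }
}

# Example: resolve a relative path from inside a scratch directory.
# mktemp is assumed to be available; it is not part of every system.
DEMO=`mktemp -d`
mkdir "$DEMO/sub"
RESULT=`cd "$DEMO" && abspath_or_warn ./sub`
echo "$RESULT"
rm -r "$DEMO"
```

Because the function returns a non-zero status on failure, a script can test it directly, as in if DIR=`abspath_or_warn "$1"` ; then ... fi.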
How do I determine the absolute pathname of a file?
Determining the absolute pathname of a file is slightly harder than determining the absolute pathname of a directory. You need to use the dirname and basename commands in conjunction with the cd and pwd commands to determine the absolute pathname of a file:
CURDIR=`pwd`
cd `dirname file`
ABSPATH="`pwd`/`basename file`"
cd $CURDIR
Here file is the name of a file whose absolute pathname you want to determine. First you save the path of the current directory in the variable CURDIR. Next you move to the directory containing the specified file, file.
Then you join the output of the pwd command and the name of the file determined using the basename command to get the absolute pathname. At this point the absolute pathname of the file is stored in the variable ABSPATH. Finally you change back to the original directory.
As an example, the following function implements this functionality:
absfpath () {
    if [ -z "$1" ] ; then
        return 1
    fi
    CURDIR="`pwd`"
    cd "`dirname $1`"
    ABSPATH="`pwd`/`basename $1`"
    cd "$CURDIR"
}
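As a variation, the same steps can be run inside a sub-shell so that the working directory of the script never changes, even if something fails partway through. This is only a sketch under that assumption. The name absfpath_print and the variables FDIR, FBASE, and ADIR are hypothetical, and the function prints the result rather than setting ABSPATH:

```shell
# Hypothetical variant of absfpath: prints the absolute pathname
# and never changes the caller's working directory.
absfpath_print () {
    [ -n "$1" ] || return 1
    FDIR=`dirname "$1"`
    FBASE=`basename "$1"`
    # cd takes effect only inside the command substitution's sub-shell.
    ADIR=`cd "$FDIR" 2> /dev/null && pwd` || return 1
    # Avoid a doubled slash when the file is in the root directory.
    case "$ADIR" in
        /) echo "/$FBASE" ;;
        *) echo "$ADIR/$FBASE" ;;
    esac
}

# Example: the directory part is resolved, the filename is appended.
absfpath_print /etc/../etc/passwd
```

Quoting "$1" in the dirname and basename calls lets the function handle pathnames that contain spaces.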
How can I locate a particular file?
The structure of the UNIX directory tree sometimes makes locating files and commands difficult. To locate a file, often you need to search through a directory and all its subdirectories. The easiest way to do this is with the find command:
find dir -name file -print
Here dir is the name of a directory where find should start its search, and file is the name of the file it should look for.
The -name option of the find command also works with the standard filename substitution operators covered in Chapter 9. For example, the command
find /home/ranga -name "*.txt" -print
displays a list of all the files in the directory /home/ranga and all its subdirectories that end with the string .txt.
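The quotes around the pattern matter: they keep the shell from expanding *.txt itself, so that find can apply the pattern in every directory it visits. The following sketch shows this on a small scratch tree. The directory and file names are made up for the example, and mktemp is assumed to be available:

```shell
# Build a small scratch tree to search (hypothetical names).
TOP=`mktemp -d`
mkdir -p "$TOP/docs/old"
touch "$TOP/notes.txt" "$TOP/docs/readme.txt" \
      "$TOP/docs/old/plan.txt" "$TOP/docs/image.gif"

# The quoted pattern is passed to find unchanged; find matches it
# against the filenames in $TOP and in all its subdirectories.
find "$TOP" -type f -name "*.txt" -print

# Count the matches. tr strips the padding some versions of wc add.
COUNT=`find "$TOP" -type f -name "*.txt" -print | wc -l | tr -d ' '`
echo "matches: $COUNT"

rm -r "$TOP"
```

Here the three .txt files are listed and image.gif is not.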
How can I grep for a string in every file in a directory?
When you work on a large project involving many files, remembering the contents of the individual files becomes difficult. It is much easier to look through all the files for a particular piece of information.
You can use the find command in conjunction with the xargs command to look for a particular string in every file contained within a directory and all its subdirectories:
find dir -type f -print | xargs grep "string"
Here dir is the name of a directory in which to start searching, and string is the string to look for. The -type f option restricts find to regular files, so directories and other special files are not passed to grep. As an example, the following command searches all of the C language include files in /usr/include for the string pid_t:
$ find /usr/include -type f -print | xargs grep pid_t
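If some of the pathnames might contain spaces, one alternative is to let find run grep itself. The sketch below assumes a find that supports the -exec ... {} + form, which batches pathnames much as xargs does. The function name grep_tree and the sample files are hypothetical:

```shell
# Hypothetical helper: list the files under a directory ($1)
# that contain a string ($2).
grep_tree () {
    # {} + hands grep many pathnames at once, each as a single
    # argument, so spaces in a pathname are not a problem.
    # grep -l prints only the names of the files that match.
    find "$1" -type f -exec grep -l -e "$2" {} +
}

# Example on a scratch tree with a space in a directory name.
TOP=`mktemp -d`
mkdir "$TOP/my headers"
echo "typedef int pid_t;" > "$TOP/my headers/types.h"
echo "int main(void);" > "$TOP/main.h"
grep_tree "$TOP" pid_t
rm -r "$TOP"
```

Only the pathname of types.h is printed, because it is the only file that contains pid_t.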
How do I remove all the files in a directory matching a particular name?
Some editors and programs create large numbers of temporary files. Often you need to clean up after these programs, to prevent your hard drive from filling up. The simplest method to remove a set of files that matches a particular name is to use the find and xargs commands as follows:
find dir -type f -name "name" -print | xargs rm
Here dir is the pathname of a directory and name is the filename that you want to remove. For example, the following command removes all of the files that end with ~ from the directory /home/cvs:
find /home/cvs -type f -name "*~" -print | xargs rm
The only limitation in using find and xargs is that xargs cannot properly deal with pathnames that contain spaces. If you need to delete files whose pathnames contain spaces you will need to use the -exec option of find rather than xargs:
find dir -type f -name "name" -exec rm '{}' \; -print
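Because rm cannot be undone, it can be worth listing the matching files before deleting them. The following sketch does both in one function. The name remove_matching is hypothetical, and the -exec ... {} + form is assumed to be supported by your version of find:

```shell
# Hypothetical helper: show, then remove, every regular file under
# a directory ($1) whose name matches a pattern ($2).
remove_matching () {
    echo "Removing:"
    find "$1" -type f -name "$2" -print
    # Each pathname reaches rm as one argument, so spaces are safe.
    find "$1" -type f -name "$2" -exec rm -f {} +
}

# Example on a scratch tree (names are made up for illustration).
TOP=`mktemp -d`
mkdir "$TOP/old work"
touch "$TOP/report.txt" "$TOP/report.txt~" "$TOP/old work/draft copy~"
remove_matching "$TOP" "*~"
ls -R "$TOP"
rm -r "$TOP"
```

The pattern must be quoted when the function is called, for the same reason it is quoted on the find command line.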
What command can I use to rename all the *.aaa files to *.bbb files?
In DOS and Windows, you can rename all the *.aaa files in a directory to *.bbb by using the rename command as follows:
rename *.aaa *.bbb
In UNIX you can use the mv command to rename files, but you cannot use it to rename more than one file at the same time. To do this, you need to use a for loop:
OLDSUFFIX=aaa
NEWSUFFIX=bbb
for FILE in *."$OLDSUFFIX"
do
    NEWNAME=`echo "$FILE" | sed -e "s/${OLDSUFFIX}\$/$NEWSUFFIX/"`
    mv "$FILE" "$NEWNAME"
done
Here you generate a list of all the files in the current directory that end with the value of the variable OLDSUFFIX. Then you use sed to modify the name of each file by removing the value of OLDSUFFIX from the filename and replacing it with the value of NEWSUFFIX. You use the $ character in the sed expression to anchor the suffix in OLDSUFFIX to the end of the line; this ensures that the pattern is really a filename suffix. After you have the new name, you rename the file from its original name, stored in FILE, to the new name stored in NEWNAME.
To prevent a potential loss of data, you might consider modifying this loop to specify the -i option to the mv command. For example, if the files 1.aaa and 1.bbb both exist prior to executing this loop, then after the loop exits, the original version of 1.bbb will have been overwritten when 1.aaa was renamed as 1.bbb. If mv -i is used, you will be prompted before 1.aaa is renamed:
mv: overwrite 1.bbb (yes/no)?
You can answer no to avoid losing the information in this file. The actual prompt produced by mv might be different on your system.
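For scripts that run unattended, a prompt is not always practical. One possible alternative is to skip any file whose new name is already taken. The sketch below assumes a POSIX shell, because it uses the ${FILE%pattern} form of variable substitution instead of sed, and it assumes the suffixes contain no wildcard characters. The function name rename_suffix is hypothetical:

```shell
# Hypothetical helper: rename *.OLD to *.NEW in the current directory.
# $1 = old suffix, $2 = new suffix (both without the leading dot).
rename_suffix () {
    for FILE in *."$1"
    do
        # If nothing matches, the pattern itself is returned; skip it.
        [ -f "$FILE" ] || continue
        # ${FILE%.$1} is the filename with the old suffix removed.
        NEWNAME="${FILE%.$1}.$2"
        if [ -e "$NEWNAME" ] ; then
            echo "skipping $FILE: $NEWNAME already exists" >&2
            continue
        fi
        mv "$FILE" "$NEWNAME"
    done
}

# Example in a scratch directory; mktemp is assumed to be available.
DEMO=`mktemp -d`
(
    cd "$DEMO" || exit 1
    touch 1.aaa 2.aaa 2.bbb
    rename_suffix aaa bbb
    ls
)
rm -r "$DEMO"
```

In this example 1.aaa is renamed to 1.bbb, while 2.aaa is left alone because 2.bbb already exists.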
What command can I use to rename all the aaa* files to bbb* files?
The technique used in the last question can be used to solve this problem as well. In this case, you can use the variables OLDPREFIX to hold the prefix a file currently has and NEWPREFIX to hold the prefix you want the file to have. As an example, you can use the following for loop to rename all files that start with aaa to start with bbb instead:
OLDPREFIX=aaa
NEWPREFIX=bbb
for FILE in "$OLDPREFIX"*
do
    NEWNAME=`echo "$FILE" | sed -e "s/^${OLDPREFIX}/$NEWPREFIX/"`
    mv "$FILE" "$NEWNAME"
done
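Before renaming a large number of files, it can help to preview what a loop like this would do. The following sketch prints each planned rename without moving anything. It assumes a POSIX shell, because it uses the ${FILE#pattern} form of variable substitution, and it assumes the prefixes contain no wildcard characters. The function name preview_prefix_rename is hypothetical:

```shell
# Hypothetical helper: show how files starting with $1 would be
# renamed to start with $2. Nothing is actually renamed.
preview_prefix_rename () {
    for FILE in "$1"*
    do
        # If nothing matches, the pattern itself is returned; skip it.
        [ -f "$FILE" ] || continue
        # ${FILE#$1} is the filename with the old prefix removed.
        echo "$FILE -> $2${FILE#$1}"
    done
}

# Example in a scratch directory; mktemp is assumed to be available.
DEMO=`mktemp -d`
(
    cd "$DEMO" || exit 1
    touch aaa1 aaa2 other
    preview_prefix_rename aaa bbb
)
rm -r "$DEMO"
```

Once the preview looks right, the echo line can be replaced with a mv -i "$FILE" "$2${FILE#$1}" command to perform the renames.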
How can I set my filenames to lowercase?
When you transfer a file from a Windows or DOS system to a UNIX system, the filename can end up in all capital letters. You can rename these files to lowercase using the following command:
for FILE in *
do
    mv -i "$FILE" `echo "$FILE" | tr '[A-Z]' '[a-z]'` 2> /dev/null
done
Here, you are using the mv -i command in order to avoid overwriting files. For example, if the files APPLE and apple both exist in a directory, you might not want to rename the file APPLE.
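A variation on this loop, sketched below, checks for these cases itself instead of relying on a prompt: it skips names that are already lowercase and names whose lowercase form is already in use. The function name lowercase_names is hypothetical, and the example assumes a case-sensitive file system:

```shell
# Hypothetical helper: rename the regular files in the current
# directory to lowercase, without overwriting anything.
lowercase_names () {
    for FILE in *
    do
        [ -f "$FILE" ] || continue
        LOWER=`printf '%s\n' "$FILE" | tr '[A-Z]' '[a-z]'`
        # Nothing to do if the name is already lowercase.
        if [ "$FILE" = "$LOWER" ] ; then
            continue
        fi
        if [ -e "$LOWER" ] ; then
            echo "skipping $FILE: $LOWER already exists" >&2
            continue
        fi
        mv "$FILE" "$LOWER"
    done
}

# Example in a scratch directory; mktemp is assumed to be available.
DEMO=`mktemp -d`
(
    cd "$DEMO" || exit 1
    touch README.TXT APPLE apple
    lowercase_names
    ls
)
rm -r "$DEMO"
```

Here README.TXT becomes readme.txt, and APPLE is left alone because apple already exists.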
How do I eliminate carriage returns (^M) in my files?
If you transfer text files from a DOS machine to a UNIX machine, you might see a ^M (Ctrl-M) at the end of each line. This character is a carriage return. In DOS, the end of a line is marked by the character sequence \r\n, where \r is the carriage return and \n is the newline. In UNIX, the end of a line is marked by \n alone. When text files created on a DOS system are viewed in UNIX, the \r is displayed as ^M. You can remove the ^M characters from a file by using the tr command as follows:
tr -d '\015' < file > newfile
Here file is the name of the file that contains the carriage returns, and newfile is the name you want to give the file after the carriage returns have been deleted. You are using the octal representation \015 for carriage return, because the escape sequence \r is not correctly interpreted by some versions of tr.
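If you want to convert a file in place rather than create a second file by hand, the tr command can be wrapped in a small function. This is a sketch only. The name strip_cr is hypothetical, and mktemp is assumed to be available for creating the temporary file:

```shell
# Hypothetical helper: delete the carriage returns from a file ($1),
# leaving the result under the original name.
strip_cr () {
    [ -f "$1" ] || return 1
    TMPFILE=`mktemp` || return 1
    # Write the cleaned text to a temporary file first. The original
    # is overwritten only if tr succeeds.
    tr -d '\015' < "$1" > "$TMPFILE" && cat "$TMPFILE" > "$1"
    STATUS=$?
    rm -f "$TMPFILE"
    return $STATUS
}

# Example: create a file with DOS line endings, then clean it.
DEMO=`mktemp`
printf 'one\r\ntwo\r\n' > "$DEMO"
strip_cr "$DEMO"
od -c "$DEMO"
rm -f "$DEMO"
```

Copying the cleaned text back with cat, rather than moving the temporary file over the original, keeps the original file's permissions.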