A Closer Look with du
The df command is one you'll use often as you get into the groove of system administration work. In fact, some sysadmins have df e-mailed to them every morning from cron so they can keep a close eye on things. Others have it as a command in their .login or .profile configuration file so they see the output every time they connect.
Once you're familiar with how the disks are being utilized in your Unix system, however, it's time to dig a bit deeper into the system and ascertain where the space is going.
Task 3.2: Using du to Ascertain Directory Sizes
The du command shows you disk usage, helpfully enough, and it has a variety of flags that are critical to using this tool effectively.
There won't be a quiz on this, but see if you can figure out what the default output of du is here when I use the command while in my home directory:
# du
12      ./.kde/Autostart
16      ./.kde
412     ./bin
36      ./CraigsList
32      ./DEMO/Src
196     ./DEMO
48      ./elance
16      ./Exchange
1232    ./Gator/Lists
4       ./Gator/Old-Stuff/Adverts
8       ./Gator/Old-Stuff
1848    ./Gator/Snapshots
3092    ./Gator
160     ./IBM/i
136     ./IBM/images
10464   ./IBM
76      ./CBO_MAIL
52      ./Lynx/WWW/Library/vms
2792    ./Lynx/WWW/Library/Implementation
24      ./Lynx/WWW/Library/djgpp
2872    ./Lynx/WWW/Library
2880    ./Lynx/WWW
556     ./Lynx/docs
184     ./Lynx/intl
16      ./Lynx/lib
140     ./Lynx/lynx_help/keystrokes
360     ./Lynx/lynx_help
196     ./Lynx/po
88      ./Lynx/samples
20      ./Lynx/scripts
1112    ./Lynx/src/chrtrans
6848    ./Lynx/src
192     ./Lynx/test
13984   ./Lynx
28484   .
If you guessed that it's the size of each directory, you're right! Notice that the sizes are cumulative because they sum up the size of all files and directories within a given directory. So the Lynx directory is 13,984 somethings, which includes the subdirectory Lynx/src (6,848), which itself contains Lynx/src/chrtrans (1,112).
The last line is a summary of the entire current directory (.), which has a combined size of 28,484.
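To watch the cumulative behavior on a tree small enough to check by hand, here is a minimal sketch that builds a throwaway directory and runs du inside it. The directory and file names are invented for the demo, mktemp is assumed to be available, and the exact block counts depend on your filesystem, so none are shown here.

```shell
# Build a small throwaway tree (all names are made up for this demo)
demo=$(mktemp -d)
mkdir -p "$demo/src/chrtrans"
dd if=/dev/zero of="$demo/src/chrtrans/data" bs=1024 count=64 2>/dev/null
dd if=/dev/zero of="$demo/src/main" bs=1024 count=32 2>/dev/null

# Run du from inside the tree; -k forces 1KB blocks on any Unix
(cd "$demo" && du -k)
```

You should see three lines: ./src/chrtrans, then ./src (which can be no smaller than chrtrans, because it includes it), then the . summary for the whole tree.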
And what is that pesky unit of measure? Unfortunately, it's different in different implementations of Unix so I always check the man page before answering this question. Within RHL7.2, the man page for du reveals that the unit of measure isn't specifically stated, frustratingly enough. However, it shows that there's a -k flag that forces the output to 1KB blocks, so a quick check
# du -k | tail -1
28484   .
produces the same number as before, so we can safely conclude that the unit in question is a 1KB block. Therefore, you can see that Lynx takes up about 13.7MB of space (13,984 / 1,024), and that the entire contents of my home directory consume 27.8MB. A tiny fraction of the 15GB /home partition!
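If you'd rather not do the division in your head, a one-line awk helper will do it. This is only a sketch: kb_to_mb is a made-up name, not part of du, and it assumes the input really is a count of 1KB blocks.

```shell
# kb_to_mb: convert a count of 1KB blocks to megabytes, one decimal place.
# (Hypothetical helper for illustration.)
kb_to_mb() {
    awk -v kb="$1" 'BEGIN { printf "%.1f\n", kb / 1024 }'
}

kb_to_mb 13984   # the Lynx directory
kb_to_mb 28484   # the whole home directory
```

Because printf rounds rather than truncates, Lynx comes out as 13.7 here (13,984 / 1,024 is 13.656); chopping off the extra digits instead would give 13.6.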
NOTE
Of course, I can recall when I splurged and bought myself a 20MB external hard disk for an early computer. I couldn't imagine that I could even fill it, and it cost more than $200 too! But I'll try not to bore you with the reminiscence of an old-timer, okay?
The recursive listing of subdirectories is useful information, but the higher up you go in the file system, the less helpful that information proves to be. Imagine if you were to type du / and wade through the output:
# du / | wc -l
6077
That's a lot of output!
Fortunately, one of the most useful flags to du is -s, which summarizes disk usage by only reporting the files and directories that are specified, or . if none are specified:
# du -s
28484   .
# du -s *
4       badjoke
4       badjoke.rot13
412     bin
4       browse.sh
4       buckaroo
76      CBO_MAIL
36      CraigsList
196     DEMO
48      elance
84      etcpasswd
16      Exchange
3092    Gator
4       getmodemdriver.sh
4       getstocks.sh
4       gettermsheet.sh
0       gif.gif
10464   IBM
13984   Lynx
Note in the latter case that because I used the * wildcard, it matched both directories and files in my home directory. When given the name of a file, du dutifully reports the size of that file in 1KB blocks. If you want individual files listed during a recursive run too, rather than just directories, add the -a flag.
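The difference -a makes is easiest to see side by side. Here's a small sketch on a throwaway tree; the directory and script names are invented, and mktemp is assumed to be available.

```shell
# Throwaway tree; the names are invented for this demo
adir=$(mktemp -d)
mkdir "$adir/bin"
echo 'echo hello' > "$adir/bin/hello.sh"

echo "== du (directories only) =="
(cd "$adir" && du)

echo "== du -a (files as well) =="
(cd "$adir" && du -a)
```

The first listing mentions only ./bin and the . summary, while the second adds a line for ./bin/hello.sh.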
TIP
The summary vanishes from the bottom of the du output when I specify directories as parameters, and that's too bad, because it's very helpful. To request a grand total at the end, simply add the -c flag.
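As a quick illustration of -s and -c working together, the following sketch sets up two tiny directories and asks for a one-line summary of each plus a total. The names are made up, and the sizes you see will depend on your filesystem and on your du's default block size.

```shell
# Throwaway tree; the names are invented for this demo
cdir=$(mktemp -d)
mkdir "$cdir/bin" "$cdir/docs"
echo 'echo hello' > "$cdir/bin/hello.sh"
echo 'some notes' > "$cdir/docs/readme"

# -s gives one line per argument; -c appends a grand total line
(cd "$cdir" && du -sc bin docs)
```

The last line of output is labeled total and is the sum of the lines above it.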
While we're looking at the allocation of disk space, don't forget to check the root level, too. The results are interesting:
# du -s /
1471202 /
Oops! We don't want just a one-line summary, but rather all the directories contained at the topmost level of the file system. Oh, and do make sure that you're running these as root, or you'll see all sorts of odd errors. Indeed, even as root the /proc file system will sporadically generate errors as du tries to calculate the size of a fleeting process table entry or similar. You can ignore errors in /proc in any case.
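If those error messages clutter the screen, one common approach is to throw away du's standard error stream. The wrapper below is a sketch, and du_quiet is a name made up for illustration. Be aware that it hides every error, not just the harmless ones from /proc.

```shell
# du_quiet: summarize each argument in 1KB blocks, discarding error messages.
# (Hypothetical wrapper for illustration.)
du_quiet() {
    du -sk "$@" 2>/dev/null
}

qdir=$(mktemp -d)
# The second argument does not exist, so du would normally complain about it
du_quiet "$qdir" "$qdir/no-such-entry" || echo "du reported a problem (exit status $?)"
```

The exit status still tells you something went wrong, so a script can notice the failure even though the message itself is gone.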
One more try:
# du -s /*
5529    /bin
3683    /boot
244     /dev
4384    /etc
29808   /home
1       /initrd
67107   /lib
12      /lost+found
1       /misc
2       /mnt
1       /opt
1       /proc
1468    /root
8514    /sbin
12619   /tmp
1257652 /usr
80175   /var
0       /web
That's what I seek. Here you can see that the largest directory by a significant margin is /usr, weighing in at 1,257,652KB.
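To put a number on "significant margin," a small awk filter can turn du output into percentages of the combined total. This is a sketch rather than a standard tool: du_share is a hypothetical helper name, and it assumes every input line begins with a numeric size followed by a path.

```shell
# du_share: read "size path" lines (as printed by du -s) and print each
# entry's share of the combined total. (Hypothetical helper.)
du_share() {
    awk '{
            size[NR] = $1
            total += $1
            sub(/^[^ \t]+[ \t]+/, "")   # strip the size, keep the full path
            name[NR] = $0
         }
         END {
            for (i = 1; i <= NR; i++)
                printf "%5.1f%%  %s\n", (total ? size[i] * 100 / total : 0), name[i]
         }'
}

# Typical use: du -sk /* 2>/dev/null | du_share
# Here, /usr against the rest of the root file system, where
# 213550 = 1471202 (the du -s / total) - 1257652 (/usr).
printf '1257652\t/usr\n213550\t(everything else)\n' | du_share
```

With the figures from this system, /usr works out to about 85.5 percent of everything under /.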
Rather than calculate sizes, I'm going to use another du flag (-h) to ask for human-readable output:
# du -sh /*
5.4M    /bin
3.6M    /boot
244k    /dev
4.3M    /etc
30M     /home
1.0k    /initrd
66M     /lib
12k     /lost+found
1.0k    /misc
2.0k    /mnt
1.0k    /opt
1.0k    /proc
1.5M    /root
8.4M    /sbin
13M     /tmp
1.2G    /usr
79M     /var
0       /web
Much easier. Now you can see that /usr is 1.2GB in size, which is quite a lot!
Let's use du to dig into the /usr directory and see what's so amazingly big, shall we?
# du -sh /usr/*
121M    /usr/bin
4.0k    /usr/dict
4.0k    /usr/etc
40k     /usr/games
30M     /usr/include
3.6M    /usr/kerberos
427M    /usr/lib
2.7M    /usr/libexec
224k    /usr/local
16k     /usr/lost+found
13M     /usr/sbin
531M    /usr/share
52k     /usr/src
0       /usr/tmp
4.0k    /usr/web
103M    /usr/X11R6
It looks to me like /usr/share is responsible for more than 40 percent of the disk space consumed in /usr, with /usr/lib not far behind at 427MB, and /usr/bin and /usr/X11R6 the next largest directories.
You can easily step into /usr/share and run du again to see what's inside, but before we do, it will prove quite useful to take a short break and talk about sort and how it can make the analysis of du output considerably easier.
Before we leave this section to talk about sort, though, let's have a quick peek at du within the Darwin environment:
# du -sk *
5888    Desktop
396760  Documents
84688   Library
0       Movies
0       Music
31648   Pictures
0       Public
32      Sites
Notice that I've specified the -k flag here to force 1KB blocks (similar to df, the default for du is 512-byte blocks). Otherwise, it's identical to Linux.
The du output on Solaris is reported in 512-byte blocks unless, like Darwin, you force 1KB blocks with the -k flag:
# du -sk *
1       bin
1689    boot
4       cdrom
372     dev
13      devices
2363    etc
10      export
0       home
8242    kernel
1       lib
8       lost+found
1       mnt
0       net
155306  opt
1771    platform
245587  proc
5777    sbin
32      tmp
25      TT_DB
3206    users
667265  usr
9268    var
0       vol
9       xfn
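If you're ever handed numbers from a du that was run without -k on one of these systems, converting is simple arithmetic: two 512-byte blocks make one kilobyte. Here's a minimal sketch; blocks512_to_kb is an invented name, and it rounds odd block counts up.

```shell
# blocks512_to_kb: convert a count of 512-byte blocks to 1KB blocks,
# rounding up. (Hypothetical helper for illustration.)
blocks512_to_kb() {
    echo $(( ($1 + 1) / 2 ))
}

# 11776 is what a 5888KB directory, like Desktop in the Darwin listing,
# would show as in 512-byte blocks
blocks512_to_kb 11776
```

In practice it's easier to just add -k and let du do the work, which is why I use that flag on every platform.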
This section has demonstrated the helpful du command, showing how -a, -s, and -h can be combined to produce a variety of different output. You've also seen how successive du commands can help you zero in on disk space hogs, foreshadowing the diskhogs shell script we'll be developing later in this hour.