Adding Structure
I’ve talked about files and how the range of disk blocks for a file is found, but how is the file’s inode or the first block in a FAT filesystem found? With a directory. A phone directory is a big book containing a list of name-to-number mappings. A filesystem directory is a very similar concept, except with names of files rather than people, and block or inode numbers rather than phone numbers.
A directory in a UNIX system is simply a file containing a list of filenames and inode numbers. On a FAT filesystem, the file’s metadata is also stored in the directory, but apart from that difference the two schemes are very similar.
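To make the name-to-number mapping concrete, here is a minimal Python sketch that prints nothing and simply returns the (name, inode number) pairs held in a directory. The helper names `list_directory` and `demo_directory_listing` are my own, and the example assumes a POSIX-style filesystem; on other platforms the "inode" value is whatever identifier the OS reports.

```python
import os
import tempfile


def list_directory(path):
    """Return sorted (name, inode number) pairs for the entries in a directory."""
    with os.scandir(path) as entries:
        return sorted((entry.name, entry.inode()) for entry in entries)


def demo_directory_listing():
    """Create two files in a scratch directory and list their directory entries."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("a.txt", "b.txt"):
            with open(os.path.join(tmp, name), "w") as f:
                f.write("x")
        return list_directory(tmp)


if __name__ == "__main__":
    for name, inode in demo_directory_listing():
        print(name, inode)
```

The inode numbers themselves vary from run to run and from system to system; the point is only that each name in the directory maps to one.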
One reason UNIX stores attributes in the inode instead of the directory is to allow hard links. Since all of the information about a file (apart from its name) is stored in the inode, you can easily create two directory entries for the same file, and the two entries will be alike in all respects except for the name. There’s no distinction between hard links and "real files." When a link is created, the file’s reference count (its link count) is incremented; when the link is destroyed, the count is decremented. When the count reaches zero (and no process has the file open), the file is deleted.
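The link-count behavior is easy to observe from user space. The following sketch, which assumes a POSIX filesystem that supports hard links, creates a second name for a file, checks that both names share an inode, and then removes the original name. The function name `demo_hard_link` and the filenames are purely illustrative.

```python
import os
import tempfile


def demo_hard_link():
    """Show how the link count changes as names are added and removed."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "first_name.txt")
        second = os.path.join(tmp, "second_name.txt")
        with open(first, "w") as f:
            f.write("hello")

        count_before = os.stat(first).st_nlink

        # Add a second directory entry pointing at the same inode.
        os.link(first, second)
        same_inode = os.stat(first).st_ino == os.stat(second).st_ino
        count_after = os.stat(first).st_nlink

        # Removing one name only decrements the count; the data survives.
        os.unlink(first)
        with open(second) as f:
            content = f.read()
        count_final = os.stat(second).st_nlink

        return count_before, count_after, count_final, same_inode, content


if __name__ == "__main__":
    print(demo_hard_link())
```

Neither name is the "real" one: after the first name is unlinked, the contents are still reachable through the second.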
Traditionally, UNIX didn’t allow hard links to directories. The reason is quite simple: Without hard links to directories, it’s impossible to create loops in the directory structure. This makes it easy for backup tools to copy every file in the filesystem. With hard links to directories, the backup tool would have to record the inode number of every file it had copied, and check each new inode against that record to make sure that it was really new.
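The bookkeeping described above might look something like the following sketch. It is not how any particular backup tool works; `unique_files` and `demo_unique_files` are hypothetical helpers. Since most systems won't let you hard-link a directory, the demonstration uses a hard-linked file instead, but the same "have I seen this inode before?" check is what would be needed to break a directory loop. A POSIX filesystem is assumed.

```python
import os
import tempfile


def unique_files(root):
    """Walk a tree and return each distinct file once, keyed by (device, inode)."""
    seen = set()
    result = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a predictable order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            info = os.lstat(path)
            key = (info.st_dev, info.st_ino)
            if key in seen:
                continue  # another name for a file already recorded
            seen.add(key)
            result.append(os.path.relpath(path, root))
    return result


def demo_unique_files():
    """Build a small tree containing one hard-linked file and scan it."""
    with tempfile.TemporaryDirectory() as tmp:
        os.mkdir(os.path.join(tmp, "sub"))
        data = os.path.join(tmp, "data.txt")
        with open(data, "w") as f:
            f.write("payload")
        os.link(data, os.path.join(tmp, "sub", "link.txt"))
        with open(os.path.join(tmp, "sub", "other.txt"), "w") as f:
            f.write("different file")
        return unique_files(tmp)


if __name__ == "__main__":
    print(demo_unique_files())
```

The inode number alone isn't enough as a key, because it is only unique within one filesystem; pairing it with the device number avoids false matches when the tree spans several volumes.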
With OS X 10.5, Apple allowed hard links to directories. Time Machine uses this feature for storing incremental backups, by hard-linking directories whose contents haven’t changed to the version in the previous backup.
UNIX also allows another kind of link: the symbolic link (often called a soft link). A symbolic link is a file containing a path, with some metadata set to mark it as a link; without that metadata, it would be just a regular file containing a path. A number of system calls have options to follow or not follow links. Windows 95 shortcuts came close to this mechanism, but they were implemented in Explorer rather than in the standard low-level file operations and therefore weren’t very useful. Since symbolic links point to filenames, they can point to files on other volumes, which hard links can’t do.
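The follow/don’t-follow distinction shows up directly in the `stat` family of calls. In this sketch (again assuming a POSIX system where unprivileged users can create symbolic links; `demo_symlink` is just an illustrative name), `os.lstat` examines the link itself while `os.stat` follows it to the target. Renaming the target afterward leaves the link dangling, because the link stores only a path.

```python
import os
import stat
import tempfile


def demo_symlink():
    """Compare following and not following a symbolic link, then break it."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target.txt")
        link = os.path.join(tmp, "link.txt")
        with open(target, "w") as f:
            f.write("hello")
        os.symlink(target, link)

        # lstat does not follow the link, so it reports the link itself.
        is_link = stat.S_ISLNK(os.lstat(link).st_mode)
        # stat follows the link and reports the regular file behind it.
        reaches_file = stat.S_ISREG(os.stat(link).st_mode)
        # The link's content is nothing more than the stored path.
        stores_path = os.readlink(link) == target

        # Move the target; the link still exists but now points nowhere.
        os.rename(target, os.path.join(tmp, "moved.txt"))
        dangling = os.path.islink(link) and not os.path.exists(link)

        return is_link, reaches_file, stores_path, dangling


if __name__ == "__main__":
    print(demo_symlink())
```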
Mac OS introduced a third, more elegant form of link, known as the alias. HFS stored a unique ID for every file on the drive, and this ID stays with the file when it is renamed or moved. Aliases are like symbolic links, except that they point to the file ID rather than the name (an alias also records which volume the file lives on, since the ID is only unique within that volume). Moving the file won’t break an alias, as it would a soft link.
Symbolic links and aliases can both cross filesystem boundaries, while hard links cannot. This means that both can be used to point to files on networked or removable drives that may not be present.