- The Network Is the Filesystem, or Something Like That
- NFS, the Default Network Filesystem
- OpenAFS: Complex but Amazingly Powerful
- Summary
NFS, the Default Network Filesystem
Sun Microsystems' Network File System, better known simply as NFS, is the most common networked filesystem in use today, largely because it comes pre-installed and free with almost every Unix and Unix-like system. NFS clients and servers are also available for almost every type of modern computer system, including DOS, Windows, and Macintosh. Developed at Sun Microsystems in the early 1980s, the NFS protocol has been revised and enhanced a number of times between then and now. Its specifications have been publicly available since shortly after it was first released, making NFS a de facto standard for distributed filesystems.
NFS is amazingly easy to configure and activate, and provides a very effective out-of-the-box distributed computing environment for sites that do not want the administrative overhead of installing, configuring, and administering a more complex but higher-performance distributed filesystem.
The underlying network communication method used by NFS is known as Remote Procedure Calls (RPCs), which can use either the lower-level UDP (User Datagram Protocol) or TCP (Transmission Control Protocol) as their network transport mechanism; NFS version 2 normally runs over UDP, while NFS version 3 and later can also use TCP. RPC is a client/server communication method in which a client issues a procedure call, with various parameters, that is actually executed on the server. The client doesn't need to know whether the procedure call is being executed locally or remotely, receiving the results of the RPC in exactly the same way that it would receive the results of a local procedure call.
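To make that call transparency concrete, here is a minimal sketch in Python. The class and method names (`RpcServer`, `RpcClient`, `dispatch`) are invented for illustration and are not the real ONC RPC API; the point is only that the client-side stub makes a remote procedure look like a local one:

```python
# Illustrative RPC stub: names are hypothetical, not the ONC RPC API.

class RpcServer:
    def __init__(self):
        self.procedures = {}

    def register(self, name, func):
        self.procedures[name] = func

    def dispatch(self, name, args):
        # In real NFS, this request would arrive over UDP or TCP
        # with the arguments marshalled into XDR.
        return self.procedures[name](*args)

class RpcClient:
    def __init__(self, server):
        self.server = server

    def __getattr__(self, name):
        # Return a stub: calling it forwards the request to the server,
        # so the caller cannot tell it is not a local procedure.
        def stub(*args):
            return self.server.dispatch(name, args)
        return stub

server = RpcServer()
server.register("add", lambda a, b: a + b)
client = RpcClient(server)
result = client.add(2, 3)   # looks exactly like a local call
```

The caller never touches the transport; it simply invokes `client.add(2, 3)` and receives a return value, which is the essence of the RPC model.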
RPCs work by using a technique known as marshalling, which essentially means packaging up all the arguments to the RPC on the client into a mutually agreed-on format. This format, known as XDR (eXternal Data Representation), provides a sort of computer Esperanto that enables systems with different architectures and byte orders to safely exchange data with each other. Clients marshall their data into XDR packets that are unmarshalled by the server. After executing the requested procedure, the server marshalls its response and ships it back to the client, which unmarshalls the XDR packet and does whatever it should with the data that it has received. Marshalling and unmarshalling, plus the use of the common XDR data representation, make it possible for different types of systems to transparently communicate and execute functions on each other.
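The XDR wire format itself is simple: integers are four big-endian bytes, and strings are a four-byte length followed by the bytes, zero-padded to a four-byte boundary. The following sketch marshals and unmarshals those two basic types by hand with Python's struct module, rather than using any real XDR library:

```python
import struct

def xdr_pack_int(value):
    # XDR encodes integers as 4 big-endian bytes (network byte order).
    return struct.pack(">i", value)

def xdr_pack_string(s):
    # XDR strings are a 4-byte length followed by the bytes,
    # zero-padded so the data portion is a multiple of 4 bytes.
    data = s.encode("ascii")
    pad = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad

def xdr_unpack_string(buf, offset=0):
    # Reverse of xdr_pack_string: read the length, the bytes, skip
    # the padding, and report where the next field begins.
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    data = buf[start:start + length]
    pad = (4 - length % 4) % 4
    return data.decode("ascii"), start + length + pad

# A client "marshals" its arguments into one packet...
packet = xdr_pack_int(42) + xdr_pack_string("hello")
```

Because every machine agrees on this byte-level layout, a big-endian server can unmarshal a packet produced by a little-endian client without any further negotiation.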
NFS servers are stateless, meaning that they do not retain information about their clients across system restarts. If a server crashes while a client is attempting to make an RPC to it, the client retries the RPC a few times and then gives up. Stateless operation makes the NFS protocol much simpler because the server does not have to worry about maintaining consistency between client and server data: because neither side keeps state about the other, there is nothing to reconcile after a reboot, and the data on the server's disk is always authoritative.
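The client's retry behavior can be sketched as a simple loop: reissue the RPC a few times, then give up, which corresponds roughly to what NFS calls a "soft" mount (a "hard" mount would retry forever). The helper and exception names below are invented for illustration:

```python
import time

class RpcTimeout(Exception):
    """Illustrative stand-in for an RPC that got no reply."""

class RpcGaveUp(Exception):
    """Raised after the client exhausts its retries."""

def call_with_retries(rpc, retries=3, delay=0.1):
    # Hypothetical sketch of NFS "soft" mount semantics: retry a
    # failed call a few times, then give up.
    for attempt in range(retries):
        try:
            return rpc()
        except RpcTimeout:
            if attempt == retries - 1:
                raise RpcGaveUp("server did not respond")
            time.sleep(delay)  # back off before retrying
```

Because the server keeps no per-client state, a retried request after a server reboot is indistinguishable from a fresh one, which is exactly why this simple-minded loop is safe.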
Although stateless operation simplifies things, it is also noisy, inefficient, and slow. When data from a client is saved back to a server, the server must write it synchronously, not returning control to the client until all the data has been saved to the server's disk. Newer versions of NFS do some limited write caching on clients in order to return control to client applications as quickly as possible: the client queues pending writes to NFS servers in memory, in the hope that it can bundle groups of them together and thus optimize its use of the network. In the current standard version of NFS (NFS version 3), cached client writes are still essentially dangerous because they are stored only in memory, and will therefore be lost if the client crashes before the write completes.
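That write-gathering idea, bundling groups of pending writes together, amounts to merging writes that are adjacent in the file so that fewer, larger write requests cross the network. A simplified sketch (the function is hypothetical, and overlapping writes are deliberately ignored) might look like this:

```python
def coalesce_writes(pending):
    # Hypothetical sketch of client-side write gathering: merge
    # pending (offset, data) writes that are adjacent in the file,
    # so fewer, larger WRITE requests go over the wire.
    merged = []
    for offset, data in sorted(pending):
        if merged and merged[-1][0] + len(merged[-1][1]) == offset:
            # This write starts exactly where the previous one
            # ended, so fold it into the same request.
            prev_off, prev_data = merged[-1]
            merged[-1] = (prev_off, prev_data + data)
        else:
            merged.append((offset, data))
    return merged
```

Three small writes at offsets 0, 2, and 8 collapse into two requests, and until those requests are actually sent and acknowledged, the queued data exists only in client memory, which is precisely the risk described above.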
The lack of persistent client-side caching also has a long-term operational impact because it limits the type of dependencies that NFS clients can have on NFS servers. Because clients do not cache data from the server across reboots, they must re-retrieve any information that they need after rebooting. This can slow the reboot process for any client that must execute binaries located on an NFS server as part of the reboot process, and if the server is unavailable, the client cannot boot at all. For this reason, most NFS clients contain a full set of system binaries, and typically share only user-oriented binaries and data via NFS.
Though NFS has been around almost since the beginning of Unix workstation history, it is by no means a done deal. NFS version 4, which is actively under development, resolves the biggest limitations of NFS version 3, most notably by adding real client-side data caching that survives reboots.
As with any distributed filesystem, all the NFS servers and clients in a single computing environment must use consistent user and group IDs to ensure that files created by a user on one system are still owned by that user when he or she uses another workstation in the same computing environment. The simplest way to do this is to use the same password file on all your systems, but this quickly becomes an administrative nightmare. Each time you add a user to one of your machines, you would have to add that same user to the password and group files on every other machine. This is essentially unmanageable as soon as you add the second user to your computing environment.
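The consistency problem is easy to see in code. Given the username-to-UID mappings parsed from each machine's password file, a hypothetical audit function can report every user whose numeric ID differs between machines, which is exactly the drift that hand-edited password files invite:

```python
def find_uid_conflicts(passwd_by_host):
    # Hypothetical audit: given {host: {username: uid}} parsed from
    # each machine's /etc/passwd, report users whose numeric UID
    # differs between machines.
    seen = {}        # username -> (host, uid) where first seen
    conflicts = []
    for host in sorted(passwd_by_host):
        for user, uid in passwd_by_host[host].items():
            if user in seen and seen[user][1] != uid:
                conflicts.append((user, seen[user], (host, uid)))
            else:
                seen.setdefault(user, (host, uid))
    return conflicts

# Two machines disagree about user "juser": files juser creates on
# one host will appear to belong to someone else on the other.
hosts = {"alpha": {"juser": 500}, "beta": {"juser": 501}}
```

A centralized map of users to IDs, consulted by every machine, eliminates this class of problem entirely, which is the role NIS plays.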
To remove the need for this sort of administrative nightmare, Sun introduced the Network Information System (NIS). NIS is a distributed authentication mechanism that provides a centralized repository of information that can be accessed over the network from any participating machine in the local administrative entity, which is known as an NIS domain. NIS was originally known as the "Yellow Pages," but was quickly renamed NIS because of a trademark conflict with another commercial entity whose name escapes me at the moment.
As with NFS, Sun was wise enough to release the basic specifications of NIS to the public, and NIS has since been made available on most Unix and Unix-like platforms. Much as with NFS, this openness has made NIS the most commonly used and popular distributed authentication mechanism today. You can't beat the price or easy availability, because NIS comes free with most Unix and Unix-like operating systems. NIS+, also from Sun Microsystems, is the successor to NIS. It organizes information hierarchically, much like LDAP, and was designed to handle the requirements of today's larger, more complex networks. Unfortunately, not all Unix and Unix-like operating system vendors who provide NFS also provide NIS+. Linux is a good example of this, which rules NIS+ out for serious use as far as I'm concerned. It's not like we're all stuck using Solaris any more.
NFS is a great out-of-the-box distributed computing solution that works on almost every platform and, along with NIS, is easy to set up, configure, and manage. Unfortunately, this simplicity brings an associated lack of elegance. I have major problems with the side effects of its stateless operation, its lack of support for disconnected operation (when a client must operate without access to the NFS file server), and its lack of an integrated volume management system and associated utilities for replicating data across servers. Other distributed filesystems, such as OpenAFS, Coda, and InterMezzo, are available to address these limitations, and provide more powerful (though more complex) solutions for distributed computing. Still, it's hard to argue with the fact that NFS "just works", one of the rarest things you can say about any computer software package or subsystem!