Inside Solaris File Systems
File systems are typically observed as a layer between an application and the I/O services providing the underlying storage. When you look at file system performance, you should focus on the latencies observed at the application level. Historically, however, we have focused on techniques that look at the latency and throughput characteristics of the underlying storage and have been flying in the dark about the real latencies seen at the application level.
With the advent of DTrace, we now have end-to-end observability, from the application all the way through to the underlying storage. This makes it possible to do the following:
- Observe the latency and performance impact of file-level requests at the application level.
- Attribute physical I/O to applications and/or files.
- Identify performance characteristics contributed by the file system layer, in between the application and the I/O services.
5.1 Layers of File System and I/O
We can observe file system activity at three key layers:
- I/O layer. At the bottom of a file system is the I/O subsystem providing the backend storage for the file system. For a disk-based file system, this is typically the block I/O layer. Other file systems (for example, NFS) might use networks or other services to provide backend storage.
- POSIX libraries and system calls. Applications typically perform I/O through POSIX library interfaces. For example, an application needing to open and read a file would call open(2) followed by read(2).
Most POSIX interfaces map directly to system calls, the exceptions being the asynchronous I/O interfaces. These are emulated by user-level thread libraries on top of POSIX pread/pwrite.
You can trace at this layer with a variety of tools: both truss and DTrace can trace the system calls an application makes. truss has significant overhead when used at this level since it stops and restarts the application at every system call. In contrast, DTrace typically adds only a few microseconds to each call.
- VOP layer. Solaris provides a layer of common entry points between the upper-level system calls and the file system: the file system vnode operations (VOP) interface layer. We can instrument this layer easily with DTrace. Historically, monitoring at this layer required special one-off tools built as kernel VOP-level interposer modules, a practice that adds significant instability risk and performance overhead.
Figure 5.1 shows the end-to-end layers for an application performing I/O through a file system.
Figure 5.1 Layers for Observing File System I/O