Monitoring Disk I/O
The primary tool to use in troubleshooting disk I/O problems is iostat.
In particular, use iostat -xn 5 during peak usage times to observe the I/O requests being directed at your disk devices.
iostat -xn 5
The system displays the following results:
                    extended device statistics
    r/s  w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.1  0.0    0.5    0.3  0.0  0.0   77.0   14.0   0   0 c0t0d0
    0.0  0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 fd0
    0.0  0.0    0.0    0.0  0.0  0.0    0.0   10.7   0   0 c0t2d0
    0.0  0.0    0.0    0.0  0.0  0.0    0.0    1.1   0   0 ultra5:vold(pid259)
                    extended device statistics
    r/s  w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0  0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0
    0.0  0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 fd0
    0.0  0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t2d0
    0.0  0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 ultra5:vold(pid259)
NOTE
As with the vmstat command, ignore the first block of output; it represents summary statistics accumulated since boot.
The iostat command outputs columns of data that are described in Table 19.2.
Table 19.2 iostat output
Column | Description
r/s    | Reads per second
w/s    | Writes per second
kr/s   | Kilobytes read per second
kw/s   | Kilobytes written per second
wait   | Average number of transactions waiting for service (queue length)
actv   | Average number of transactions actively being serviced (removed from the queue but not yet completed)
svc_t  | Average service time, in milliseconds (reported with -xn as wsvc_t for queued transactions and asvc_t for active transactions)
%w     | Percent of time there are transactions waiting for service (queue nonempty)
%b     | Percent of time the disk is busy (transactions in progress)
device | The device name of the disk
If you are seeing svc_t (service time) values of more than 30 ms on disks that are in use (more than 10% busy), end users will notice sluggish performance.
A disk that is more than 60% busy over sustained periods of time also indicates overuse of that resource.
iostat sometimes reports excessive svc_t (service time) readings for disks that are inactive. This happens because fsflush (a kernel activity) tries to keep the data in memory and on the disk up-to-date. Because many writes are issued over a very short period of time to random parts of the disk, a queue forms briefly and the average service time goes up. svc_t should be taken seriously only on a disk that is showing 5% or more activity.
The "wait" time reported by iostat refers to time spent by a process while waiting for block device (such as disk) I/O to finish. If iostat consistently reports %w > 5, the disk subsystem is too busy. In this case, one thing that is sometimes done is to reduce the size of the wait queue by setting the kernel parameter sd_max_throttle to 64 in the /etc/system file. This is a temporary solution until one of the following permanent remedies can be implemented.
NOTE
sd_max_throttle, an sd driver tunable parameter, determines the maximum number of commands that the sd driver can queue for submission to the host bus adapter (HBA) driver. By default, sd_max_throttle is 256. In Solaris, when the disk controller is fully populated with targets or has very fast disks (for example, RAID devices), commands can be queued faster than the sd driver can handle, and the limit of 256 can be reached. When that happens, SCSI transport failure messages are often displayed.
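If you decide to apply this stopgap, the entry below is a minimal sketch of what goes in /etc/system; the value of 64 is only a starting point, and the change does not take effect until the system is rebooted:

* Limit the number of commands sd queues for the HBA driver
set sd:sd_max_throttle=64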
Another possible cause of I/O problems is SCSI starvation, in which devices with low SCSI target IDs receive lower precedence than higher-numbered devices (such as a tape drive). The SCSI target numbers represent attachment points on the SCSI chain. On a narrow SCSI interface, each target number can address as many as eight devices (logical units). Higher SCSI target numbers receive better service: on a narrow bus, the priorities run 7 -> 0; on a wide bus, they run 7 -> 0 and then 15 -> 8. The host adapter is usually target 7. This can cause problems when busy disks and tape devices share a SCSI bus, because tape devices are usually assigned target 6.
Another indication of trouble with the disk I/O subsystem is when the procs/b column of vmstat persistently reports blocked processes. This column is the counterpart of the run queue (procs/r) column reported by vmstat.
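For example, running vmstat at a five-second interval makes this easy to spot. Ignore the first line (a summary since boot); a b value that stays above zero interval after interval while the disks are busy points at the I/O subsystem rather than at the CPU:

vmstat 5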
The usual solutions to a disk I/O problem are as follows:
Spread the I/O traffic across more disks, either by striping the file system (using Solaris Volume Manager or StorEdge Volume Manager) or by splitting the data across additional file systems on other disks (or even across other servers). A striping sketch follows this list.
Redesign the application to reduce the number of disk I/Os. Caching is one frequently used strategy, either via cachefs or application-specific caching.
Tweak kernel parameters such as the write throttle to provide better performance when there is a large amount of sequential write activity. Another such parameter is tune_t_fsflushr, which controls how frequently the fsflush daemon wakes up to scan memory for modified pages to write out. Setting fsflush to run less frequently can reduce disk activity, but it increases the risk of losing data that has been written only to memory (not yet flushed to disk) if the system crashes. These parameters can be adjusted with mdb while you look for an optimum value, and then set permanently in the /etc/system file (see the sketch after this list).
CAUTION
Modifying kernel parameters can cause undesirable effects and can even cause the system to crash. Do not attempt to modify kernel parameters without a thorough understanding of the possible implications.
Check for SCSI starvation; that is, look for busy high-numbered SCSI devices (such as tape drives) that take priority over lower-numbered devices on the same bus.
Direct database I/O to raw disk partitions, or use direct, unbuffered I/O (see the forcedirectio example below).
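For the striping approach, the following Solaris Volume Manager sketch builds a two-disk stripe with a 64 KB interlace and puts a file system on it. The metadevice name, disk slices, interlace size, and mount point are examples only, and the commands assume that state database replicas (created with metadb) already exist:

metainit d10 1 2 c1t1d0s2 c1t2d0s2 -i 64k
newfs /dev/md/rdsk/d10
mount /dev/md/dsk/d10 /export/data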
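For the kernel-parameter approach, tune_t_fsflushr can be examined and changed on the running system with mdb and then made permanent in /etc/system. The value of 10 seconds below is only an example; test it before committing to it:

# display the current value in decimal
echo "tune_t_fsflushr/D" | mdb -k
# change it to 10 seconds on the running kernel
echo "tune_t_fsflushr/W 0t10" | mdb -kw

To make the change persist across reboots, add the following line to /etc/system:

set tune_t_fsflushr=10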
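For direct, unbuffered I/O on a UFS file system that holds database files, the forcedirectio mount option bypasses the file system page cache. The device and mount point below are hypothetical:

mount -F ufs -o forcedirectio /dev/dsk/c1t0d0s6 /u01

The same option can be placed in the mount options field of the corresponding /etc/vfstab entry so that it is applied at every boot.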