- Introduction
- Performance and Disease
- Business Requirements
- Medical Analogues
- Lab Tests and Record Keeping
- Traps and Pitfalls
- Where Does the Time Go?
- Diagnostic Strategies
- Selected Tools and Techniques
- References
- Third-Party URLs
- Acknowledgments
- About the Author
- Ordering Sun Documents
- Accessing Sun Documentation Online
Selected Tools and Techniques
This section contains a survey of helpful tools for performance diagnosis, along with a description of features for each tool. Each tool has some value for gaining insight into how things work and where the time is going. Many of the tools are useful for routine monitoring and tuning activities. We start with two important yet often overlooked categories, then discuss tools specific to the Solaris OS.
Application-Specific Instrumentation
Most major application components (for example, database engines, transaction monitors, and web servers) feature facilities for performance monitoring and logging. These facilities are an obvious place to look for clues, and they often include metrics that can be tightly correlated with actual business goals. One should also consider these metrics in the overall scheme of health monitoring and capacity planning activities.
Log Files
It can be very embarrassing to find a log file with reams of errors and warnings indicating an obvious problem just after great effort has been expended discovering the same problem by indirect means. It is good practice to be fully familiar with the logging capabilities of various components, and fully aware of how they are configured.
On many occasions, the rate of diagnostic logging, the size of a log file, or contention for log file writing has been the root cause of a broader performance issue. There are often cases where turning up the diagnostic level only exacerbates an existing performance complaint.
ps(1)
In the Solaris OS, the ps(1) command includes features that are frequently overlooked by many users. In particular, the -o option allows selection of a specific set of column headings. For example, you can use the following command to gain insight into the thread-level priorities of all of the threads on a system:
$ ps -e -o pid,ppid,lwp,nlwp,class,pri,args
The Solaris OS also offers a Berkeley UNIX version of the ps command, invoked as /usr/ucb/ps. The man page is available by typing man -s 1b ps. One trick unique to the Berkeley version of ps is the ability to show the command arguments of each process by using multiple -w options, as in the following example:
$ /usr/ucb/ps -www
This command can sometimes help in shedding light on why a process is using excessive resources.
vmstat(1), mpstat(1), and netstat(1)
The stat commands comprise the largest category of performance monitoring tools, and they continue to evolve. For example, in the Solaris 8 OS, the addition of the -p option to the venerable vmstat(1) command greatly simplifies the observation of paging-related phenomena. Among the most useful top-level monitoring tools is mpstat(1). Some information reported by netstat(1) with the -s option is not readily available elsewhere. These tools are basic to spotting CPU, network, and memory utilization issues. They are the workhorses of system management and are covered well by most books pertaining to UNIX systems management.
prstat(1)
At first glance, prstat(1) appears to be a poor substitute for the top12 command. However, prstat has considerable diagnostic capabilities, while top has next to none. Two of the most interesting capabilities of prstat are its ability to do thread-level monitoring with the -L option and its ability to display microstate accounting data with the -m option. These features establish prstat as an important tool for forming hypotheses regarding root causes of performance issues.
Microstate accounting provides detailed information (optionally, at the thread level) on the breakdown of wall-clock time. As reported by prstat, the microstates are:
USR (percentage of time in non-trap user mode CPU mode)
SYS (percentage of time in kernel mode CPU mode)
TRP (percentage of time in user-mode trap handlers)
TFL (percentage of time waiting for text-page faults)
DFL (percentage of time waiting for data-page faults)
LCK (percentage of time waiting on user-mode locks)
SLP (percentage of time sleeping for other causes)
LAT (percentage of time waiting for CPU)
In addition, microstate data includes some rate data of great interest for performance analysis:
VCX (Voluntary ConteXt switches, rate at which a process or thread surrenders its CPU prior to consuming its time quantum)
ICX (Involuntary ConteXt switches, rate at which a process or thread has its CPU stolen by expiration of its time quantum or due to preemption by a higher-priority thread)
SCL (rate of system calls)
SIG (rate of signal events)
Indeed, prstat is one of the most useful tools for gaining insight into where the time is going at both the process and thread levels.
iostat(1M)
The iostat command is one of the most commonly used, and perhaps least well-understood, of the stat commands. Much of the data reported by iostat is also available from sar(1M) or kstat(1M), but the format of iostat output is the most convenient and widely used for monitoring or investigating disk I/O.
My own preferred iostat usage is usually iostat -xnzTd, where:
x (which returns extended device statistics)
n (which returns more useful device names)
z (which eliminates rows with all zeroes)
-Td (which timestamps each report interval)
Digesting reams of iostat data can be a wearisome task, but tools have been developed that post-process data in this format. One extremely useful tool is an awk(1) script called iobal (see Smith in the References section). This handy tool summarizes iostat data in terms of bandwidth (megabytes per second) or I/O per second, and optionally, it lists the top ten most active targets. Recent versions also do controller aggregation (as iostat does with the -C option); thus, you can easily assess controller utilization from iostat data that was not collected with the -C option.
truss(1)
The primary capability of truss(1) is observing all of the system calls emanating from a process. Over time, truss(1) has been enhanced to provide a great deal of other functionality. In some cases, the functionality overlaps the capabilities of sotruss(1) and apptrace(1), but truss(1) has the ability to attach to an already running process.
CAUTION
truss(1) is intrusive on the monitored process. Depending on the options, the intrusiveness can range from slight to severe.
Running truss -c -p pid against a process for a brief interval, then interrupting it (for example, with Ctrl+C), gives a concise summary of the frequency of systems calls and their average execution times. This can be quite handy for finding system-call hot spots, or on some occasions, bringing system-call implementation efficiency into question.
The -c, -d, and -D options are among the handiest features of truss(1). The ability of truss(1) to calculate timestamps with the -d option and time deltas with the -D option can be quite useful for finding out where the time is going. However, be aware that the time deltas are easily misinterpreted. It is easy to think that the time delta between two reported events represents the time required by the former event, but in fact, the time delta includes any and all compute time between reported events. In addition, because truss(1) has a significant probe effect inherent in the timing data, taking the timing data too literally would not be appropriate.
lockstat(1M)
lockstat(1M) is a very versatile tool for gaining significant insights into where the hots spots are in the system with respect to locking activity. Hot locks are a basic indicator of contention in a system. As a "starter incantation," one might try the following command:
$ lockstat -H -D64 sleep 10 > lockstat.out
This command tabulates the top 64 lock events in several categories over a period of 10 seconds. When using this command, you can safely ignore any error messages about "too many records." The right-hand column in the output consists of cryptic programming symbols. Although a programmer can extract more information from this data than a non-programmer, the symbols are usually named so that they give good clues as to where the system is busy.
The following list gives some examples:
ufs_ (the UFS file system)
va_ (virtual address primitives)
vx_ (something in the VERITAS software)
lwp_ (light-weight process [that is, thread] management)
cv_ (condition variables, synchronization primitives)
High lock dwell times or high lock iteration counts can have significant diagnostic power by indicating what parts of the system are most active.
The nanosecond values reported by lockstat(1M) can take some getting used to. A nanosecond is 10-9 seconds, so the first digit of a seven-digit nanosecond value is in milliseconds (10-3 seconds).
kstat(1M)
The kstat(1M) command reports on statistics that are maintained by the lightweight kstat facility within the Solaris OS kernel. The command interface is a bit awkward, and kstat would be much improved by the addition of error messages when incorrect arguments are used. Nevertheless, its ability to pull statistics by class or module is very versatile and efficient, and kstat) is now preferred over the deprecated netstat -k command.
Because the kstat facility keeps only counts, it is useful to employ tools on top of this interface to compute, log, and display rates. The most famously useful set of tools for this purpose is the SE Toolkit by Adrian Cockcroft and Richard Pettit, which is freely available from http://www.setoolkit.com.
trapstat(1M)
Introduced in the Solaris 9 8/03 OS release13, the trapstat(1M) command tabulates and formats information about time spent handling certain low-level events. For example, trapstat -t reports on the percentage of time spent in handling memory management tasks.
High-percentage times spent in memory management can indicate memory stride problems or opportunities for code optimization. Here again, the tool is largely in the domain of gurus, but casual users can extract some useful information about where the time is going.
busstat(1M), cpustat(1M), and cputrack(1M)
These tools enable specialists to investigate issues previously described as being esoteric. Productive use of these tools requires very specific low-level knowledge of system components. Over time, other tools should evolve to automate some of the analyses that are possible based on data from these tools.
Of particular interest are measurements of efficiency such as "cycles per instruction" (CPI), which characterizes low-level overall CPU efficiency. A free tool to derive CPI measures, called har, is available online from Sun's Market Development Engineering web site at http://www.sunmde.com/perf_tools/har.
Sun™ ONE Studio Developer Tools
Sun's programming development suite includes tools for performance analysis. Of particular interest is the ability to gather performance data from noninstrumented binaries (collect) and analyze the collected data either by command line (er_print) or GUI (analyzer). Much of the information on the use of these tools is available on the Internet. Search on the keywords "er_print collect analyzer" to find links to official sources, as well as a wide variety of how-to material.
The Sun ONE Studio developer tools are the tools of choice for identifying code optimization targets and for measuring the impact of incremental code refinements. It can be a very useful exercise, even for non-programmers, to get some firsthand experience on the capabilities of these tools.