Monitoring RAM and Swap
Determining how much memory is in use and how much is free has always been a source of confusion in the Solaris environment. Usually we can identify a memory shortfall by watching swap space usage. Remember that the Solaris operating system starts to use swap space when it runs out of physical memory; we refer to this as paging. We can watch swap activity by examining output from the vmstat command. Two indicators of a RAM shortage are the scan rate and swap device activity. Watch the 12th column of vmstat output (sr, the scan rate) in conjunction with the I/O traffic displayed by the iostat -Pxn command, paying particular attention to the swap partitions. The r/s and w/s columns might also show high figures when a large amount of I/O is being generated through the file system and the page scanner has to run to free up pages for that I/O.
Here's output from vmstat:
 kthr    memory                page                disk         faults        cpu
 r b w   swap  free  re  mf  pi  po   fr de  sr dd f0 s0 --   in   sy   cs us sy id
 2 0 0 549048  3928  10 417 336 181  355  0  96 40  0  0  0  490 1209  594 64 12 24
 0 0 0 542872  2736  31 420 475 333  680  0 159 50  0  0  0  510 1548  576 43 14 43
 1 0 0 536488  1816   0 179 281 589 1258  0 287 38  0  0  0  490 1234  611 51  9 40
 1 0 0 534360  1872  13 158 350 307  594  0 135 25  0  0  0  461 1586  592 75 10 15
Here's output from the iostat -Pxn command:
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.6    0.6    4.8   76.8  0.0  0.0    0.0   21.0   0   3 c0t0d0s0
    0.2    0.2    0.2    0.4  0.0  0.0    0.0   22.7   0   1 c0t0d0s1
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s3
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s4
    9.8    0.0  152.0    0.0  0.0  0.1    0.1   15.2   0  13 c0t0d0s5
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s6
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 ultra5:vold(pid259)
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.2    1.2    1.6  129.6  0.0  0.0    0.5   28.8   0   4 c0t0d0s0
    0.4    0.2    3.2    1.6  0.0  0.0    0.0   20.2   0   1 c0t0d0s1
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s3
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s4
   27.2    1.0  443.6    8.0  0.0  0.5    0.8   17.1   2  38 c0t0d0s5
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s6
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 ultra5:vold(pid259)
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.4    0.8    3.2  102.4  0.0  0.0    0.0   24.8   0   3 c0t0d0s0
    0.0    1.8    0.0   14.4  0.8  0.0  451.1   18.2  11   2 c0t0d0s1
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s2
    0.0    0.4    0.0    3.2  0.2  0.0  570.0    7.4  11   0 c0t0d0s3
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s4
   67.2   16.2  782.8  129.6  7.3  0.9   87.3   10.2  17  65 c0t0d0s5
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s6
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 ultra5:vold(pid259
With vmstat, run vmstat 30 to sample memory usage every 30 seconds. Ignore the first line of output, which reports summary statistics accumulated since boot. If page/sr stays above 200 pages per second for an extended time, your system might be running short of physical memory. In the example shown, the scan rate ranges from 96 to 287 pages per second, which indicates a potential memory shortfall.
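The scan-rate check described above can be sketched in a few lines of Python. This is only an illustration: it reads the four sample rows from the vmstat output shown earlier rather than a live system, and the function names and the min_samples parameter (how many consecutive readings count as "an extended time") are assumptions made for this example, not part of vmstat.

```python
# Data rows copied from the sample vmstat output in this section.
VMSTAT_ROWS = """\
 r b w   swap  free  re  mf  pi  po   fr de  sr dd f0 s0 --   in   sy   cs us sy id
 2 0 0 549048  3928  10 417 336 181  355  0  96 40  0  0  0  490 1209  594 64 12 24
 0 0 0 542872  2736  31 420 475 333  680  0 159 50  0  0  0  510 1548  576 43 14 43
 1 0 0 536488  1816   0 179 281 589 1258  0 287 38  0  0  0  490 1234  611 51  9 40
 1 0 0 534360  1872  13 158 350 307  594  0 135 25  0  0  0  461 1586  592 75 10 15
"""

SR_COLUMN = 11       # zero-based index of the 12th column (sr)
SR_THRESHOLD = 200   # pages per second, the rule of thumb from the text


def scan_rates(text):
    """Return the sr value from every data row of vmstat output."""
    rates = []
    for line in text.splitlines():
        fields = line.split()
        # Header lines start with a word, data rows start with a number.
        if len(fields) <= SR_COLUMN or not fields[0].isdigit():
            continue
        rates.append(int(fields[SR_COLUMN]))
    return rates


def sustained_shortfall(rates, threshold=SR_THRESHOLD, min_samples=3):
    """True if the last min_samples readings all exceed the threshold.

    min_samples is a hypothetical knob standing in for "an extended time".
    """
    recent = rates[-min_samples:]
    return len(recent) == min_samples and all(r > threshold for r in recent)


rates = scan_rates(VMSTAT_ROWS)
print("scan rates:", rates)
print("peak:", max(rates))
print("sustained above threshold:", sustained_shortfall(rates))
```

When feeding real vmstat output to a script like this, drop the first data row before evaluating, because it holds the summary since boot rather than a current sample.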
In the output from iostat -Pxn, watch the row for c0t0d0s0 (the swap partition). If there are I/Os queued for the swap device (the svc_t column, reported as wsvc_t and asvc_t in this output format), application paging is occurring. If there is sustained, heavy I/O to the swap device, you might be experiencing a RAM shortage, and a RAM upgrade may be in order. Sometimes you can even hear the disk paging. How do you determine heavy swap I/O? As stated in the preceding section, any disk that is consistently more than 10% busy with svc_t above 30ms is getting heavy I/O. Also, compare the figures from a busy system against your baseline figures to determine whether the disks are experiencing higher-than-average activity.
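As a rough sketch of the heavy-I/O rule of thumb (more than 10% busy with service time above 30ms), the following Python parses rows taken from the third iostat report shown earlier, with the idle slices left out. It assumes that the single svc_t figure corresponds to wsvc_t plus asvc_t in this output format; that mapping, along with the function names and default thresholds, is an assumption for illustration only.

```python
# Non-idle rows from the third iostat report in this section.
IOSTAT_SAMPLE = """\
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.4    0.8    3.2  102.4  0.0  0.0    0.0   24.8   0   3 c0t0d0s0
    0.0    1.8    0.0   14.4  0.8  0.0  451.1   18.2  11   2 c0t0d0s1
    0.0    0.4    0.0    3.2  0.2  0.0  570.0    7.4  11   0 c0t0d0s3
   67.2   16.2  782.8  129.6  7.3  0.9   87.3   10.2  17  65 c0t0d0s5
"""


def parse_iostat(text):
    """Turn iostat extended output into a list of per-device dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r/s":          # column header line
            header = fields
            continue
        # Skip banner lines and anything that does not match the header.
        if header is None or len(fields) != len(header):
            continue
        row = dict(zip(header, fields))
        for name in header[:-1]:        # every column except the device name
            row[name] = float(row[name])
        rows.append(row)
    return rows


def heavy_io(rows, busy_pct=10.0, svc_ms=30.0):
    """Return (device, %b, total service time) for devices over both limits.

    Assumption: total service time = wsvc_t + asvc_t.
    """
    flagged = []
    for row in rows:
        total_svc = row["wsvc_t"] + row["asvc_t"]
        if row["%b"] > busy_pct and total_svc > svc_ms:
            flagged.append((row["device"], row["%b"], round(total_svc, 1)))
    return flagged


for device, busy, svc in heavy_io(parse_iostat(IOSTAT_SAMPLE)):
    print(device, "busy:", busy, "service time (ms):", svc)
```

A single report only shows one interval. To apply the "consistently" part of the rule, a device would need to be flagged across several consecutive reports and compared against baseline figures.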
If physical memory is inadequate, the system will be so busy paging to the swap device that it will be unable to keep up with demand. We refer to this state as thrashing, and it is characterized by heavy I/O on the swap device and sluggish performance. In this state, the page scanner (the part of the kernel that frees memory by paging out) can use up to 80% of the CPU.
If you have a RAM shortage and you don't have enough swap space to handle the overload, new processes will be unable to start. You might even see one of the following errors displayed:
"Not enough space"
or
"WARNING: /tmp: File system full, swap space limit exceeded."