6.3 Controlling the Processes
Controlling processes in Solaris includes clearing hung processes, terminating unwanted or misbehaving processes, changing the execution priority of a process, suspending a process, and resuming a suspended process. The following sections describe the ways in which processes can be controlled in Solaris.
6.3.1 The nice and renice Commands
If you plan to run a CPU-intensive process, you should understand the nice value of a process and the nice command. The nice value of a process represents the priority of the process. Every process has a nice value in the range from 0 to 39, with 39 being the nicest. The higher the nice value, the lower the priority. By default, user processes start with a nice value of 20. You can see the current nice value of a process in the NI column of a long ps listing (for example, ps -l).
The nice command can be used to alter the default priority of a process at the start time. Following is an example of how to start a process with lower priority:
# nice -n 5 proc_exp arg1 arg2 arg3
This command will start the process proc_exp with nice value 25, which will be higher than the nice value 20 of other running processes and hence proc_exp will have lower priority.
Following is an example of how to start a process with higher priority (a negative increment requires superuser privileges):
# nice -n -5 proc_exp arg1 arg2 arg3
This command will start the process proc_exp with nice value 15, which will be less than the nice value 20 of other running processes and hence proc_exp will have higher priority.
The renice command can be used to alter the nice value of running processes. If proc_exp having PID 1234 was started with its default nice value of 20, the following command will lower the priority of this process by increasing its nice value to 25.
# renice -n 5 1234
or
# renice -n 5 -p 1234
The following command will increase the priority of proc_exp by decreasing its nice value to 15.
# renice -n -5 1234
or
# renice -n -5 -p 1234
For more information, see the nice(1) and renice(1) man pages.
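The nice and renice steps above can be combined into one short script. The following is a minimal sketch, not a definitive procedure: sleep stands in for proc_exp, and the nice value is read back with the POSIX ps output field named nice. Some non-Solaris renice implementations treat the -n value as an absolute nice value rather than an increment, so the sketch prints the values before and after instead of assuming them.

```shell
#!/bin/sh
# Start a placeholder workload with its nice value raised by 5 (lower priority).
nice -n 5 sleep 30 &
pid=$!
sleep 1   # give the child a moment to start

# Read the current nice value of the process.
nice_before=$(ps -o nice= -p "$pid" | tr -d ' ')

# Raise the nice value again while the process is running.
renice -n 10 -p "$pid" >/dev/null

nice_after=$(ps -o nice= -p "$pid" | tr -d ' ')
echo "nice value: $nice_before -> $nice_after"

# Clean up the placeholder process.
kill "$pid"
```

An unprivileged user can only raise the nice value this way; lowering it again requires superuser privileges.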
6.3.2 Signals
Solaris supports the concept of signals, which are software interrupts. Signals can be used for communication between processes. Some signals, such as SIGFPE and SIGSEGV, are generated synchronously by an error in an application, but most signals are asynchronous. A signal notifies the receiving process about an event. A signal can be generated in the following ways:
- When a user presses certain terminal keys, the terminal generates a signal; for example, when the user interrupts a program by pressing the Ctrl+C key combination.
- Hardware exceptions can also generate signals; for example, division by 0 generates the SIGFPE (Floating Point Error) signal and an invalid memory reference generates the SIGSEGV (Segmentation Violation) signal.
- The operating system kernel can generate a signal to inform processes when something happens. For example, the SIGPIPE (Pipe Error) signal is generated when a process writes to a pipe that has been closed by the reader.
- Processes can send signals to other processes by using the kill(2) system call. A process can send a signal only within the limits of its privileges: the real or effective user ID of the sender must match that of the receiving process. The superuser can send signals without any restrictions.
There is also a Solaris command called kill that sends signals from the command line. It is subject to the same user ID restriction.
Every signal has a unique signal name and a corresponding signal number. For every possible signal, the system defines a default disposition, or action to take when it occurs. There are four possible default dispositions:
- Ignore: Ignores the signal; no action taken
- Exit: Forces the process to exit
- Core: Forces the process to exit, and creates a core file
- Stop: Stops the process (pause a process)
Programmers can code their applications to respond in customized ways to most signals. These custom pieces of code are called signal handlers. For more information on signal handlers, see the signal(3C) man page.
Two signals cannot be redefined by a signal handler: SIGKILL and SIGSTOP. SIGKILL always forces the process to terminate (Exit) and SIGSTOP always pauses a running process (Stop). These two signals cannot be caught, blocked, or ignored.
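In a shell script, the trap built-in plays the role of a signal handler. The following is a minimal sketch, assuming a POSIX shell; the variable name caught is hypothetical and used only for illustration.

```shell
#!/bin/sh
caught=none

# Install a handler for SIGUSR1; the shell runs this string when the signal is delivered.
trap 'caught=USR1' USR1

# Send SIGUSR1 to this shell itself ($$ is the shell's own PID).
kill -s USR1 $$

echo "handler result: $caught"

# Restore the default disposition for SIGUSR1.
trap - USR1
```

A handler for SIGKILL or SIGSTOP cannot be installed this way, because those two signals cannot be caught.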
Several other key points about signals are listed below:
- When a signal occurs, the signal is said to be generated.
- When the action for a signal is taken, the signal is said to be delivered.
- Between generation and delivery, the signal is said to be pending, as shown in Figure 6.1.
Figure 6.1 Signal States
- A process can block a signal. If a blocked signal is generated and the process does not ignore it, the signal remains pending until the process unblocks it.
- A blocked signal can be generated more than once before the process unblocks it. The kernel can then deliver the signal either once or once per occurrence. If every occurrence is delivered, the signal is said to be queued; if only one is delivered, it is not queued. Normally, if multiple copies of a signal are generated for a process while that signal is blocked, only a single copy is delivered when the signal becomes unblocked.
- Each process has a signal mask, which defines the set of signals that are blocked for that process. The mask is a bit array with one bit for each signal. If a bit is set, the corresponding signal is blocked.
Table 6.8 provides the list of the most common signals an administrator is likely to use, along with a description and default action.
Table 6.8. Solaris Signals
Name | Number | Default Action | Description
--- | --- | --- | ---
SIGHUP | 1 | Exit | Hangup. Usually means that the controlling terminal has been disconnected.
SIGINT | 2 | Exit | Interrupt. User can generate this signal by pressing Ctrl+C.
SIGQUIT | 3 | Core | Quits the process. User can generate this signal by pressing Ctrl+\.
SIGILL | 4 | Core | Illegal instruction.
SIGTRAP | 5 | Core | Trace or breakpoint trap.
SIGABRT | 6 | Core | Abort.
SIGEMT | 7 | Core | Emulation trap.
SIGFPE | 8 | Core | Arithmetic exception. Informs the process of a floating point error like divide by zero.
SIGKILL | 9 | Exit | Kill. Forces the process to terminate. This is a sure kill. (Cannot be caught, blocked, or ignored.)
SIGBUS | 10 | Core | Bus error.
SIGSEGV | 11 | Core | Segmentation fault. Usually generated when process tries to access an illegal address.
SIGSYS | 12 | Core | Bad system call. Usually generated when a bad argument is used in a system call.
SIGPIPE | 13 | Exit | Broken pipe. Generated when a process writes to a pipe that has been closed by the reader.
SIGALRM | 14 | Exit | Alarm clock. Generated by clock when alarm expires.
SIGTERM | 15 | Exit | Terminated. A gentle kill that gives the receiving process a chance to clean up.
SIGUSR1 | 16 | Exit | User defined signal 1.
SIGUSR2 | 17 | Exit | User defined signal 2.
SIGCHLD | 18 | Ignore | Child process status changed. For example, a child process has terminated or stopped.
SIGPWR | 19 | Ignore | Power fail or restart.
SIGWINCH | 20 | Ignore | Window size change.
SIGURG | 21 | Ignore | Urgent socket condition.
SIGPOLL | 22 | Exit | Pollable event occurred or Socket I/O possible.
SIGSTOP | 23 | Stop | Stop. Pauses a process. (Cannot be caught, blocked, or ignored.)
SIGTSTP | 24 | Stop | Stop requested by user. User can generate this signal by pressing Ctrl+Z.
SIGCONT | 25 | Ignore | Continued. Stopped process has been continued.
SIGTTIN | 26 | Stop | Stopped (tty input).
SIGTTOU | 27 | Stop | Stopped (tty output).
SIGVTALRM | 28 | Exit | Virtual timer expired.
SIGPROF | 29 | Exit | Profiling timer expired.
SIGXCPU | 30 | Core | CPU time limit exceeded.
SIGXFSZ | 31 | Core | File size limit exceeded.
SIGWAITING | 32 | Ignore | Concurrency signal used by threads library.
SIGLWP | 33 | Ignore | Inter-LWP (Light Weight Processes) signal used by threads library.
SIGFREEZE | 34 | Ignore | Checkpoint suspend.
SIGTHAW | 35 | Ignore | Checkpoint resume.
SIGCANCEL | 36 | Ignore | Cancellation signal used by threads library.
SIGLOST | 37 | Ignore | Resource lost.
SIGRTMIN | 38 | Exit | Highest priority real time signal.
SIGRTMAX | 45 | Exit | Lowest priority real time signal.
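The mapping between signal numbers and names can also be checked from the shell with kill -l. The following is a minimal sketch that looks up only the low-numbered signals 1, 2, 9, and 15; the numbers of some other signals, such as SIGSTOP, differ between UNIX variants, so the variable names and the selection of numbers here are illustrative only.

```shell
#!/bin/sh
# Translate a few signal numbers into their names (printed without the SIG prefix).
for num in 1 2 9 15; do
    name=$(kill -l "$num")
    echo "$num -> SIG$name"
done

# Keep two of the lookups in variables for later use.
kill_name=$(kill -l 9)
term_name=$(kill -l 15)
```

Because the numbering is not identical everywhere, scripts that must run on more than one system are usually safer using signal names.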
Sometimes you might need to terminate or stop a process. For example, a process might be in an endless loop, it might be hung, or you might have started a long process that you want to stop before it has completed. You can send a signal to any such process by using the previously mentioned kill command, which has the following syntax:
kill [-<signal>] <pid>
The <pid> is the process ID of the process to which the signal is to be sent, and <signal> is the signal number of any of the signals from Table 6.8. If you do not specify a value for <signal>, then 15 (SIGTERM) is used by default. If you use 9 (SIGKILL) for <signal>, then the process terminates promptly.
However, be cautious when using signal number 9 to kill a process. It terminates the receiving process immediately. If the process is in the middle of a critical operation, this might result in data corruption.
For example, if you kill a database process or an LDAP server process using signal number 9, then you might lose or corrupt data contained in the database. A good policy is to always use the kill command without specifying a signal first, and then wait a few minutes to see whether the process terminates gently before you issue the kill command with the -9 signal. Using the kill command without a signal number sends the SIGTERM (15) signal to the process with the PID <pid>, which gives the receiving process a chance to clean up before terminating and avoids data corruption.
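The policy of trying SIGTERM first and falling back to SIGKILL can be sketched as a small shell function. This is a hedged illustration, not a standard Solaris tool: the function name graceful_kill, its variables, and the timeout values are all hypothetical, and sleep stands in for the process to be terminated.

```shell
#!/bin/sh
# Hypothetical helper: send SIGTERM, wait up to $2 seconds, then send SIGKILL.
graceful_kill() {
    gk_pid=$1
    gk_left=${2:-10}
    # If the signal cannot be sent, the process is gone or not ours to signal.
    kill -s TERM "$gk_pid" 2>/dev/null || return 0
    while [ "$gk_left" -gt 0 ]; do
        # kill -0 sends no signal; it only checks whether the PID still exists.
        kill -0 "$gk_pid" 2>/dev/null || return 0
        sleep 1
        gk_left=$((gk_left - 1))
    done
    # Still present after the timeout: fall back to the sure kill.
    kill -s KILL "$gk_pid" 2>/dev/null
}

# Usage: terminate a placeholder process, waiting at most 3 seconds.
sleep 60 &
target=$!
graceful_kill "$target" 3

# Collect the exit status; 128 + 15 indicates termination by SIGTERM.
status=0
wait "$target" || status=$?
echo "exit status: $status"
```

For a real database or directory server, a much longer timeout than the 3 seconds used here would be appropriate. Note that a terminated child of the calling shell can still appear to exist until the shell collects its status, so the loop may run to the full timeout in that case.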
As described earlier, the ps or pgrep command can be used to get the PID of any process in the system. To send the SIGSTOP signal to the process proc_exp, first determine the PID of proc_exp using the pgrep command as follows:
# pgrep proc_exp
1234
Now you can pass this PID to the kill command with signal number 23 (SIGSTOP) as follows:
# kill -23 1234
This pauses the process proc_exp.
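Pausing and resuming can be sketched together as follows. This example uses the signal names STOP and CONT rather than numbers, because the number for SIGSTOP is not the same on every UNIX variant. It is a minimal sketch in which sleep stands in for proc_exp, and it assumes that the ps output field s reports a stopped process as T.

```shell
#!/bin/sh
sleep 30 &
pid=$!

# Pause the process, then read its state.
kill -s STOP "$pid"
sleep 1   # allow the stop to take effect
state_stopped=$(ps -o s= -p "$pid" | tr -d ' ')

# Resume the process, then read its state again.
kill -s CONT "$pid"
sleep 1
state_resumed=$(ps -o s= -p "$pid" | tr -d ' ')

echo "state while stopped: $state_stopped, after resume: $state_resumed"

# Clean up the placeholder process.
kill "$pid"
```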
Another useful Solaris command, pkill, can replace the combination of the pgrep and kill commands. The pkill command works the same way as the kill command, except that its last argument is a pattern matched against process names instead of a PID. The syntax of the pkill command is as follows:
pkill [-<signal>] <process name>
The <process name> is the name of the process (command name) to which the signal is to be sent. You can use a single pkill command to send the SIGSTOP signal to the process proc_exp as follows:
# pkill -23 proc_exp
For more information, see the kill(1) and pkill(1) man pages.
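Because pkill matches a pattern, it can signal more processes than intended. The following minimal sketch narrows the match with two options: -x for an exact name match and -P to select only children of a given parent. Here sleep stands in for proc_exp and is assumed to be an external command rather than a shell built-in; pgrep previews the match before pkill sends the signal.

```shell
#!/bin/sh
sleep 60 &
child=$!
sleep 1   # give the child a moment to start

# Preview the match: exact name "sleep", children of this shell only.
found=$(pgrep -x -P $$ sleep)
echo "matched PIDs: $found"

# Send SIGTERM to exactly the same set of processes.
pkill -TERM -x -P $$ sleep

# Collect the exit status; 128 + 15 indicates termination by SIGTERM.
status=0
wait "$child" || status=$?
echo "exit status: $status"
```

Running pgrep with the same options first is a cheap way to confirm which processes a pkill command would affect.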