Process Wrap Up
In this chapter, we looked at the famed operating system abstraction of the process. We discussed the generalities of the process, why it is important, and the relationship between processes and threads. We then discussed how Linux stores and represents processes (with task_struct and thread_info), how processes are created (via clone() and fork()), how new executable images are loaded into address spaces (via the exec() family of system calls), the hierarchy of processes, how parents glean information about their deceased children (via the wait() family of system calls), and how processes ultimately die (forcefully or intentionally via exit()).
The process is a fundamental and crucial abstraction, at the heart of every modern operating system, and ultimately the reason we have operating systems altogether (to run programs).
The next chapter discusses process scheduling, which is the delicate and interesting manner in which the kernel decides which processes to run, at what time, and in what order.
Footnotes
The kernel implements the wait4() system call. Linux systems, via the C library, typically provide the wait(), waitpid(), wait3(), and wait4() functions. All these functions return status about a terminated process, albeit with slightly different semantics.
Some texts on operating system design call this list the task array. Because the Linux implementation is a linked list and not a static array, it is called the task list.
Register-impaired architectures were not the only reason for creating struct thread_info.
An opaque type is a data type whose physical representation is unknown or irrelevant.
This is why you have those dreaded unkillable processes with state D in ps(1). Because the task will not respond to signals, you cannot send it a SIGKILL signal. Further, even if you could terminate the task, it would not be wise as the task is supposedly in the middle of an important operation and may hold a semaphore.
Other than process context there is interrupt context, which we discuss in Chapter 6, "Interrupts and Interrupt Handlers." In interrupt context, the system is not running on behalf of a process, but is executing an interrupt handler. There is no process tied to interrupt handlers and consequently no process context.
By exec() I mean any member of the exec() family of functions. The kernel implements the execve() system call on top of which execlp(), execle(), execv(), and execvp() are implemented.
Amusingly, this does not currently function correctly, although the goal is for the child to run first.
In fact, there are currently patches to add this functionality to Linux. In time, this feature will most likely find its way into the mainline Linux kernel.
As an example, benchmark process creation time in Linux versus process (or even thread!) creation time in these other operating systems. The results are quite nice.