11.5. Thread Termination
If any thread within a process calls exit, _Exit, or _exit, then the entire process terminates. Similarly, when the default action is to terminate the process, a signal sent to a thread will terminate the entire process (we’ll talk more about the interactions between signals and threads in Section 12.8).
A single thread can exit in three ways, thereby stopping its flow of control, without terminating the entire process.
- The thread can simply return from the start routine. The return value is the thread’s exit code.
- The thread can be canceled by another thread in the same process.
- The thread can call pthread_exit.
The rval_ptr argument to pthread_exit is a typeless pointer, similar to the single argument passed to the start routine. Other threads in the process can obtain this pointer by calling the pthread_join function.
The thread that calls pthread_join will block until the specified thread calls pthread_exit, returns from its start routine, or is canceled. If the thread simply returned from its start routine, the memory location specified by rval_ptr will contain the return code. If the thread was canceled, the memory location specified by rval_ptr is set to PTHREAD_CANCELED.
By calling pthread_join, we automatically place the thread with which we’re joining in the detached state (discussed shortly) so that its resources can be recovered. If the thread was already in the detached state, pthread_join can fail, returning EINVAL, although this behavior is implementation-specific.
If we’re not interested in a thread’s return value, we can set rval_ptr to NULL. In this case, calling pthread_join allows us to wait for the specified thread, but does not retrieve the thread’s termination status.
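As a minimal sketch of waiting without retrieving the termination status, the fragment below passes NULL as the second argument to pthread_join. The names worker, work_done, and join_without_status are hypothetical and exist only for this illustration; error handling is reduced to returning the error number.

```c
#include <pthread.h>

/* Hypothetical flag set by the worker; read only after the join. */
static int work_done = 0;

/* Hypothetical start routine: does its "work" and exits with a status
 * that the joining thread chooses not to collect. */
static void *
worker(void *arg)
{
    (void)arg;
    work_done = 1;
    pthread_exit((void *)7);
}

/* Returns 0 on success, a pthread error number on failure,
 * or -1 if the worker did not run. */
int
join_without_status(void)
{
    pthread_t tid;
    int err;

    work_done = 0;
    err = pthread_create(&tid, NULL, worker, NULL);
    if (err != 0)
        return(err);
    err = pthread_join(tid, NULL);  /* NULL: wait only, ignore exit status */
    if (err != 0)
        return(err);
    return(work_done ? 0 : -1);
}
```

A caller would simply invoke join_without_status() from main and check for a zero return. The join still synchronizes with the thread's termination, so reading work_done afterward is safe even though the status value 7 is never fetched.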
Example
Figure 11.3 shows how to fetch the exit code from a thread that has terminated.
Figure 11.3. Fetching the thread exit status
#include "apue.h"
#include <pthread.h>

void *
thr_fn1(void *arg)
{
    printf("thread 1 returning\n");
    return((void *)1);
}

void *
thr_fn2(void *arg)
{
    printf("thread 2 exiting\n");
    pthread_exit((void *)2);
}

int
main(void)
{
    int         err;
    pthread_t   tid1, tid2;
    void        *tret;

    err = pthread_create(&tid1, NULL, thr_fn1, NULL);
    if (err != 0)
        err_exit(err, "can't create thread 1");
    err = pthread_create(&tid2, NULL, thr_fn2, NULL);
    if (err != 0)
        err_exit(err, "can't create thread 2");
    err = pthread_join(tid1, &tret);
    if (err != 0)
        err_exit(err, "can't join with thread 1");
    printf("thread 1 exit code %ld\n", (long)tret);
    err = pthread_join(tid2, &tret);
    if (err != 0)
        err_exit(err, "can't join with thread 2");
    printf("thread 2 exit code %ld\n", (long)tret);
    exit(0);
}
Running the program in Figure 11.3 gives us
$ ./a.out
thread 1 returning
thread 2 exiting
thread 1 exit code 1
thread 2 exit code 2
As we can see, when a thread exits by calling pthread_exit or by simply returning from the start routine, the exit status can be obtained by another thread by calling pthread_join.
The typeless pointer passed to pthread_create and pthread_exit can be used to pass more than a single value. The pointer can be used to pass the address of a structure containing more complex information. Be careful that the memory used for the structure is still valid when the caller has completed. If the structure was allocated on the caller’s stack, for example, the memory contents might have changed by the time the structure is used. If a thread allocates a structure on its stack and passes a pointer to this structure to pthread_exit, then the stack might be destroyed and its memory reused for something else by the time the caller of pthread_join tries to use it.
Example
The program in Figure 11.4 shows the problem with using an automatic variable (allocated on the stack) as the argument to pthread_exit.
Figure 11.4. Incorrect use of pthread_exit argument
#include "apue.h"
#include <pthread.h>

struct foo {
    int a, b, c, d;
};

void
printfoo(const char *s, const struct foo *fp)
{
    printf("%s", s);
    printf(" structure at 0x%lx\n", (unsigned long)fp);
    printf(" foo.a = %d\n", fp->a);
    printf(" foo.b = %d\n", fp->b);
    printf(" foo.c = %d\n", fp->c);
    printf(" foo.d = %d\n", fp->d);
}

void *
thr_fn1(void *arg)
{
    struct foo  foo = {1, 2, 3, 4};

    printfoo("thread 1:\n", &foo);
    pthread_exit((void *)&foo);
}

void *
thr_fn2(void *arg)
{
    printf("thread 2: ID is %lu\n", (unsigned long)pthread_self());
    pthread_exit((void *)0);
}

int
main(void)
{
    int         err;
    pthread_t   tid1, tid2;
    struct foo  *fp;

    err = pthread_create(&tid1, NULL, thr_fn1, NULL);
    if (err != 0)
        err_exit(err, "can't create thread 1");
    err = pthread_join(tid1, (void *)&fp);
    if (err != 0)
        err_exit(err, "can't join with thread 1");
    sleep(1);
    printf("parent starting second thread\n");
    err = pthread_create(&tid2, NULL, thr_fn2, NULL);
    if (err != 0)
        err_exit(err, "can't create thread 2");
    sleep(1);
    printfoo("parent:\n", fp);
    exit(0);
}
When we run this program on Linux, we get
$ ./a.out
thread 1:
 structure at 0x7f2c83682ed0
 foo.a = 1
 foo.b = 2
 foo.c = 3
 foo.d = 4
parent starting second thread
thread 2: ID is 139829159933696
parent:
 structure at 0x7f2c83682ed0
 foo.a = -2090321472
 foo.b = 32556
 foo.c = 1
 foo.d = 0
Of course, the results vary, depending on the memory architecture, the compiler, and the implementation of the threads library. The results on Solaris are similar:
$ ./a.out
thread 1:
 structure at 0xffffffff7f0fbf30
 foo.a = 1
 foo.b = 2
 foo.c = 3
 foo.d = 4
parent starting second thread
thread 2: ID is 3
parent:
 structure at 0xffffffff7f0fbf30
 foo.a = -1
 foo.b = 2136969048
 foo.c = -1
 foo.d = 2138049024
As we can see, the contents of the structure (allocated on the stack of thread tid1) have changed by the time the main thread can access the structure. Note how the stack of the second thread (tid2) has overwritten the first thread’s stack. To solve this problem, we can either use a global structure or allocate the structure using malloc.
On Mac OS X, we get different results:
$ ./a.out
thread 1:
 structure at 0x1000b6f00
 foo.a = 1
 foo.b = 2
 foo.c = 3
 foo.d = 4
parent starting second thread
thread 2: ID is 4295716864
parent:
 structure at 0x1000b6f00
Segmentation fault (core dumped)
In this case, the memory is no longer valid when the parent tries to access the structure passed to it by the first thread that exited, and the parent is sent the SIGSEGV signal.
On FreeBSD, the memory hasn’t been overwritten by the time the parent accesses it, and we get
thread 1:
 structure at 0xbf9fef88
 foo.a = 1
 foo.b = 2
 foo.c = 3
 foo.d = 4
parent starting second thread
thread 2: ID is 673279680
parent:
 structure at 0xbf9fef88
 foo.a = 1
 foo.b = 2
 foo.c = 3
 foo.d = 4
Even though the memory is still intact after the thread exits, we can’t depend on this always being the case. It certainly isn’t what we observe on the other platforms.
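One way to avoid the problem in Figure 11.4 is to allocate the structure with malloc, so that it outlives the exiting thread's stack. The following is only a sketch under that assumption: thr_fn_heap and fetch_heap_foo are hypothetical names, and the joining thread takes responsibility for freeing the memory.

```c
#include <pthread.h>
#include <stdlib.h>

struct foo {
    int a, b, c, d;
};

/* Hypothetical start routine: returns a heap-allocated structure,
 * which remains valid after this thread's stack is gone. */
static void *
thr_fn_heap(void *arg)
{
    struct foo *fp;

    (void)arg;
    fp = malloc(sizeof(*fp));
    if (fp == NULL)
        pthread_exit(NULL);     /* report allocation failure as NULL */
    fp->a = 1;
    fp->b = 2;
    fp->c = 3;
    fp->d = 4;
    pthread_exit(fp);
}

/* Copies the thread's result into *out; returns 0 on success, -1 on error. */
int
fetch_heap_foo(struct foo *out)
{
    pthread_t tid;
    void *tret;

    if (pthread_create(&tid, NULL, thr_fn_heap, NULL) != 0)
        return(-1);
    if (pthread_join(tid, &tret) != 0)
        return(-1);
    if (tret == NULL)
        return(-1);
    *out = *(struct foo *)tret; /* still valid: the memory is on the heap */
    free(tret);                 /* the joiner owns and releases the memory */
    return(0);
}
```

Receiving the status through a void * variable and converting it afterward also avoids casting the address of a struct foo pointer in the call to pthread_join.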
One thread can request that another in the same process be canceled by calling the pthread_cancel function.
In the default circumstances, pthread_cancel will cause the thread specified by tid to behave as if it had called pthread_exit with an argument of PTHREAD_CANCELED. However, a thread can elect to ignore or otherwise control how it is canceled. We will discuss this in detail in Section 12.7. Note that pthread_cancel doesn’t wait for the thread to terminate; it merely makes the request.
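The request-then-wait pattern might look like the sketch below, which assumes the default cancellation settings. The names spin and cancel_and_join are hypothetical. The target thread loops on sleep, which is a cancellation point, so the pending request is acted on there; the caller must still join to learn that the thread has terminated.

```c
#include <pthread.h>
#include <unistd.h>

/* Hypothetical start routine: never returns on its own.
 * sleep() is a cancellation point, so a pending request is honored here. */
static void *
spin(void *arg)
{
    (void)arg;
    for (;;)
        sleep(1);
    return(NULL);   /* not reached */
}

/* Returns 1 if the thread's status is PTHREAD_CANCELED,
 * 0 if it is something else, -1 on error. */
int
cancel_and_join(void)
{
    pthread_t tid;
    void *tret;

    if (pthread_create(&tid, NULL, spin, NULL) != 0)
        return(-1);
    if (pthread_cancel(tid) != 0)       /* only makes the request */
        return(-1);
    if (pthread_join(tid, &tret) != 0)  /* waits for actual termination */
        return(-1);
    return(tret == PTHREAD_CANCELED ? 1 : 0);
}
```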
A thread can arrange for functions to be called when it exits, similar to the way that the atexit function (Section 7.3) can be used by a process to arrange that functions are to be called when the process exits. The functions are known as thread cleanup handlers. More than one cleanup handler can be established for a thread. The handlers are recorded in a stack, which means that they are executed in the reverse order from that with which they were registered.
The pthread_cleanup_push function schedules the cleanup function, rtn, to be called with the single argument, arg, when the thread performs one of the following actions:
- Makes a call to pthread_exit
- Responds to a cancellation request
- Makes a call to pthread_cleanup_pop with a nonzero execute argument
If the execute argument is set to zero, the cleanup function is not called. In either case, pthread_cleanup_pop removes the cleanup handler established by the last call to pthread_cleanup_push.
A restriction with these functions is that, because they can be implemented as macros, they must be used in matched pairs within the same scope in a thread. The macro definition of pthread_cleanup_push can include a { character, in which case the matching } character is in the pthread_cleanup_pop definition.
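To illustrate the effect of the execute argument and the stack ordering, here is a small sketch in which each handler records an identifier. The names note, pop_demo, run_cleanup_demo, log_ids, and log_len are hypothetical. Every push is matched by a pop in the same scope, and the thread returns only after all pairs are closed.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical log of which handlers ran, in order. */
static int log_ids[3];
static int log_len;

/* Cleanup handler: records the small integer carried in arg. */
static void
note(void *arg)
{
    log_ids[log_len++] = (int)(intptr_t)arg;
}

static void *
pop_demo(void *arg)
{
    (void)arg;
    pthread_cleanup_push(note, (void *)(intptr_t)1);
    pthread_cleanup_push(note, (void *)(intptr_t)2);
    pthread_cleanup_push(note, (void *)(intptr_t)3);
    pthread_cleanup_pop(1);     /* removes and runs handler 3 */
    pthread_cleanup_pop(0);     /* removes handler 2 without running it */
    pthread_cleanup_pop(1);     /* removes and runs handler 1 */
    return(NULL);               /* safe: no push/pop pair is still open */
}

/* Copies the recorded identifiers into out and returns how many
 * handlers ran, or -1 on error. */
int
run_cleanup_demo(int out[3])
{
    pthread_t tid;
    int i;

    log_len = 0;
    if (pthread_create(&tid, NULL, pop_demo, NULL) != 0)
        return(-1);
    if (pthread_join(tid, NULL) != 0)
        return(-1);
    for (i = 0; i < log_len; i++)
        out[i] = log_ids[i];
    return(log_len);
}
```

With this arrangement, two handlers run, and the recorded order is 3 followed by 1: the most recently pushed handler is popped first, and the handler popped with a zero argument is skipped.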
Example
Figure 11.5 shows how to use thread cleanup handlers. Although the example is somewhat contrived, it illustrates the mechanics involved. Note that although we never intend to pass zero as an argument to the thread start-up routines, we still need to match calls to pthread_cleanup_pop with the calls to pthread_cleanup_push; otherwise, the program might not compile.
Figure 11.5. Thread cleanup handler
#include "apue.h"
#include <pthread.h>

void
cleanup(void *arg)
{
    printf("cleanup: %s\n", (char *)arg);
}

void *
thr_fn1(void *arg)
{
    printf("thread 1 start\n");
    pthread_cleanup_push(cleanup, "thread 1 first handler");
    pthread_cleanup_push(cleanup, "thread 1 second handler");
    printf("thread 1 push complete\n");
    if (arg)
        return((void *)1);
    pthread_cleanup_pop(0);
    pthread_cleanup_pop(0);
    return((void *)1);
}

void *
thr_fn2(void *arg)
{
    printf("thread 2 start\n");
    pthread_cleanup_push(cleanup, "thread 2 first handler");
    pthread_cleanup_push(cleanup, "thread 2 second handler");
    printf("thread 2 push complete\n");
    if (arg)
        pthread_exit((void *)2);
    pthread_cleanup_pop(0);
    pthread_cleanup_pop(0);
    pthread_exit((void *)2);
}

int
main(void)
{
    int         err;
    pthread_t   tid1, tid2;
    void        *tret;

    err = pthread_create(&tid1, NULL, thr_fn1, (void *)1);
    if (err != 0)
        err_exit(err, "can't create thread 1");
    err = pthread_create(&tid2, NULL, thr_fn2, (void *)1);
    if (err != 0)
        err_exit(err, "can't create thread 2");
    err = pthread_join(tid1, &tret);
    if (err != 0)
        err_exit(err, "can't join with thread 1");
    printf("thread 1 exit code %ld\n", (long)tret);
    err = pthread_join(tid2, &tret);
    if (err != 0)
        err_exit(err, "can't join with thread 2");
    printf("thread 2 exit code %ld\n", (long)tret);
    exit(0);
}
Running the program in Figure 11.5 on Linux or Solaris gives us
$ ./a.out
thread 1 start
thread 1 push complete
thread 2 start
thread 2 push complete
cleanup: thread 2 second handler
cleanup: thread 2 first handler
thread 1 exit code 1
thread 2 exit code 2
From the output, we can see that both threads start properly and exit, but that only the second thread’s cleanup handlers are called. Thus, if the thread terminates by returning from its start routine, its cleanup handlers are not called, although this behavior varies among implementations. Also note that the cleanup handlers are called in the reverse order from which they were installed.
If we run the same program on FreeBSD or Mac OS X, we see that the program incurs a segmentation violation and drops core. This happens because on these systems, pthread_cleanup_push is implemented as a macro that stores some context on the stack. When thread 1 returns in between the call to pthread_cleanup_push and the call to pthread_cleanup_pop, the stack is overwritten and these platforms try to use this (now corrupted) context when they invoke the cleanup handlers. In the Single UNIX Specification, returning while in between a matched pair of calls to pthread_cleanup_push and pthread_cleanup_pop results in undefined behavior. The only portable way to return in between these two functions is to call pthread_exit.
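Cleanup handlers also run when a thread responds to a cancellation request, a case that Figure 11.5 does not exercise. The sketch below assumes default cancellation settings; release, waiter, and cancel_runs_handler are hypothetical names. The thread never leaves the push/pop pair by returning: it is terminated at a cancellation point inside the pair, which is well defined.

```c
#include <pthread.h>
#include <unistd.h>

/* Hypothetical handler: sets the int that arg points to. */
static void
release(void *arg)
{
    *(int *)arg = 1;
}

/* Hypothetical start routine: waits inside a push/pop pair until canceled. */
static void *
waiter(void *arg)
{
    pthread_cleanup_push(release, arg);
    for (;;)
        sleep(1);               /* cancellation point */
    pthread_cleanup_pop(0);     /* not reached; needed to close the pair */
    return(NULL);
}

/* Returns 0 if the thread was canceled and its handler ran, -1 otherwise. */
int
cancel_runs_handler(void)
{
    pthread_t tid;
    void *tret;
    int flag = 0;   /* stays valid: we join before this function returns */

    if (pthread_create(&tid, NULL, waiter, &flag) != 0)
        return(-1);
    if (pthread_cancel(tid) != 0)
        return(-1);
    if (pthread_join(tid, &tret) != 0)
        return(-1);
    return((tret == PTHREAD_CANCELED && flag == 1) ? 0 : -1);
}
```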
By now, you should begin to see similarities between the thread functions and the process functions. Figure 11.6 summarizes the similar functions.
Process primitive | Thread primitive      | Description
------------------|-----------------------|-----------------------------------------------------------
fork              | pthread_create        | create a new flow of control
exit              | pthread_exit          | exit from an existing flow of control
waitpid           | pthread_join          | get exit status from flow of control
atexit            | pthread_cleanup_push  | register function to be called at exit from flow of control
getpid            | pthread_self          | get ID for flow of control
abort             | pthread_cancel        | request abnormal termination of flow of control
Figure 11.6. Comparison of process and thread primitives
By default, a thread’s termination status is retained until we call pthread_join for that thread. A thread’s underlying storage can be reclaimed immediately on termination if the thread has been detached. After a thread is detached, we can’t use the pthread_join function to wait for its termination status, because calling pthread_join for a detached thread results in undefined behavior. We can detach a thread by calling pthread_detach.
As we will see in the next chapter, we can create a thread that is already in the detached state by modifying the thread attributes we pass to pthread_create.
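A minimal sketch of detaching a thread after creation follows. Because a detached thread cannot be joined, the example needs some other way to learn that the thread has run; it assumes a mutex and condition variable for that purpose, which is one choice among several. The names background, run_detached, lock, done_cv, and done are hypothetical.

```c
#include <pthread.h>

/* Hypothetical completion flag, protected by lock and signaled via done_cv. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done_cv = PTHREAD_COND_INITIALIZER;
static int done;

/* Hypothetical start routine: announces completion, then returns.
 * No other thread needs to join with it to reclaim its resources. */
static void *
background(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    done = 1;
    pthread_cond_signal(&done_cv);
    pthread_mutex_unlock(&lock);
    return(NULL);
}

/* Returns 0 on success or a pthread error number on failure. */
int
run_detached(void)
{
    pthread_t tid;
    int err;

    done = 0;
    err = pthread_create(&tid, NULL, background, NULL);
    if (err != 0)
        return(err);
    err = pthread_detach(tid);  /* valid even if the thread already finished */
    if (err != 0)
        return(err);
    /* We must not call pthread_join on tid; wait on the flag instead. */
    pthread_mutex_lock(&lock);
    while (!done)
        pthread_cond_wait(&done_cv, &lock);
    pthread_mutex_unlock(&lock);
    return(0);
}
```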