11.4. Thread Creation
The traditional UNIX process model supports only one thread of control per process. Conceptually, this is the same as a threads-based model whereby each process is made up of only one thread. With pthreads, when a program runs, it also starts out as a single process with a single thread of control. As the program runs, its behavior should be indistinguishable from that of a traditional process until it creates more threads of control. Additional threads can be created by calling the pthread_create function.
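For reference, the sketch below shows the POSIX prototype of pthread_create in a comment, written with the parameter names used in this section (tidp, attr, start_rtn, arg), followed by a minimal call. The names noop_start and create_default_thread are hypothetical, chosen only for illustration, and the helper uses pthread_join (introduced later in this chapter) so that it does not return before the new thread has finished.

```c
#include <pthread.h>
#include <stddef.h>

/*
 * POSIX prototype, using this section's parameter names:
 *
 *   int pthread_create(pthread_t *restrict tidp,
 *                      const pthread_attr_t *restrict attr,
 *                      void *(*start_rtn)(void *),
 *                      void *restrict arg);
 *
 * Returns 0 if OK, an error number on failure.
 */

/* Hypothetical start routine: ignores its argument and returns at once. */
static void *
noop_start(void *arg)
{
    (void)arg;
    return NULL;
}

/*
 * Hypothetical helper: create one thread with default attributes
 * (attr == NULL) and wait for it to finish.
 * Returns 0 on success or the pthread error number on failure.
 */
int
create_default_thread(void)
{
    pthread_t tid;
    int err;

    err = pthread_create(&tid, NULL, noop_start, NULL);
    if (err != 0)
        return err;
    return pthread_join(tid, NULL);
}
```

Calling create_default_thread() from main and checking for a zero return is enough to confirm that the thread was created and ran to completion.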
The memory location pointed to by tidp is set to the thread ID of the newly created thread when pthread_create returns successfully. The attr argument is used to customize various thread attributes. We’ll cover thread attributes in Section 12.3, but for now, we’ll set this to NULL to create a thread with the default attributes.
The newly created thread starts running at the address of the start_rtn function. This function takes a single argument, arg, which is a typeless pointer. If you need to pass more than one argument to the start_rtn function, then you need to store them in a structure and pass the address of the structure in arg.
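As a minimal sketch of passing several values through arg, the code below packs two inputs and one output field into a structure and hands its address to the new thread. The structure add_args and the functions add_fn and add_in_thread are hypothetical names invented for this illustration. The structure lives on the caller's stack, which is safe here only because the caller waits for the thread with pthread_join (covered later in this chapter) before returning.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical argument bundle: two inputs and one result slot. */
struct add_args {
    int a;
    int b;
    int sum;
};

/* Start routine: recover the structure from the typeless pointer. */
static void *
add_fn(void *arg)
{
    struct add_args *p = arg;

    p->sum = p->a + p->b;
    return NULL;
}

/*
 * Hypothetical helper: add a and b in a separate thread.
 * Returns 0 on success (storing the sum in *result) or the
 * pthread error number on failure.
 */
int
add_in_thread(int a, int b, int *result)
{
    struct add_args args = { a, b, 0 };
    pthread_t tid;
    int err;

    err = pthread_create(&tid, NULL, add_fn, &args);
    if (err != 0)
        return err;
    /* args must remain valid until the thread has finished with it. */
    err = pthread_join(tid, NULL);
    if (err != 0)
        return err;
    *result = args.sum;
    return 0;
}
```

If the creating function could return before the thread finished, the structure would need to be allocated with malloc or given static storage instead of living on the stack.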
When a thread is created, there is no guarantee which will run first: the newly created thread or the calling thread. The newly created thread has access to the process address space and inherits the calling thread’s floating-point environment and signal mask; however, the set of pending signals for the thread is cleared.
Note that the pthread functions usually return an error code when they fail. They don’t set errno like the other POSIX functions. The per-thread copy of errno is provided only for compatibility with existing functions that use it. With threads, it is cleaner to return the error code from the function, thereby restricting the scope of the error to the function that caused it, instead of relying on some global state that is changed as a side effect of the function.
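The convention described above can be sketched as follows: the value returned by pthread_create is itself the error number, so it is passed directly to strerror and errno is never consulted. The names idle_fn and start_thread_checked are hypothetical, used only for this illustration.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical start routine: returns its argument unchanged. */
static void *
idle_fn(void *arg)
{
    return arg;
}

/*
 * Hypothetical helper: create a thread and, on failure, describe the
 * error in msg. Returns 0 on success or the pthread error number.
 */
int
start_thread_checked(pthread_t *tidp, char *msg, size_t msglen)
{
    int err;

    err = pthread_create(tidp, NULL, idle_fn, NULL);
    if (err != 0) {
        /*
         * err is the error number; errno is not examined.
         * strerror is called here from a single thread only.
         */
        snprintf(msg, msglen, "can't create thread: %s", strerror(err));
    }
    return err;
}
```

Because the error number travels in the return value, the caller can test it immediately without worrying that an intervening call has overwritten some shared state.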
Example
Although there is no portable way to print the thread ID, we can write a small test program that does, to gain some insight into how threads work. The program in Figure 11.2 creates one thread and prints the process and thread IDs of the new thread and the initial thread.
Figure 11.2. Printing thread IDs
    #include "apue.h"
    #include <pthread.h>

    pthread_t ntid;

    void
    printids(const char *s)
    {
        pid_t       pid;
        pthread_t   tid;

        pid = getpid();
        tid = pthread_self();
        printf("%s pid %lu tid %lu (0x%lx)\n", s, (unsigned long)pid,
          (unsigned long)tid, (unsigned long)tid);
    }

    void *
    thr_fn(void *arg)
    {
        printids("new thread: ");
        return((void *)0);
    }

    int
    main(void)
    {
        int     err;

        err = pthread_create(&ntid, NULL, thr_fn, NULL);
        if (err != 0)
            err_exit(err, "can't create thread");
        printids("main thread:");
        sleep(1);
        exit(0);
    }
This example has two oddities, which are necessary to handle races between the main thread and the new thread. (We’ll learn better ways to deal with these conditions later in this chapter.) The first is the need to sleep in the main thread. If it doesn’t sleep, the main thread might exit, thereby terminating the entire process before the new thread gets a chance to run. This behavior is dependent on the operating system’s threads implementation and scheduling algorithms.
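One way to avoid the sleep, sketched here under the assumption that pthread_join (introduced later in this chapter) is available, is to have the creating thread wait explicitly for the new thread to finish. The names set_flag and run_and_wait are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical start routine: record that the thread actually ran. */
static void *
set_flag(void *arg)
{
    int *flag = arg;

    *flag = 1;
    return NULL;
}

/*
 * Hypothetical helper: start a thread and block until it terminates,
 * instead of sleeping for an arbitrary interval.
 * Returns 0 on success or the pthread error number on failure.
 */
int
run_and_wait(int *flag)
{
    pthread_t tid;
    int err;

    err = pthread_create(&tid, NULL, set_flag, flag);
    if (err != 0)
        return err;
    /* Returns only after the new thread has terminated. */
    return pthread_join(tid, NULL);
}
```

Unlike sleep(1), this does not depend on how long the scheduler takes to run the new thread: when run_and_wait returns 0, the flag has been set.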
The second oddity is that the new thread obtains its thread ID by calling pthread_self instead of reading it out of shared memory or receiving it as an argument to its thread-start routine. Recall that pthread_create will return the thread ID of the newly created thread through the first parameter (tidp). In our example, the main thread stores this ID in ntid, but the new thread can’t safely use it. If the new thread runs before the main thread returns from calling pthread_create, then the new thread will see the uninitialized contents of ntid instead of the thread ID.
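To illustrate that pthread_self is a safe source of the new thread's ID, the following sketch has the new thread record the value it obtains from pthread_self, and the creating thread compare it with the ID returned through tidp once the new thread has finished. The names record_self and self_matches_created_id are hypothetical, and pthread_join (covered later in this chapter) is assumed so that the comparison happens only after the new thread has stored its value.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical start routine: store the caller's own thread ID. */
static void *
record_self(void *arg)
{
    pthread_t *out = arg;

    *out = pthread_self();
    return NULL;
}

/*
 * Hypothetical helper: returns 1 if the ID seen by the new thread
 * matches the ID returned by pthread_create, 0 if it does not,
 * and -1 if the thread could not be created or joined.
 */
int
self_matches_created_id(void)
{
    pthread_t created;  /* filled in by pthread_create */
    pthread_t seen;     /* filled in by the new thread */

    if (pthread_create(&created, NULL, record_self, &seen) != 0)
        return -1;
    if (pthread_join(created, NULL) != 0)
        return -1;
    /* Thread IDs must be compared with pthread_equal, not ==. */
    return pthread_equal(created, seen) != 0;
}
```

The new thread never reads the variable that pthread_create writes, so it cannot observe an uninitialized value no matter which thread runs first.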
Running the program in Figure 11.2 on Solaris gives us
    $ ./a.out
    main thread: pid 20075 tid 1 (0x1)
    new thread:  pid 20075 tid 2 (0x2)
As we expect, both threads have the same process ID, but different thread IDs. Running the program in Figure 11.2 on FreeBSD gives us
    $ ./a.out
    main thread: pid 37396 tid 673190208 (0x28201140)
    new thread:  pid 37396 tid 673280320 (0x28217140)
As we expect, both threads have the same process ID. If we look at the thread IDs as decimal integers, the values look strange, but if we look at them in hexadecimal format, they make more sense. As we noted earlier, FreeBSD uses a pointer to the thread data structure for its thread ID.
We would expect Mac OS X to be similar to FreeBSD; however, the thread ID for the main thread is from a different address range than the thread IDs for threads created with pthread_create:
    $ ./a.out
    main thread: pid 31807 tid 140735073889440 (0x7fff70162ca0)
    new thread:  pid 31807 tid 4295716864 (0x1000b7000)
Running the same program on Linux gives us
    $ ./a.out
    main thread: pid 17874 tid 140693894424320 (0x7ff5d9996700)
    new thread:  pid 17874 tid 140693886129920 (0x7ff5d91ad700)
The Linux thread IDs look like pointers, even though they are represented as unsigned long integers.
- The threads implementation changed between Linux 2.4 and Linux 2.6. In Linux 2.4, LinuxThreads implemented each thread with a separate process. This made it difficult to match the behavior of POSIX threads. In Linux 2.6, the Linux kernel and threads library were overhauled to use a new threads implementation called the Native POSIX Thread Library (NPTL). This supported a model of multiple threads within a single process and made it easier to support POSIX threads semantics.