4.5. The threading Module
We will now introduce the higher-level threading module, which gives you not only a Thread class but also a wide variety of synchronization mechanisms to use to your heart’s content. Table 4-2 presents a list of all the objects available in the threading module.
Table 4-2. threading Module Objects
| Object | Description |
| --- | --- |
| Thread | Object that represents a single thread of execution |
| Lock | Primitive lock object (same lock as in the thread module) |
| RLock | Re-entrant lock object that lets a single thread (re)acquire a lock it already holds (recursive locking) |
| Condition | Condition variable object that makes one thread wait until a certain “condition” has been satisfied by another thread, such as a change of state or of some data value |
| Event | General version of condition variables, whereby any number of threads wait for some event to occur and all awaken when the event happens |
| Semaphore | Provides a “counter” of finite resources shared between threads; blocks when none are available |
| BoundedSemaphore | Similar to a Semaphore but ensures that it never exceeds its initial value |
| Timer | Similar to Thread, except that it waits for an allotted period of time before running |
| Barrier | Creates a “barrier,” at which a specified number of threads must all arrive before they’re all allowed to continue (new in Python 3.2) |
In this section, we will examine how to use the Thread class to implement threading. Because we have already covered the basics of locking, we will not cover the locking primitives here. The Thread class also provides a form of synchronization of its own (its join() method, which waits for a thread to finish), so the examples in this section need no explicit locking primitives.
4.5.1. The Thread Class
The Thread class of the threading module is your primary executive object. It offers a variety of features that the thread module lacks. Table 4-3 presents a list of its attributes and methods.
Table 4-3. Thread Object Attributes and Methods
| Attribute | Description |
| --- | --- |
| Thread object data attributes | |
| name | The name of a thread. |
| ident | The identifier of a thread. |
| daemon | Boolean flag indicating whether a thread is daemonic. |
| Thread object methods | |
| __init__(group=None, target=None, name=None, args=(), kwargs={}, verbose=None, daemon=None) | Instantiate a Thread object, taking a target callable and any args or kwargs. A name or group can also be passed, but the latter is unimplemented. A verbose flag is also accepted. Any daemon value sets the thread.daemon attribute/flag (the daemon parameter was added in Python 3.3). |
| start() | Begin thread execution. |
| run() | Method defining thread functionality (usually overridden by the application writer in a subclass). |
| join(timeout=None) | Suspend the caller until the started thread terminates; blocks indefinitely unless a timeout (in seconds) is given. |
| getName() | Return the name of the thread (deprecated in favor of the name attribute). |
| setName(name) | Set the name of the thread (deprecated in favor of the name attribute). |
| isAlive()/is_alive() | Return True if the thread is still running (the camelCase name is the legacy spelling, superseded by is_alive() in Python 2.6). |
| isDaemon() | Return True if the thread is daemonic, False otherwise (deprecated in favor of the daemon attribute). |
| setDaemon(daemonic) | Set the daemon flag to the given Boolean daemonic value; must be called before the thread's start() (deprecated in favor of the daemon attribute). |
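To make a few of the entries in Table 4-3 concrete, here is a minimal sketch, assuming Python 3.3 or later (for the daemon keyword argument). The worker() function, the 'sleeper' name, and the delay values are made up for illustration. It shows that ident is None until the thread starts, and that join() with a timeout simply returns, so you check is_alive() to find out whether the thread really finished.

```python
import threading
import time

def worker(delay):
    """Hypothetical task: just sleep for `delay` seconds."""
    time.sleep(delay)

# name and daemon can be given at instantiation instead of via setName()/setDaemon()
t = threading.Thread(target=worker, args=(0.5,), name='sleeper', daemon=True)

ident_before = t.ident        # None: the thread has not been started yet
alive_before = t.is_alive()   # False: start() has not been called

t.start()
ident_after = t.ident         # an integer identifier once the thread is running

t.join(timeout=0.01)          # gives up after about 0.01 seconds
timed_out = t.is_alive()      # join() returns None either way, so ask is_alive()

t.join()                      # no timeout: block until worker() returns
alive_after = t.is_alive()    # False: the thread has terminated

print(t.name, t.daemon, ident_before, alive_before, timed_out, alive_after)
```

The name and daemon attributes are plain attribute reads here, which is why the getter and setter methods in the table are no longer needed.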
There are a variety of ways by which you can create threads using the Thread class. We cover three of them here, all quite similar. Pick the one you feel most comfortable with, not to mention the most appropriate for your application and future scalability (we like the final choice the best):
- Create Thread instance, passing in function
- Create Thread instance, passing in callable class instance
- Subclass Thread and create subclass instance
You’ll discover that you will pick either the first or third option. The latter is chosen when a more object-oriented interface is desired and the former, otherwise. The second, honestly, is a bit more awkward and slightly harder to read, as you’ll discover.
Create Thread Instance, Passing in Function
In our first example, we will just instantiate Thread, passing in our function (and its arguments) in a manner similar to our previous examples. This function is what will be executed when we direct the thread to begin execution. Taking our mtsleepB.py script from Example 4-3 and tweaking it by adding the use of Thread objects, we have mtsleepC.py, as shown in Example 4-4.
Example 4-4. Using the threading Module (mtsleepC.py)
The Thread class from the threading module has a join() method that lets the main thread wait for thread completion.
    #!/usr/bin/env python

    import threading
    from time import sleep, ctime

    loops = [4, 2]  # number of seconds each loop sleeps

    def loop(nloop, nsec):
        print('start loop', nloop, 'at:', ctime())
        sleep(nsec)
        print('loop', nloop, 'done at:', ctime())

    def main():
        print('starting at:', ctime())
        threads = []
        nloops = range(len(loops))

        for i in nloops:  # create all threads (none is running yet)
            t = threading.Thread(target=loop,
                                 args=(i, loops[i]))
            threads.append(t)

        for i in nloops:  # start threads
            threads[i].start()

        for i in nloops:  # wait for all
            threads[i].join()  # threads to finish

        print('all DONE at:', ctime())

    if __name__ == '__main__':
        main()
When we run the script in Example 4-4, we see output similar to that of its predecessors:
    $ mtsleepC.py
    starting at: Sun Aug 13 18:16:38 2006
    start loop 0 at: Sun Aug 13 18:16:38 2006
    start loop 1 at: Sun Aug 13 18:16:38 2006
    loop 1 done at: Sun Aug 13 18:16:40 2006
    loop 0 done at: Sun Aug 13 18:16:42 2006
    all DONE at: Sun Aug 13 18:16:42 2006
So what did change? Gone are the locks that we had to implement when using the thread module. Instead, we create a set of Thread objects. When each Thread is instantiated, we dutifully pass in the function (target) and arguments (args) and receive a Thread instance in return. The biggest difference between instantiating Thread (calling Thread()) and invoking thread.start_new_thread() is that the new thread does not begin execution right away. This is a useful synchronization feature, especially when you don’t want the threads to start immediately.
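The gap between creating a Thread and starting it can be seen in a small sketch like the following, assuming Python 3. The events list and the task() function are illustrative names, not part of mtsleepC.py.

```python
import threading

events = []  # records the order in which things happen

def task():
    events.append('task ran')

t = threading.Thread(target=task)   # allocate only; nothing runs yet
events.append('thread created')
started_early = t.is_alive()        # False: start() has not been called

t.start()                           # the target begins executing now
t.join()                            # wait for it to finish
events.append('thread joined')

print(events)   # ['thread created', 'task ran', 'thread joined']
```

Because the target cannot run before start(), 'thread created' is always recorded first, no matter how quickly the new thread would otherwise have been scheduled.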
Once all the threads have been allocated, we let them go off to the races by invoking each thread’s start() method, but not a moment before that. And rather than having to manage a set of locks (allocating, acquiring, releasing, checking lock state, etc.), we simply call the join() method for each thread. join() will wait until a thread terminates, or, if provided, a timeout occurs. Use of join() appears much cleaner than an infinite loop that waits for locks to be released (which is why these locks are sometimes known as spin locks).
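As a rough illustration of the start-then-join pattern, the sketch below (assuming Python 3) times a scaled-down version of the two loops. The delays and finished names, and the shortened sleep times, are choices made for this example. Since the threads sleep concurrently, the elapsed time should come out close to the longest delay rather than the sum of the two, although the exact figure depends on the machine.

```python
import threading
from time import sleep, perf_counter

delays = [0.4, 0.2]   # scaled-down stand-ins for the 4- and 2-second loops
finished = []         # loop numbers, in the order the loops complete

def loop(nloop, nsec):
    sleep(nsec)
    finished.append(nloop)

threads = [threading.Thread(target=loop, args=(i, d))
           for i, d in enumerate(delays)]

begin = perf_counter()
for t in threads:     # start every thread before waiting on any of them
    t.start()
for t in threads:     # join() blocks until that particular thread is done
    t.join()
elapsed = perf_counter() - begin

print('finish order:', finished)
print('elapsed: %.2f seconds' % elapsed)
```

Note that the joins happen in a second loop, after every thread has been started. Calling join() immediately after each start() would serialize the work.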
One other important aspect of join() is that it does not need to be called at all. Once threads are started, they will execute until their given function completes, at which point, they will exit. If your main thread has things to do other than wait for threads to complete (such as other processing or waiting for new client requests), it should do so. join() is useful only when you want to wait for thread completion.
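The following sketch, assuming Python 3, shows a main thread that never calls join() and instead keeps doing its own work while a background thread runs. The background_job() function and the ticks counter are placeholders for real processing.

```python
import threading
from time import sleep

results = []

def background_job():
    sleep(0.2)                 # stand-in for a slow operation
    results.append('job done')

t = threading.Thread(target=background_job)
t.start()

# No join(): the main thread carries on with its own (stand-in) work
# and only glances at the worker's status between iterations.
ticks = 0
while t.is_alive():
    ticks += 1                 # placeholder for real work, such as serving requests
    sleep(0.01)

print('main loop iterations:', ticks)
print(results)
```

Keep in mind that when the main thread finishes, the interpreter still waits for any non-daemon threads before the process exits, so skipping join() does not cut their work short.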
Create Thread Instance, Passing in Callable Class Instance
A similar offshoot to passing in a function when creating a thread is having a callable class and passing in an instance for execution—this is the more object-oriented approach to MT programming. Such a callable class embodies an execution environment that is much more flexible than a function or choosing from a set of functions. You now have the power of a class object behind you, as opposed to a single function or a list/tuple of functions.
Adding our new class ThreadFunc to the code and making other slight modifications to mtsleepC.py, we get mtsleepD.py, shown in Example 4-5.
Example 4-5. Using Callable Classes (mtsleepD.py)
In this example, we pass in a callable class (instance) as opposed to just a function. It presents more of an object-oriented approach than mtsleepC.py.
    #!/usr/bin/env python

    import threading
    from time import sleep, ctime

    loops = [4, 2]  # number of seconds each loop sleeps

    class ThreadFunc(object):

        def __init__(self, func, args, name=''):
            self.name = name
            self.func = func
            self.args = args

        def __call__(self):
            # invoked by the Thread with no arguments; we supply our own
            self.func(*self.args)

    def loop(nloop, nsec):
        print('start loop', nloop, 'at:', ctime())
        sleep(nsec)
        print('loop', nloop, 'done at:', ctime())

    def main():
        print('starting at:', ctime())
        threads = []
        nloops = range(len(loops))

        for i in nloops:  # create all threads
            t = threading.Thread(
                target=ThreadFunc(loop, (i, loops[i]),
                                  loop.__name__))
            threads.append(t)

        for i in nloops:  # start all threads
            threads[i].start()

        for i in nloops:  # wait for completion
            threads[i].join()

        print('all DONE at:', ctime())

    if __name__ == '__main__':
        main()
When we run mtsleepD.py, we get the expected output:
    $ mtsleepD.py
    starting at: Sun Aug 13 18:49:17 2006
    start loop 0 at: Sun Aug 13 18:49:17 2006
    start loop 1 at: Sun Aug 13 18:49:17 2006
    loop 1 done at: Sun Aug 13 18:49:19 2006
    loop 0 done at: Sun Aug 13 18:49:21 2006
    all DONE at: Sun Aug 13 18:49:21 2006
So what are the changes this time? The addition of the ThreadFunc class and a minor change to instantiate the Thread object, which also instantiates ThreadFunc, our callable class. In effect, we have a double instantiation going on here. Let’s take a closer look at our ThreadFunc class.
We want to make this class general enough to use with functions other than our loop() function, so we added some new infrastructure, such as having this class hold the arguments for the function, the function itself, and also a function name string. The constructor __init__() just sets all the values.
When a new thread starts, the Thread code calls our ThreadFunc object, which invokes its __call__() special method. Because the ThreadFunc instance already holds the arguments, we do not need to pass them to the Thread() constructor; __call__() simply calls the function with them.
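One way to see the extra flexibility of a callable instance is to let it remember something about the call. The sketch below, assuming Python 3, is a variation on ThreadFunc rather than the version in mtsleepD.py: the result attribute and the add() function are additions made for this example.

```python
import threading

class ThreadFunc(object):
    """Callable wrapper that also keeps the wrapped function's return value.

    The `result` attribute is an addition for this sketch; mtsleepD.py's
    ThreadFunc does not have it.
    """

    def __init__(self, func, args, name=''):
        self.name = name
        self.func = func
        self.args = args
        self.result = None

    def __call__(self):
        # The Thread calls its target with no arguments; we supply our own.
        self.result = self.func(*self.args)

def add(x, y):   # hypothetical work function
    return x + y

callables = [ThreadFunc(add, (i, 10), add.__name__) for i in range(3)]
threads = [threading.Thread(target=c) for c in callables]

for t in threads:
    t.start()
for t in threads:
    t.join()

results = [c.result for c in callables]
print(results)   # [10, 11, 12]
```

Each instance carries its own arguments and its own result, which a bare function passed as the target cannot do without outside help.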
Subclass Thread and Create Subclass Instance
The final introductory example involves subclassing Thread(), which turns out to be extremely similar to creating a callable class as in the previous example. Subclassing is a bit easier to read when you are creating your threads (lines 29–30). We will present the code for mtsleepE.py in Example 4-6 as well as the output obtained from its execution, and leave it as an exercise for you to compare mtsleepE.py to mtsleepD.py.
Example 4-6. Subclassing Thread (mtsleepE.py)
Rather than instantiating the Thread class, we subclass it. This gives us more flexibility in customizing our threading objects and simplifies the thread creation call.
     1  #!/usr/bin/env python
     2
     3  import threading
     4  from time import sleep, ctime
     5
     6  loops = (4, 2)
     7
     8  class MyThread(threading.Thread):
     9      def __init__(self, func, args, name=''):
    10          threading.Thread.__init__(self)
    11          self.name = name
    12          self.func = func
    13          self.args = args
    14
    15      def run(self):
    16          self.func(*self.args)
    17
    18  def loop(nloop, nsec):
    19      print('start loop', nloop, 'at:', ctime())
    20      sleep(nsec)
    21      print('loop', nloop, 'done at:', ctime())
    22
    23  def main():
    24      print('starting at:', ctime())
    25      threads = []
    26      nloops = range(len(loops))
    27
    28      for i in nloops:
    29          t = MyThread(loop, (i, loops[i]),
    30                       loop.__name__)
    31          threads.append(t)
    32
    33      for i in nloops:
    34          threads[i].start()
    35
    36      for i in nloops:
    37          threads[i].join()
    38
    39      print('all DONE at:', ctime())
    40
    41  if __name__ == '__main__':
    42      main()
Here is the output for mtsleepE.py. Again, it’s just as we expected:
    $ mtsleepE.py
    starting at: Sun Aug 13 19:14:26 2006
    start loop 0 at: Sun Aug 13 19:14:26 2006
    start loop 1 at: Sun Aug 13 19:14:26 2006
    loop 1 done at: Sun Aug 13 19:14:28 2006
    loop 0 done at: Sun Aug 13 19:14:30 2006
    all DONE at: Sun Aug 13 19:14:30 2006
While you compare the source between the mtsleepD and mtsleepE modules, we want to point out the most significant changes: 1) our MyThread subclass constructor must first invoke the base class constructor (line 10), and 2) the former special method __call__() must be called run() in the subclass.
We now modify our MyThread class with some diagnostic output and store it in a separate module called myThread (look ahead to Example 4-7) and import this class for the upcoming examples. Rather than simply calling our functions, we also save the result to instance attribute self.res, and create a new method to retrieve that value, getResult().
Example 4-7. MyThread Subclass of Thread (myThread.py)
To generalize our subclass of Thread from mtsleepE.py, we move the subclass to a separate module and add a getResult() method for callables that produce return values.
    #!/usr/bin/env python

    import threading
    from time import ctime

    class MyThread(threading.Thread):
        def __init__(self, func, args, name=''):
            threading.Thread.__init__(self)
            self.name = name
            self.func = func
            self.args = args

        def getResult(self):
            # only meaningful once run() has finished (for example, after join())
            return self.res

        def run(self):
            print('starting', self.name, 'at:', ctime())
            self.res = self.func(*self.args)  # save the return value
            print(self.name, 'finished at:', ctime())
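A possible use of this class is sketched below, assuming Python 3. To keep the sketch self-contained, MyThread is repeated here rather than imported from myThread, and the square() and total() functions are hypothetical stand-ins for real work. Each thread's return value is collected with getResult() after join().

```python
import threading
from time import ctime

class MyThread(threading.Thread):
    """Same shape as the class in myThread.py, repeated so this runs alone."""

    def __init__(self, func, args, name=''):
        threading.Thread.__init__(self)
        self.name = name
        self.func = func
        self.args = args

    def getResult(self):
        return self.res

    def run(self):
        print('starting', self.name, 'at:', ctime())
        self.res = self.func(*self.args)
        print(self.name, 'finished at:', ctime())

def square(n):   # hypothetical work function
    return n * n

def total(n):    # hypothetical work function: 0 + 1 + ... + n
    return sum(range(n + 1))

jobs = [(square, (12,)), (total, (100,))]
threads = [MyThread(func, args, func.__name__) for func, args in jobs]

for t in threads:
    t.start()
for t in threads:
    t.join()     # run() must finish before getResult() has anything to return

results = {t.name: t.getResult() for t in threads}
print(results)   # {'square': 144, 'total': 5050}
```

Calling getResult() before the thread has run would raise an AttributeError, because self.res is only assigned inside run().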
4.5.2. Other Threading Module Functions
In addition to the various synchronization and threading objects, the threading module also has some supporting functions, as detailed in Table 4-4.
Table 4-4. threading Module Functions
| Function | Description |
| --- | --- |
| activeCount()/active_count() | Number of currently active Thread objects (the camelCase name is the legacy spelling, superseded in Python 2.6) |
| currentThread()/current_thread() | Returns the current Thread object (the camelCase name is the legacy spelling, superseded in Python 2.6) |
| enumerate() | Returns a list of all currently active Threads |
| settrace(func) | Sets a trace function for all threads (new in Python 2.3) |
| setprofile(func) | Sets a profile function for all threads (new in Python 2.3) |
| stack_size(size=0) | Returns the stack size of newly created threads; an optional size can be set for subsequently created threads (new in Python 2.5) |
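The first three functions in Table 4-4 can be tried out with a short sketch such as this one, assuming Python 3. The worker names and the use of an Event to hold the workers in place are choices made for the illustration.

```python
import threading

release = threading.Event()   # used only to keep the workers alive for the demo

def wait_for_release():
    release.wait()            # block until the main thread calls release.set()

before = threading.active_count()

workers = [threading.Thread(target=wait_for_release, name='worker-%d' % i)
           for i in range(3)]
for w in workers:
    w.start()

during = threading.active_count()             # three more than before
names = sorted(t.name for t in threading.enumerate()
               if t.name.startswith('worker-'))
caller = threading.current_thread().name      # name of the thread running this code

release.set()                                 # let the workers finish
for w in workers:
    w.join()
after = threading.active_count()              # back to the starting count

print(before, during, after)
print(names)
print(caller)
```

When run as a script, the caller is the main thread, which enumerate() also includes in its list alongside the workers.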