Concurrent Programming in Java: Creating Threads
new Thread(aRunnable).start();
Is this a fancy way to invoke a method (i.e., a Runnable's run method), or is it a way to create a fancy object (i.e., a new instance of class Thread)? Clearly it is both, but focusing on one aspect versus the other leads to two approaches to using threads that were implicit in discussions in Chapter 1:
Task-based. Here, the main reason to use a thread is to asynchronously invoke a method that performs some task. The task might range from a single method to an entire session. Thread-based techniques can support message-passing schemes that escape the limitations of pure procedural calls. Task-based designs are seen in event frameworks, parallel computation, and IO-intensive systems.
Actor-based. Here, the main reason to use a thread is to create and set into motion a new autonomous, active, process-like object. This object may in turn react to external events, interact with other actors, and so on. Actor-based designs are seen in reactive, control, and distributed systems. They are also the focus of most formal approaches to concurrency.
(Both the terms task and actor have many overloaded meanings and near-synonyms. We'll confine usage to the above senses.)
In task-based systems, passive objects sometimes send active (thread-propelled) messages, while in actor-based systems, active objects normally send passive messages. As is usually the case for artificial dichotomies, neither approach is always best, and there is a huge middle ground that can be designed from either or both perspectives.
Actor-based approaches are commonly used in the construction of daemons that interact with other systems. They are also employed when defining intrinsically active entities, for example the GamePlayer in 3.2.4. Their main methods often take a reactive looping form:
for(;;) { acceptAndProcessCommand(); }
Task-based approaches are commonly used when there is some conceptual or performance-based reason to execute a given task, service, or computation asynchronously rather than relying on direct procedural invocation. Task-based designs provide a separation of concerns between logical asynchrony and mappings to threads and thread-based constructions. They receive the bulk of discussion in this chapter.
As an initial example, here is one way to approach a common thread-based design, a web service. Here, a running WebService is a "daemon process" actor-style thread — it continuously interacts with its environment by listening for new incoming requests. But invocations to handler.process are issued in a task-based manner — a new task is set in motion to handle each incoming request. Here, for the sake of concise illustration, the request is simply a number, and the handler just returns the negation of the number back to the client.
class WebService implements Runnable { static final int PORT = 1040; // just for demo Handler handler = new Handler(); public void run() { try { ServerSocket socket = new ServerSocket(PORT); for (;;) { final Socket connection = socket.accept(); new Thread(new Runnable() { public void run() { handler.process(connection); }}).start(); } } catch(Exception e) { } // die } public static void main(String[ ] args) { new Thread(new WebService()).start(); } }class Handler { void process(Socket s) { DataInputStream in = null; DataOutputStream out = null; try { in = new DataInputStream(s.getInputStream()); out = new DataOutputStream(s.getOutputStream()); int request = in.readInt(); int result = -request; // return negation to client out.writeInt(result); } catch(IOException ex) {} // fall through finally { // clean up try { if (in != null) in.close(); } catch (IOException ignore) {} try { if (out != null) out.close(); } catch (IOException ignore) {} try { s.close(); } catch (IOException ignore) {} } } }
This chapter divides coverage of thread construction and structuring techniques as follows:
4.1 presents a series of options for implementing conceptually oneway messages, sometimes by asynchronously initiating tasks using threads or thread-based lightweight execution frameworks.
4.2 discusses the design of systems in which networks of components employ oneway messaging strategies.
4.3 presents alternatives for constructing threads that compute results or provide services to clients that initiate them.
4.4 examines problem decomposition techniques that can be used to improve performance by exploiting multiprocessors.
4.5 provides an overview of constructs and frameworks for designing systems of active objects, illustrated in part using CSP.
Many of the designs presented in this chapter straddle the borders among concurrent, distributed, and parallel programming. Presentations focus on concurrent, single-JVM solutions. But they include constructions often seen when developing the plumbing support for systems and frameworks involving multiple processes or computers.
4.1 Oneway Messages
A host object issues a logically oneway message to one or more recipients without depending on the consequences of that message. Sending a oneway message somehow results in some task being performed. The task might consist of only a single line of code, or might represent a session that entails acquisition of many resources and hours of computation. But the outcome of the thread issuing a oneway message does not rely on the task's outcome, or on when the task completes, or (normally) on whether it ever completes. Common examples include:
Events |
Mouse clicks, etc. |
Notifications |
Status change alerts |
Postings |
Mail messages, stock quotes, etc. |
Activations |
Creating Applets, daemons, etc. |
Commands |
Print requests, etc. |
Relays |
Message forwardings and dispatchings |
Oneway interactions between senders and recipients need not be strictly asynchronous. For example, the sender may be responsible for ensuring that a recipient actually receives the message. Also, the sender or another object may later wish to cancel or roll back the effects of the resulting task (which is of course not always possible, for example if the task has already completed — see 3.1.2).
If every task could run instantaneously, you might trigger oneway messages via procedural invocations in which the caller waits out the task triggered by the message, even though it has no reason to do so. But there are often performance-based, conceptual, and logistical reasons to issue some of these messages via thread-based constructions in which the associated tasks proceed independently.
4.1.1 Message Formats
Many different styles of invocation are encompassed under the notion of oneway message passing. While some of them are more closely associated with distributed or multiprocess applications (see 1.2.2), any of them can be used in conjunction with the constructions discussed in this section. In addition to direct method invocations, message formats may include:
Command strings. The recipient must parse, decode, and then dispatch the associated task. Command string messages are widely used in socket-based and pipe-based communication, especially in web services.
Event objects. The message contains a structured description of an event. The recipient then dispatches some arbitrary handling task that it associates with the event. Event objects are used extensively in GUI frameworks such as java.awt, as well as component frameworks supported by java.beans.
Request objects. The message contains an encoding of a method name and (marshalled or serialized) arguments. The recipient issues the corresponding method call to a helper object that performs this method. Request objects are used in distributed object support systems such as those in java.rmi and org.omg.corba. Variants are used in Ada tasking.
Class objects. The message is a representation of a class (for example via a .class file) which the recipient then instantiates. This scheme is used in the java.applet framework, as well as in remote activation protocols.
Runnable objects. The message consists of some code that the recipient executes. Mixed forms of runnable events (which include both an event description and an associated action) are used in some event frameworks. Extended forms employing serialized runnable objects are seen in mobile agent frameworks.
Arbitrary objects. A sender may treat any kind of object as a message by including it as method argument or passing it through a Channel (see 4.2.1). For example, in the JavaSpaces™ framework, senders may post any serialized object as a message (also known as an entry). Recipients accept only those entries with types and field values that conform to a specified set of matching criteria. Recipients then process these objects in any appropriate manner.
Differences among these formats reflect (among other things) how much the caller knows about the code the recipient needs to run to perform its task. It is often both most convenient and most efficient to use runnable objects, especially in thread-based frameworks that use instances of class Runnable as arguments in Thread constructors. We'll focus on this form, but occasionally illustrate others.
4.1.2 Open Calls
Consider the central Host object in a call chain in which the Host receives req requests from any number of Clients and, in the course of processing them, must issue logically oneway handle messages to one or more Helper objects. Again, we'll ignore the facts that an arbitrary amount of effort might be needed to decode the request before acting upon it, that the request might actually be read from a socket as seen in the WebService class, and so on. Also, all classes discussed in this section can be extended to issue multicasts to multiple helpers using the constructions described in 2.4.4 and 3.5.2.
The main design force here is latency. If a Host is busy servicing requests, then it cannot accept new ones. This adds response time to new requests from Clients, reducing overall service availability.
Some aspects of latency can be addressed simply by using the pass-through and open call designs described in 2.4:
class OpenCallHost { // Generic code sketch protected long localState; protected final Helper helper = new Helper(); protected synchronized void updateState(...) { localState = ...; } public void req(...) { updateState(...); helper.handle(...); } }
Here, even if the helper.handle call is relatively time-consuming, the Host object will still be able to accept new requests from clients running in different threads. The request acceptance rate is bounded only by the time it takes to update local state.
The use of open calls typically eliminates bottleneck points surrounding a given Host, but does not address the broader question of how to introduce concurrency into a system to begin with. Open calls are useful only when clients somehow already know enough to use some other approach that permits independent execution when necessary or desired.
4.1.3 Thread-Per-Message
Concurrency can be introduced into oneway messaging designs by issuing a message in its own thread, as in:
class ThreadPerMessageHost { // Generic code sketch protected long localState; protected final Helper helper = new Helper(); protected synchronized void updateState() { localState = ...; } public void req(...) { updateState(...); new Thread(new Runnable() { public void run() { helper.handle(...); } }).start(); } }
This strategy improves throughput when multiple parallel tasks can run faster than a sequence of them could, normally because they are either IO-bound or are compute-bound and running on a multiprocessor. It can also enhance fairness and improve availability if clients need not wait for each other's tasks to complete.
Decisions about whether to create and start threads to perform tasks are not too different from decisions about whether to create other kinds of objects or send other kinds of messages: The benefits must outweigh the costs.
Thread-per-message designs introduce response latency because thread creation is more expensive than direct method invocation. When tasks are time-consuming compared to thread construction time, are session-based, need to be isolated from other independent activities, or can exploit IO or CPU parallelism, the trade-offs are generally worth it. But performance problems can emerge even when construction latencies are acceptable. The JVM implementation and/or operating system may not respond well to the construction of too many threads. For example, they may run out of system resources associated with threads. Also, as the number of threads increases, thread scheduling and context switching overhead can overwhelm processing times.
4.1.3.1 Executors
The coding style seen in class ThreadPerMessage can become a problem because of its direct reliance on class Thread. Such usages can make it more difficult to adjust thread initialization parameters, as well as thread-specific data (see 2.3.2) used across an application. This can be avoided by creating an interface, say:
interface Executor { void execute(Runnable r); }
This interface can be implemented with classes such as:
class PlainThreadExecutor implements Executor { public void execute(Runnable r) { new Thread(r).start(); } }
These implementations may be used in classes such as:
class HostWithExecutor { // Generic code sketch protected long localState; protected final Helper helper = new Helper(); protected final Executor executor; public HostWithExecutor(Executor e) { executor = e; } protected synchronized void updateState(...) { localState = ...; } public void req(...) { updateState(...); executor.execute(new Runnable() { public void run() { helper.handle(...); } }); } }
The use of such interfaces also permits replacement of threads with lightweight executable frameworks.
4.1.4 Worker Threads
Lightweight executable frameworks fill the gap between open calls and thread-per-message designs. They apply when you need to introduce limited concurrency, at the expense of some usage restrictions, in order to maximize (or at least improve) throughput and minimize average latencies.
Lightweight executable frameworks can be constructed in many ways, but all stem from the basic idea of using one thread to execute many unrelated tasks (here, in succession). These threads are known as worker threads, background threads, and as thread pools when more than one thread is used.
Each worker continually accepts new Runnable commands from hosts and holds them in some kind of Channel (a queue, buffer, etc. — see 3.4.1) until they can be run. This design has the classic form of a producer-consumer relationship: the host produces tasks and workers consume them by running them.
Lightweight executable frameworks can improve the structure of some task-based concurrent programs, by allowing you to package many smaller, logically asynchronous units of execution as tasks without having to worry much about performance consequences: Entering a Runnable into a queue is likely to be faster than creating a new Thread object. And because you can control the number of worker threads, you can minimize chances of resource exhaustion and reduce context-switching overhead. Explicit queuing also permits greater flexibility in tuning execution semantics. For example, you can implement Channels as priority queues that order tasks with more deterministic control than is guaranteed by Thread.setPriority. (See 4.3.4 for an example.)
To interoperate with pure thread-based versions, worker threads can be packaged as Executors. Here is a generic implementation that could be used in the HostWithExecutor class instead of the thread-per-message version:
class PlainWorkerPool implements Executor { protected final Channel workQueue; public void execute(Runnable r) { try { workQueue.put(r); } catch (InterruptedException ie) { // postpone response Thread.currentThread().interrupt(); } } public PlainWorkerPool(Channel ch, int nworkers) { workQueue = ch; for (int i = 0; i < nworkers; ++i) activate(); } protected void activate() { Runnable runLoop = new Runnable() { public void run() { try { for (;;) { Runnable r = (Runnable)(workQueue.take()); r.run(); } } catch (InterruptedException ie) {} // die } }; new Thread(runLoop).start(); } }
4.1.4.1 Design choices
The first decision to make surrounding lightweight executable frameworks based on worker threads is whether to create or use them at all. The main question is whether there is some property of ordinary Threads that you do not need or are willing to give up. If not, it is unlikely that you will arrive at a solution that outperforms the built-in thread support on production JVM implementations.
The trade-offs that obtain the performance advantages of worker threads have several additional tunable parameters, usage consequences, and programming obligations that can impact the design and use of worker thread classes (including those contained in the util.concurrent package available from the online supplement).
Identity
Most worker threads must be treated "anonymously". Because the same worker thread is reused for multiple tasks, the use of ThreadLocal and other thread-specific contextual control techniques (see 2.3.2) becomes more awkward. To cope with this, you need to know about all such contextual data, and somehow reset it if necessary upon executing each task. (This includes information about security contexts maintained by run-time support classes.) However, most lightweight executable frameworks avoid any reliance on thread-specific techniques.
If identity is the only property of threads you are willing to give up, then the only potential performance value of worker threads is minimization of start-up overhead by reusing existing threads to execute multiple Runnable tasks, while still possibly bounding resource consumption.
Queuing
Runnable tasks that are sitting in queues do not run. This is one source of performance benefits in most worker-thread designs — if each action were associated with a thread, it would need to be independently scheduled by the JVM. But as a consequence, queued execution cannot in general be used when there are any dependencies among tasks. If a currently running task blocks waiting for a condition produced by a task still waiting in the queue, the system may freeze up. Options here include:
Use as many worker threads as there are simultaneously executing tasks. In this case, the Channel need not perform any queuing, so you can use SynchronousChannels (see 3.4.1.4), queueless channels that require each put to wait for a take and vice versa. Here, the host objects merely hand off tasks to worker threads, which immediately start executing them. For this to work well, worker thread pools should be dynamically expandable.
Restrict usage to contexts in which task dependencies are impossible, for example in HTTP servers where each message is issued by an unrelated external client requesting a file. Require the helper objects to create actual Threads when they cannot ensure independence.
Create custom queues that understand the dependencies among the particular kinds of tasks being processed by the worker threads. For example, most pools used for processing tasks representing transactions (see 3.6) must keep track of transaction dependencies. And the lightweight parallel framework described in 4.4.1 relies on special queuing policies that apply only to subtasks created in divide-and-conquer algorithms.
Saturation
As the request rate increases, a worker pool will eventually become saturated. All worker threads will be processing tasks and the Host object(s) using the pool will be unable to hand off work. Possible responses include:
Increase the pool size. In many applications, bounds are heuristic estimates. If a bound is just a guess based on values shown to work well on a particular platform under test workloads, it can be increased. At some point, though, one of the other options must be taken unless you can tolerate failure if the JVM runs out of enough resources to construct a new Thread.
If the nature of the service allows it, use an unbounded buffered channel and let requests pile up. This risks potential system failure due to exhaustion of memory, but this takes longer to happen than does resource exhaustion surrounding Thread construction.
Establish a back-pressure notification scheme to ask clients to stop sending so many requests. If the ultimate clients are part of a distributed system, they may be able to use another server instead.
Drop (discard) new requests upon saturation. This can be a good option if you know that clients will retry anyway. However, unless retries are automatic, you need to add callbacks, events, or notifications back to clients to alert them of the drops so that they will know enough to retry (see 4.3.1).
Make room for the new request by dropping old requests that have been queued but not yet run, or even cancelling one or more executing tasks. This preference for new requests over old ones upon saturation sometimes meshes well with usage patterns. For example, in some telecommunications systems, old unserviced tasks are usually requests by clients that have already given up and disconnected.
Block until some thread is available. This can be a good option when handlers are of predictable, short-lived duration, so you can be confident that the wait will unblock without unacceptable delays.
The Host can run the task directly itself, in its current thread. This is often the best default choice. In essence, the Host momentarily becomes single-threaded. The act of servicing the request limits the rate at which it can accept new requests, thus preventing further local breakdowns.
Thread management
The PlainWorkerPool class is somewhat wasteful because it creates all worker threads upon start-up, whether they are needed or not, and lets them all live on indefinitely, even when the service is not being used. These problems can be alleviated by using a management class that supports:
Lazy construction: Activate a new thread only when a request cannot be serviced immediately by an existing idle thread. Lazy construction allows users to provide large enough pool size limits to avoid underutilization problems occurring when fewer threads are running than a given computer can handle. This comes at the minor expense of occasionally higher latencies when a new request causes a new thread to be created. The start-up effects of lazy construction can be tempered by creating a small number of "warm" threads upon construction of the pool.
Idle time-outs: Allow threads to time out waiting for work and to terminate upon time-out. This eventually causes all workers to exit if the pool is not used for prolonged periods. When coupled with lazy construction, these dead threads will be replaced with new ones if the request rate later increases.
In heavily resource-conscious applications, you may also associate other resources (such as sets of reusable graphical objects) with each worker thread, thus combining resource pools (see 3.4.1.2) with thread pools.
Cancellation
You may need to distinguish cancellation (see 3.1.2) of a task from cancellation of the worker thread performing that task. One approach is:
Upon interruption, allow the current worker thread to die, but replace it if necessary with a fresh worker thread if the work queue is not empty or when a new incoming task arrives.
Provide a shutdown method in the worker thread class that causes existing workers to die and no additional workers to be created.
Additionally, you may need to trigger some kind of error handling if a Host thread is cancelled during a task hand-off. While the silent swallowing of InterruptedException without queuing a task seen in PlainWorkerPool conforms to the minimal requirements of oneway message-passing frameworks, most applications need to take other remedial actions.
4.1.4.2 Event queues
Many event-based frameworks (including the ones supported in the java.awt and javax.swing packages) rely on designs in which exactly one worker thread operates on an unbounded queue. The queue holds instances of EventObject that must be dispatched (as opposed to Runnable objects that self-dispatch), normally to listener objects defined by the application. Often the listeners are the same objects as those that initially generate events.
The use of a single thread operating on a single event queue simplifies usage compared to general worker-thread designs, but also imposes some limitations that are characteristic of event frameworks:
The ordering properties of a queue can be exploited to optimize handling. For example, automatic event-filtering techniques can be used to remove or combine duplicate repaint events for the same screen area before they hit the front of the queue and are taken by the worker thread.
You can require that all methods operating on certain objects be invoked only by issuing events onto the queue, and are thus ultimately performed by the single worker thread. This results in a form of thread confinement (see 2.3.2) of these objects. If flawlessly adhered to, this eliminates the need for dynamic locking within operations on these objects, thus improving performance. This can also reduce complexity for applications that do not otherwise need to construct threads.
This is the basis for the Swing single-thread rule: With only a few exceptions, all manipulation of Swing objects must be performed by the event handler thread. While not stated in AWT, it is good idea to observe this rule there as well.
Events should not be enabled until their handlers are fully constructed and are thus ready to handle events. This holds as well for other thread-based designs (see 2.2.7), but is a more common source of error here because registering an event handler or listener inside its constructor is not as obvious a way to prematurely enable concurrent execution as is constructing a thread.
Users of the event framework must never dispatch actions that block in ways that can unblock only as a result of handling a future event. This problem is encountered when implementing modal dialogs in most event frameworks, and requires an ad-hoc solution. However, more localized solutions can be obtained merely by setting a disabled state for interactive components that should not be used until a certain re-enabling event is received. This avoids blocking the event queue without allowing undesired actions to be triggered.
Further, to maintain responsiveness of the event framework, actions should not block at all, and should not perform time-consuming operations.
This set of design choices causes event frameworks to have much better performance than would thread-per-event designs, and makes them simpler to program by developers who do not otherwise use threads. However, the usage restrictions have more impact in programs that do construct other threads. For example, because of the single-thread rule, even the smallest manipulations of GUI components (such as changing the text in a label) must be performed by issuing runnable event objects that encapsulate an action to be performed by the event handler thread.
In Swing and AWT applications, the methods javax.swing.SwingUtilities.invokeLater and java.awt.EventQueue.invokeLater can be used to execute display-related commands in the event handler thread. These methods create runnable event objects that are executed when taken from the queue. The online supplement contains links to a SwingWorker utility class that partially automates conformance to these rules for threads that produce results leading to screen updates.
4.1.4.3 Timers
The fact that Runnable tasks in worker thread designs may sit queued without running is a problem to be worked around in some applications. But it sometimes becomes a feature when actions are intended to be delayed.
The use of worker threads can both improve efficiency and simplify usage of delayed and periodic actions — those triggered at certain times, after certain delays, or at regular intervals (for example, every day at noon). A standardized timer facility can both automate messy timing calculations and avoid excess thread construction by reusing worker threads. The main trade-off is that if a worker blocks or takes a long time processing one task, the triggering of others may become delayed longer than they would be if separate Threads are created and scheduled by the underlying JVM.
Time-based daemons can be constructed as variants of the basic worker thread design described in 4.1.4.1. For example, here are the highlights of a version that relies on an unshown priority queue class (that might take a form similar to the scheduling queue illustrated in 4.3.4) and is set up to support only one worker thread:
class TimerDaemon { // Fragments static class TimerTask implements Comparable { // ... final Runnable command; final long execTime; // time to run at public int compareTo(Object x) { long otherExecTime = ((TimerTask)(x)).execTime; return (execTime < otherExecTime) ? -1 : (execTime == otherExecTime)? 0 : 1; } } // a heap or list with methods that preserve // ordering with respect to TimerTask.compareTo static class PriorityQueue { void put(TimerTask t); TimerTask least(); void removeLeast(); boolean isEmpty(); } protected final PriorityQueue pq = new PriorityQueue(); public synchronized void executeAfterDelay(Runnable r,long t){ pq.put(new TimerTask(r, t + System.currentTimeMillis())); notifyAll(); } public synchronized void executeAt(Runnable r, Date time) { pq.put(new TimerTask(r, time.getTime())); notifyAll(); } // wait for and then return next task to run protected synchronized Runnable take() throws InterruptedException { for (;;) { while (pq.isEmpty()) wait(); TimerTask t = pq.least(); long now = System.currentTimeMillis(); long waitTime = now - t.execTime; if (waitTime <= 0) { pq.removeLeast(); return t.command; } else wait(waitTime); } } public TimerDaemon() { activate(); } // only one void activate() { // same as PlainWorkerThread except using above take method } }
The techniques discussed in 3.7 can be used here to improve efficiency of the waiting and notification operations.
This class can be extended to deal with periodic tasks by including additional bookkeeping to requeue them before running them. However, this also requires dealing with the fact that periodically scheduled actions are almost never exactly periodic, in part because timed waits do not necessarily wake up exactly upon the given delays. The main options are either to ignore lags and reschedule by clock time, or to ignore the clock and reschedule the next execution at a fixed delay after starting the current one. Fancier schemes are typically needed for multimedia synchronization — see the Further Readings in 1.3.5.
Timer daemons1 can additionally support methods that cancel delayed or periodic actions. One approach is to have executeAt and other scheduling methods accept or return suitably a reworked TimerTask supporting a cancel method that sets a status flag honored by the worker thread.
4.1.5 Polling and Event-Driven IO
Most worker thread designs rely on blocking channels in which the worker thread waits for incoming commands to run. However, there are a few contexts in which optimistic-style retry loops provide a better solution. Most involve the execution of commands stemming from messages received across IO streams.
It can be a challenge to achieve low latencies and high throughputs in heavily loaded IO-bound systems. The time taken to create a thread that performs an IO-based task adds latency, but most run-time systems are tuned such that, once threads are created, they are very responsive to new inputs arriving on IO streams. On input, they unblock with shorter latencies than you are likely to achieve via other techniques. Especially in the case of socket-based IO, these forces generally favor thread-per-IO-session designs, where a different thread is used (or reused) for each session relying on input from a different connection.
However, as the number of simultaneously active connections climbs, other approaches are (only) sometimes more attractive. Consider for example, a multiplayer game server, or a transaction server, with:
Thousands of simultaneous socket connections that join and leave at a steady rate, for example, as people start and finish playing a game.
Relatively low input rates on any given socket at any given time. However, summing across all connections, the aggregate IO rates may be very high.
Non-trivial computation associated with at least some inputs, for example those that cause global state changes in games.
On large mainframe systems, this kind of problem is sometimes dealt with by creating a special-purpose front-end machine that multiplexes all of the inputs into a single stream that is then dealt with by the main service. The main service is often multithreaded, but its structure is simplified and made more efficient because it does not need to deal with so many apparent clients at a time.
A family of polling and event-driven designs approach such problems without requiring special front ends. While they are not (as of this writing) explicitly supported by the java.io and java.net classes, enough of the ingredients are provided to allow construction of designs that can attain good performance in these kinds of situations. (The designs are analogous to those using socket select and poll operations in other systems and languages.) We'll illustrate with inputs on sockets, but the approach also applies to outputs, to files, and to IO using more exotic devices such as sensors.
4.1.5.1 Event-driven tasks
Many IO-based tasks are initially written in a session-based style (see 2.3.1), continuously pulling commands from sockets and processing them. For example:
class SessionTask implements Runnable { // Generic code sketch protected final Socket socket; protected final InputStream input; SessionTask(Socket s) throws IOException { socket = s; input = socket.getInputStream(); } public void run() { // Normally run in a new thread byte[ ] commandBuffer = new byte[BUFFSIZE]; try { for (;;) { int bytes = input.read(commandBuffer, 0, BUFFSIZE); if (bytes != BUFFSIZE) break; processCommand(commandBuffer, bytes); } } catch (IOException ex) { cleanup(); } finally { try { input.close(); socket.close(); } catch(IOException ignore) {} } } }
To enable many sessions to be handled without using many threads, the tasks first must be refactored into an event-driven style, where an event here signifies IO availability. In this style, a session consists of possibly many executions of its event-triggered task(s), each of which is invoked when input becomes available. Event-driven IO tasks are similar in form to GUI event handlers. A session-based design can be converted into an event-driven form by:
Isolating the basic per-command functionality in a reworked task run method that reads one command and performs the associated action.
Defining the run method so that it can be repeatedly triggered whenever input is available to be read (or an IO exception occurs).
Manually maintaining completion status so that the per-event action is no longer triggered when the session finishes, normally because the input has been exhausted or the connection has been closed.
For example:
class IOEventTask implements Runnable { // Generic code sketch protected final Socket socket; protected final InputStream input; protected volatile boolean done = false; // latches true IOEventTask(Socket s) throws IOException { socket = s; input = socket.getInputStream(); } public void run() { // trigger only when input available if (done) return; byte[ ] commandBuffer = new byte[BUFFSIZE]; try { int bytes = input.read(commandBuffer, 0, BUFFSIZE); if (bytes != BUFFSIZE) done = true; else processCommand(commandBuffer, bytes); } catch (IOException ex) { cleanup(); done = true; } finally { if (!done) return; try { input.close(); socket.close(); } catch(IOException ignore) {} } } // Accessor methods needed by triggering agent: boolean done() { return done; } InputStream input() { return input; } }
4.1.5.2 Triggering
When the events driving each event-driven task are relatively infrequent, a large number of tasks can be processed by a small number of worker threads. The simplest case occurs when the number of worker threads is exactly one. Here, the worker thread repeatedly polls a list of open sockets to see if they have any input available (via InputStream.available) or have encountered other IO-related status changes. If so, the worker executes the associated run method.
This style of worker thread differs from the ones in 4.1.4.1 in that, rather than pulling tasks from a blocking queue and blindly running them, the worker must repeatedly check a list of registered tasks to see if any can be run. It removes each task from the list only when it claims to have completed.
One generic form is:
class PollingWorker implements Runnable { // Incomplete private List tasks = ...; private long sleepTime = ...; void register(IOEventTask t) { tasks.add(t); } void deregister(IOEventTask t) { tasks.remove(t); } public void run() { try { for (;;) { for (Iterator it = tasks.iterator(); it.hasNext();) { IOEventTask t = (IOEventTask)(it.next()); if (t.done()) deregister(t); else { boolean trigger; try { trigger = t.input().available() > 0; } catch (IOException ex) { trigger = true; // trigger if exception on check } if (trigger) t.run(); } } Thread.sleep(sleepTime); // pause between sweeps } } catch (InterruptedException ie) {} } }
Several design concerns arise here:
Polling intrinsically relies on busy-wait loops (see 3.2.6), which are intrinsically wasteful (but still sometimes less so than context-switching). Coping with this requires empirically guided decisions about how to insert sleeps, yields, or alternative actions to strike a balance between conserving CPU time and maintaining acceptable average response latencies.
Performance is very sensitive to the characteristics of the underlying data structure maintaining the list of registered tasks. If new tasks come and go regularly, the list of tasks can change fairly frequently. In this case, schemes such as copy-on-write (see 2.4.4) usually do not work well. But there is every reason to make traversal of the list as cheap as possible. One approach is to maintain a cached list for traversal and to update it (if necessary) only at the end of each sweep.
Event-driven tasks should be triggered only when they have enough data to perform their associated actions. However, in many applications (for example those using free-form string-based commands), the minimal amount of data needed for triggering is not known in advance. In practice (as illustrated here), it usually suffices just to check that at least one byte is available. This exploits the fact that socket-based clients send packets — normally each packet contains an entire command. However, when commands do not arrive as units, the worker thread can stall, thus increasing latencies of other tasks unless buffering schemes are added.
A single worker thread is not likely to be acceptable if some inputs lead to time-consuming computations or blocking IO. One solution is to require that such computations be performed in new threads or by separate worker thread pools. However, it is sometimes more efficient instead to employ multiple polling worker threads; enough so that on average there will always be a thread polling for inputs.
The use of multiple polling worker threads requires additional coordination to make sure that two workers are not both trying to run the same task at the same time, without otherwise impeding each other's sweeps through the list of tasks. One approach is to have task classes set and honor busy status, for example, via testAndSet (see 3.5.1.4).
Given these concerns and the context dependence of the associated design decisions, it is not surprising that most frameworks are custom-built to suit the demands of particular applications. However, the util.concurrent package available from the online supplement includes some utilities that can be used to help build standardized solutions.
4.1.6 Further Readings
Most details about messages, formats, transports, etc., used in practice are specific to particular packages and systems, so the best sources are their accompanying manuals and documentation.
Discussions of message passing in distributed systems can be found in the sources listed in 1.2.5. Any of several packages and frameworks can be used to extend the techniques discussed here to apply in distributed contexts. For example, most of these designs (as well as most in 4.2 and elsewhere in this book) can be adapted for use in JavaSpaces. Conversely, many distributed message passing techniques can be scaled down to apply in concurrent, non-distributed settings.
Design and implementation using JavaSpaces is discussed in:
Freeman, Eric, Susan Hupfer, and Ken Arnold. JavaSpaces™: Principles, Patterns, and Practice, Addison-Wesley, 1999.
For different approaches, see for example the Aleph, JMS, and Ninja packages, accessible via links from the online supplement. Many commercial distributed systems are based on CORBA and related frameworks, which also include some support for oneway message passing. See:
Henning, Michi, and Steve Vinoski. Advanced CORBA Programming with C++, Addison-Wesley, 1999.
Pope, Alan. The CORBA Reference Guide, Addison-Wesley, 1998.
Some systems-level oneway messaging strategies otherwise similar to those presented here are described in:
Langendoen, Koen, Raoul Bhoedjang, and Henri Bal. "Models for Asynchronous Message Handling", IEEE Concurrency, April-June 1997.
An argument that single-queue, single-thread event frameworks are a better basis for application programming than thread-based frameworks may be found in:
Ousterhout, John. "Why Threads Are a Bad Idea (For Most Purposes)", USENIX Technical Conference, 1996.