3.3 Design Concepts to Support Performance
Performance is about time and the software system’s ability to meet timing requirements. Generally speaking, faster is better!
When events occur—interrupts, messages, requests from users or other systems, or clock events—the system must respond to them in time. Characterizing the events that can occur (and when they can occur) and the system’s response to those events is the starting point for discussing performance. All systems have performance requirements, even if they are not explicitly expressed.
Performance is often linked to scalability—increasing the system’s capacity while ensuring that it still performs well. Too often, performance is considered only after a system has been built and its performance found to be inadequate. You can avoid this situation by consciously architecting your system with performance in mind.
3.3.1 Performance Tactics
The performance tactics categorization is shown in Figure 3.4. This set of tactics, like all tactics, helps an architect reason about the quality attribute. There are two major categories of performance tactics: Control Resource Demand and Manage Resources.
FIGURE 3.4 Performance tactics categorization
Within the Control Resource Demand category, the tactics are:
Manage work requests. One way to reduce work is to reduce the number of requests coming into the system. Ways to do that include limiting the number of requests the system will accept in a given time period and managing the sampling rate (e.g., switching to a lower frame rate for streaming video).
Limit event response. When events arrive too rapidly to be processed, they must be queued until they can be processed, or they may be discarded. You may choose to process events only up to a set maximum rate, thereby ensuring more predictable processing for the events that are actually handled.
Prioritize events. If not all events are equally important, you can impose a priority scheme that ranks events according to how urgently you want to service them (a short sketch of this tactic follows this list).
Reduce computational overhead. For events that make it into the system, you can reduce the amount of work in handling each event by reducing indirection (making direct calls rather than going through an intermediary, for example), by co-locating communicating resources, and by periodic cleaning (such as garbage collection).
Bound execution times. You can place a limit on how much execution time is used to respond to an event.
Increase efficiency of resource usage. Improving the efficiency of algorithms and data structures (adding an index to a database table, for example) in critical areas can decrease latency and improve throughput and resource consumption.
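To make the prioritize events tactic concrete, here is a minimal Python sketch of a priority event queue. The event names and priority values are illustrative assumptions, not part of the tactic itself; a real system would also decide what happens to low-priority events that are never reached.

```python
# A minimal sketch of the "prioritize events" tactic: events are queued with a
# priority, and the handler always services the most urgent one first.
import heapq
import itertools

class PriorityEventQueue:
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps same-priority events FIFO

    def publish(self, priority, event):
        # Lower number = more urgent.
        heapq.heappush(self._heap, (priority, next(self._counter), event))

    def next_event(self):
        priority, _, event = heapq.heappop(self._heap)
        return event

if __name__ == "__main__":
    q = PriorityEventQueue()
    q.publish(2, "telemetry update")
    q.publish(0, "collision alarm")   # most urgent, serviced first
    q.publish(1, "operator command")
    while True:
        try:
            print(q.next_event())
        except IndexError:
            break
```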
Within the Manage Resources category, the tactics are:
Increase resources. Faster processors, additional processors, additional memory, and faster networks all have the potential to improve performance.
Introduce concurrency. If requests can be processed in parallel, blocked time can be reduced.
Maintain multiple copies of computations. This tactic reduces the contention that would occur if all requests were allocated to a single instance.
Maintain multiple copies of data. Two common examples of maintaining multiple copies of data are data replication and caching (a caching sketch follows this list).
Bound queue sizes. This tactic controls the maximum number of queued arrivals and consequently the resources used to process the arrivals.
Schedule resources. Whenever contention for a resource occurs, the resource should be scheduled. Your concern is to understand the characteristics of each resource’s use and choose an appropriate scheduling strategy.
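As one concrete illustration of the maintain multiple copies of data tactic, the following Python sketch places an in-memory cache in front of a slow lookup. The fetch_profile function and its artificial delay are hypothetical stand-ins for a real data store, used only to show the structure of the tactic.

```python
# A minimal sketch of caching: results of a slow lookup are kept in memory so
# that repeated requests avoid the expensive backing store.
import time
from functools import lru_cache

@lru_cache(maxsize=1024)          # cache size bounds the memory spent on copies
def fetch_profile(user_id: int) -> dict:
    time.sleep(0.1)               # simulates a slow database or remote call
    return {"id": user_id, "name": f"user-{user_id}"}

if __name__ == "__main__":
    start = time.perf_counter()
    fetch_profile(42)             # miss: pays the full lookup cost
    fetch_profile(42)             # hit: served from the in-memory copy
    print(f"two lookups took {time.perf_counter() - start:.2f}s")
    print(fetch_profile.cache_info())
```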
These tactics cover the spectrum of architectural concerns with respect to performance. Now we turn our attention to more complex design structures, patterns.
3.3.2 Performance Patterns
In what follows we provide a small selection of architectural patterns that address performance concerns. We make no attempt here to provide a comprehensive catalog; that is not the purpose of this book. In-depth coverage of performance patterns can be found in many other resources. We merely provide a few patterns here to stimulate thinking and to provide examples of the kinds of resources that are available to support the architect in reasoning about design for performance.
3.3.2.1 Load Balancer Pattern
A load balancer is an intermediary that handles messages from clients and determines which instance of a service should respond. The load balancer serves as a single point of contact for incoming messages and farms out requests to a pool of (typically stateless) providers.
By sharing the load among a pool of providers, latency can be kept lower and more predictable for clients. It is a simple matter to add more resources to the resource pool, and no client needs to be aware of this change (an instance of the increase resources, maintain multiple copies of computations, and introduce concurrency tactics). Moreover, any failure of a server is invisible to clients, assuming there are still some remaining processing resources. Although this is an availability benefit, we note once again that patterns often address multiple quality attributes, whereas tactics address just a single quality attribute.
The load balancing algorithm must be very fast; otherwise, it may itself contribute to performance problems. The load balancer is a potential bottleneck or single point of failure, so it is often replicated (and even load balanced).
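To show the structure of the pattern, here is a minimal round-robin load balancer sketch in Python. The worker functions and names are illustrative assumptions; a production balancer would also handle health checks, failure detection, weighting, and concurrent access.

```python
# A minimal sketch of the load balancer pattern: a round-robin intermediary is
# the single point of contact for clients and farms requests out to a pool of
# interchangeable, stateless workers.
import itertools

class RoundRobinLoadBalancer:
    def __init__(self, workers):
        self._workers = list(workers)
        self._next = itertools.cycle(range(len(self._workers)))

    def handle(self, request):
        # Selection must be cheap, or the balancer itself becomes the bottleneck.
        worker = self._workers[next(self._next)]
        return worker(request)

def make_worker(name):
    def worker(request):
        return f"{name} processed {request!r}"
    return worker

if __name__ == "__main__":
    lb = RoundRobinLoadBalancer([make_worker("instance-A"), make_worker("instance-B")])
    for i in range(4):
        print(lb.handle(f"request-{i}"))
```

The point of the sketch is only the shape of the solution: clients see one point of contact while work is spread across interchangeable providers that can be added or removed behind it.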
3.3.2.2 Throttling Pattern
The throttling pattern packages the manage work requests tactic. It is used to limit access to an important service. In this pattern, an intermediary—a throttler—monitors the service and determines whether an incoming request can be serviced. By throttling incoming requests, you can gracefully handle variations in demand. In doing so, services never become overloaded; they can be kept in a performance “sweet spot” where they handle requests efficiently.
But, once again, there are tradeoffs to consider: The throttling logic must be very fast; otherwise, it may itself contribute to performance problems. If client demand regularly exceeds capacity, buffers will need to be large, or you may lose requests. Also, this pattern can be difficult to add to an existing system where clients and servers are tightly coupled.
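The following Python sketch shows one simple way a throttler can sit in front of a service, admitting requests only up to a configured rate and rejecting the excess. The one-request-per-second window, the limit, and the protected_service function are assumptions chosen to keep the example small.

```python
# A minimal sketch of the throttling pattern: an intermediary admits requests
# only up to a configured rate, keeping the protected service near its capacity.
import time

class Throttler:
    def __init__(self, service, max_requests_per_second):
        self._service = service
        self._capacity = max_requests_per_second
        self._window_start = time.monotonic()
        self._count = 0

    def request(self, payload):
        now = time.monotonic()
        if now - self._window_start >= 1.0:      # start a new one-second window
            self._window_start, self._count = now, 0
        if self._count >= self._capacity:
            return None                          # reject (or queue) the excess
        self._count += 1
        return self._service(payload)

def protected_service(payload):
    return f"handled {payload!r}"

if __name__ == "__main__":
    throttler = Throttler(protected_service, max_requests_per_second=3)
    results = [throttler.request(i) for i in range(5)]
    print(results)   # the first three succeed; the rest are rejected this window
```

A production throttler would more likely use a token bucket or sliding window and might queue or retry rejected requests rather than drop them; the fixed window here simply keeps the sketch short while preserving the pattern's intent.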