Measuring End-to-End Availability
To accurately estimate end-to-end application availability as experienced by end users, you must first thoroughly understand the system's configuration; all the components and resources used by the application, both local and remote; and the hardware and software components required to access those resources. Here's an example:
Sales Personnel Call-Management System Configuration
Local resources | Sales personnel data, call reports
Remote resources | Contact management data at each sales rep's computer
Hardware components | Personal computer, LAN adapter, LAN cabling, network switch, print server, network printer
Software components | Windows 98, Microsoft Access, contact-management software, call-management application
The next step is to monitor all these components for outages. If outages on several components overlap in time, count the overlapping period as a single outage rather than adding each component's downtime separately; the user experiences it as one interruption. To calculate end-to-end availability, total the outage time across all components, counting overlaps only once. Then apply the formula presented earlier in this chapter.
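The overlap rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: outages are recorded as (start, end) pairs in minutes from the start of the measurement period, the function names are hypothetical, and the availability formula is assumed to be the common one, (period minus downtime) divided by period, since the chapter's formula is not reproduced in this section.

```python
def merge_outages(outages):
    """Merge overlapping (start, end) outage intervals into disjoint ones."""
    merged = []
    for start, end in sorted(outages):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous outage: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def end_to_end_availability(outages_by_component, period_minutes):
    """Assumed formula: (period - downtime) / period, overlaps counted once."""
    all_outages = [
        interval
        for intervals in outages_by_component.values()
        for interval in intervals
    ]
    downtime = sum(end - start for start, end in merge_outages(all_outages))
    return (period_minutes - downtime) / period_minutes


# Hypothetical outage log, in minutes from the start of the period.
outages = {
    "network switch": [(100, 160)],
    "print server": [(130, 190), (500, 530)],
    "call-management application": [(520, 540)],
}
availability = end_to_end_availability(outages, period_minutes=10_000)
print(availability)  # 130 minutes of downtime in 10,000 -> 0.987
```

Simply adding the four outages would give 170 minutes of downtime; merging the overlaps first gives 130 minutes, which is what the user actually experienced.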
Easy in principle, but taxing in practice? Definitely. That's why you need to automate measurement as much as possible. The simplest way is to use a tool that monitors availability of local and remote resources from a user's PC. This tool regularly attempts to get a response from the resources in question, and records times when critical resources are unavailable. More advanced tools can query an application for problems or execute certain tasks on the application. If the application fails, an outage is recorded. This approach doesn't identify the source of the problem, but the error condition may help support staffers identify the cause.
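A workstation-side monitor of this kind might look roughly like the sketch below. It is not any particular product's method: the function names are hypothetical, and a plain TCP connection attempt stands in for "getting a response" from a resource, which is an assumption. A real tool would more likely issue an application-level request.

```python
import socket
import time


def tcp_probe(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def record_outages(samples):
    """Convert time-ordered (timestamp, is_up) samples into (start, end) outages.

    An outage still open at the last sample is closed at that sample's timestamp.
    """
    outages = []
    down_since = None
    last_ts = None
    for ts, is_up in samples:
        if not is_up and down_since is None:
            down_since = ts                    # outage begins
        elif is_up and down_since is not None:
            outages.append((down_since, ts))   # outage ends
            down_since = None
        last_ts = ts
    if down_since is not None:
        outages.append((down_since, last_ts))
    return outages


def monitor(probe, interval_seconds, rounds):
    """Call probe() at a fixed interval and return the outages observed."""
    samples = []
    for _ in range(rounds):
        samples.append((time.time(), probe()))
        time.sleep(interval_seconds)
    return record_outages(samples)


# Example (hypothetical host name and port):
# outages = monitor(lambda: tcp_probe("print-server.example", 515), 60, 1440)
```

Because the probe only sees whether a response arrived, the recorded outages say when the resource was unreachable from the user's PC, not which component failed.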
There is a great demand for automated end-user system availability monitoring tools: utilities that can be installed on user workstations and periodically test the applications for availability. In the absence of such tools, you would have to resort to random sampling of users' availability experiences.
You won't get precise measurements of every user's availability experience; that's unrealistic. But recognize that users do have availability requirements to which you must pay attention. Don't get too dependent on technical measurements for rating your performance; ultimately, what matters most is that users are happy with the service that the IT organization provides.
Remember that the discussion in this section focuses on how availability is affected by hardware or software outages, because such outages account for most unavailability. But outages aren't the only factor by which a user judges system availability. The system may not be experiencing an outage, but if it's running too slowly, a user may give up waiting and consider the application unavailable.
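One way to reflect this in automated measurement is to treat a response that takes too long as a failed check. The sketch below assumes a hypothetical helper, timed_check, and a response-time limit that you would choose based on how long your users are willing to wait; neither comes from the chapter.

```python
import time


def timed_check(task, max_seconds):
    """Run task() and return (available, elapsed_seconds).

    The check counts as unavailable if task() raises an exception
    or takes longer than max_seconds to finish.
    """
    start = time.monotonic()
    try:
        task()
    except Exception:
        return False, time.monotonic() - start
    elapsed = time.monotonic() - start
    return elapsed <= max_seconds, elapsed


# Example: a stand-in task that takes about 50 ms, checked against a 10 ms limit.
available, elapsed = timed_check(lambda: time.sleep(0.05), max_seconds=0.01)
```

Here the task completes without error, yet the check reports it as unavailable because it exceeded the limit, which matches how an impatient user would judge it.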