The Load Generators
Load generators (part of a performance test tool, as discussed in Chapter 8) generate the test environment "traffic." A performance test requires hundreds or thousands of users to simulate production conditions. The load generator uses prewritten test scripts to simulate the users and their activities at the web site. Generating large numbers of "virtual users" requires the right supporting equipment. In the field, we often find test teams running expensive, state-of-the-art performance test tools on old, underpowered PCs retrieved from storage. Without sufficient supporting equipment, the best tools on the market cannot generate sufficient load to properly test the web site. If the load generator cannot provide sufficient load to the web site, the web site never achieves the target throughput. Often the test team misreads the symptoms of this condition and spends weeks tuning the web site when, in fact, they need to add capacity to the load generator machines.
Likewise, the test team needs a network analysis of the traffic generated by each client machine. This includes inbound and outbound requests, just as we discussed earlier in this chapter. The traffic generated by the test tools sometimes overloads the network subnet supporting the client machines. Also, the traffic burden sometimes overloads the network cards in individual client machines.
In short, take the load generator environment seriously. The load generator requires more capacity than most teams originally estimate. If the client machines reach 75% CPU utilization or the network traffic passes the safety threshold, increase capacity on these devices. You cannot drive load against your web site if the load cannot make it to the intended servers.
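As a rough illustration of the capacity rule above, the following Python sketch flags load generator machines that have reached the 75% CPU mark or exceeded a network safety threshold. The `ClientSample` type, the host names, and the sample readings are hypothetical, and the network threshold is passed in as a parameter because its value depends on your own network analysis. How you collect the readings (test tool reports, operating system monitors) is left open.

```python
from dataclasses import dataclass
from typing import List, Tuple

CPU_LIMIT_PERCENT = 75.0  # CPU utilization limit taken from the text


@dataclass
class ClientSample:
    """One utilization reading from a load generator machine (hypothetical type)."""
    host: str
    cpu_percent: float      # CPU utilization, 0-100
    network_percent: float  # network utilization, 0-100


def overloaded_clients(
    samples: List[ClientSample], network_limit_percent: float
) -> List[Tuple[str, List[str]]]:
    """Return (host, reasons) for every client that needs more capacity."""
    flagged = []
    for sample in samples:
        reasons = []
        # The text says to act once a client *reaches* 75% CPU.
        if sample.cpu_percent >= CPU_LIMIT_PERCENT:
            reasons.append("cpu")
        # The network limit is an assumption supplied by the caller.
        if sample.network_percent > network_limit_percent:
            reasons.append("network")
        if reasons:
            flagged.append((sample.host, reasons))
    return flagged


if __name__ == "__main__":
    readings = [
        ClientSample("gen1", 62.0, 20.0),
        ClientSample("gen2", 81.5, 20.0),
        ClientSample("gen3", 50.0, 55.0),
    ]
    # 40.0 is an illustrative network threshold, not a recommendation.
    for host, reasons in overloaded_clients(readings, 40.0):
        print(host, "needs more capacity:", ", ".join(reasons))
```

Any machine this check flags is a sign that the test results may reflect the limits of the load generator rather than the limits of the web site.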
Master/Slave Configurations
Many industrial performance tools use the master/slave test configuration, shown in Figure 9.6. The "master" machine manages each test and collects data. The "slaves" actually manage the threads and sockets representing the virtual users, and run the corresponding test scripts. For extremely small configurations (10 to 20 virtual users), the master and slave both run on the same machine. However, for larger tests, the master runs on a different machine, with one or more slaves also running on different boxes.
Figure 9.6 Master/slave load generator configuration
The best practices to remember with a master/slave configuration include the following:
Keep traffic between the master and slaves to a minimum. Many performance test tools allow the test manager to define the frequency with which updates and status travel between the slave and the master. If you're using a new performance test tool for the first time, use a protocol analyzer to determine the actual network burden on the client subnet.
The master (also known as the "controller" or "coordinator" machine) often requires less CPU capacity than the slave machines. However, if it stores reports for the test cluster, it may require a larger hard drive than a slave machine.
Watch the CPU on all test machines during the performance test. If CPU utilization exceeds 75%, the machine needs more capacity.
Watch the logs and hard disk on the client machines. Often the clients accumulate large log files and pass these logs back to the master after the test completes. Frequent logging, of course, increases the disk I/O burden for the client machine and impacts testing. Likewise, if these logs accumulate over time, the client machines may not have enough room for subsequent runs. Before starting a run, make sure the client machines contain sufficient free disk space for any logging they may perform.
Recycle the machines frequently. The slaves and the controller sometimes throw odd errors or stop responding after several testing cycles. In general, we find it best to recycle the test machines once or twice a day during periods of extended testing.
Try to keep the test slave hardware homogeneous. Often, because of load balancing techniques, one or two client machines may drive all the load a server in a clustered environment receives. If the test cluster contains one machine significantly more powerful than the others, some servers in the web site cluster may not achieve full loading.
It's often useful to simulate traffic from a number of different client IP addresses, especially when performing a test that utilizes an IP sprayer in front of several HTTP servers. Web sites often configure IP sprayers for affinity routing between the incoming user and a particular server in the cluster.
You need sufficient test client machines, NICs, and supporting performance test software to make IP affinity work during your testing. See Chapter 8 for more details.
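To see why the number of client IP addresses matters under affinity routing, consider the following minimal Python sketch. It assumes a simplified sprayer that picks a server by hashing the client IP address. Real IP sprayers use their own algorithms, so the `pick_server` function and the example addresses are illustrative assumptions only. The point it demonstrates holds regardless of the algorithm: with affinity in place, the number of servers receiving load can never exceed the number of distinct client IP addresses.

```python
import hashlib
from collections import Counter
from typing import Iterable


def pick_server(client_ip: str, server_count: int) -> int:
    """Map a client IP to a server index (stand-in for a sprayer's affinity rule).

    The same IP address always maps to the same server.
    """
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big") % server_count


def load_per_server(client_ips: Iterable[str], server_count: int) -> Counter:
    """Count how many client IP addresses land on each server index."""
    return Counter(pick_server(ip, server_count) for ip in client_ips)


if __name__ == "__main__":
    servers = 4

    # Two load generator machines, one IP address each (hypothetical addresses).
    few_clients = ["10.0.0.1", "10.0.0.2"]
    few = load_per_server(few_clients, servers)
    print("servers receiving load with 2 client IPs:", len(few))

    # The same test driven from 200 distinct client IP addresses.
    many_clients = [f"10.0.1.{n}" for n in range(1, 201)]
    many = load_per_server(many_clients, servers)
    print("servers receiving load with 200 client IPs:", len(many))
```

With only two client addresses, at most two of the four servers receive any traffic, no matter how many virtual users each client machine runs.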
After the Performance Test
Many companies treat the performance test environment as a transient entity. The environment exists for a few weeks to test the performance and scalability of the web site. Afterward, the team pulls the test environment apart and uses the components to build other test environments or sends the equipment to production. This works well for most test scenarios, but major production web sites often require a permanent, separate test environment. A permanent test environment allows you to
Safely test new features and bug fixes prior to their introduction into the production web site.
Recreate problems seen on the production site without using production resources. Because of the multi-threaded nature of web applications, some problems only appear under load conditions. You need a reliable test environment to find these problems.
At a minimum, keep enough test client capacity to stress at least one production server. These clients may reside either in the test environment, if one exists, or in the production environment. Ideally, configure these clients to drive load against either the production or the test environment, as needed.
Few companies dedicate test machines to a particular web site or web application, but going without them carries its own risk: if you run into a problem in production, you have no machines set aside where you can immediately try to reproduce the situation. The problem determination cycle often takes much longer when you can only work with the problem in production. Having available machines dedicated to testing and problem determination often makes it simpler and cheaper to recreate and debug production problems.