Using Scatter Charts to Recognize Patterns in Performance Test Data
I got hooked on scatter charts a couple of years ago after seeing Scott Barber give a talk based on his article "Beyond Performance Testing Part 6: Interpreting Scatter Charts." I thought, "Wow, that’s cool. I want to be able to look at a big glob of data and recognize patterns." The problem was that my scatter charts never looked like his. Of course he could read scatter charts: his applications failed in ways that were easy to identify in a scatter chart!
After a few frustrated email messages to Scott asking for help, I learned that his scatter charts didn’t typically start off looking like that. He had to manipulate them to find the information he wanted. So what’s the missing piece? How can you get your scatter charts to tell you a story? In this article, we’ll take a look at some techniques for manipulating your scatter charts until they do. This article is intended for experienced performance testers who want to identify patterns in performance test data faster.
Why We Need Scatter Charts
Performance problems are difficult to solve for many reasons:
- The tools we use and the applications we test are buggy.
- The results presented by performance tools are often difficult to interpret.
- The number of variables in any performance test (network settings, hardware configurations, application settings, script settings, and so on) is more than a mere mortal can keep in mind at once.
- When we encounter a failure or a slowdown, it’s very difficult to determine where to start tuning.
Enter scatter charts. A scatter chart plots each transaction by when it occurred in the run (x axis) and how long it took to respond (y axis). That is, if a transaction starts 60 seconds into a run and completes 5 seconds later, it will be plotted at point (60, 5) on a standard graph. Figure 1 shows an example.
Figure 1 Example of a scatter chart.
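To make the plotting rule concrete, here is a minimal Python sketch of how one transaction becomes one point. The function name `to_scatter_point` and the timestamp fields are hypothetical, chosen for illustration, and the sketch assumes your tool records a start and end time for each transaction. Some tools plot against the end time instead, so check what yours reports.

```python
# Hypothetical helper: turn raw timestamps (in seconds) into one scatter point.
def to_scatter_point(run_start, txn_start, txn_end):
    elapsed_in_run = txn_start - run_start   # x: when the transaction occurred
    response_time = txn_end - txn_start      # y: how long it took
    return (elapsed_in_run, response_time)

# Made-up sample data: (transaction start, transaction end) timestamps.
run_start = 1000.0
transactions = [(1060.0, 1065.0), (1061.5, 1062.0), (1120.0, 1132.5)]

points = [to_scatter_point(run_start, start, end) for start, end in transactions]
print(points[0])  # the transaction from the example in the text
```

The resulting list of (x, y) pairs can be handed to whatever charting tool you prefer.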
The chart in Figure 1 shows thousands of transactions for a single test. By looking at the x axis, you can see that the test ran for about 7,000 seconds. The y axis tells us that the slowest response time was 1,000 seconds. For this test, the acceptable response time was under 6 seconds. Clearly, we have a problem, but where do we start?
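One possible first step, offered here as an illustrative sketch rather than a prescribed method, is to count how many transactions miss the 6-second target in each slice of the run, so you can see where the slow responses cluster. The 600-second window size, the function name, and the sample points are all assumptions made for the example.

```python
from collections import Counter

THRESHOLD_S = 6.0   # from the text: acceptable response time is under 6 seconds
WINDOW_S = 600.0    # assumed slice width; pick whatever suits your run length

def slow_counts_by_window(points, threshold_s=THRESHOLD_S, window_s=WINDOW_S):
    """Count transactions at or over the threshold, grouped by time window."""
    counts = Counter()
    for elapsed_s, response_s in points:
        if response_s >= threshold_s:
            counts[int(elapsed_s // window_s)] += 1
    return dict(sorted(counts.items()))

# Made-up (elapsed seconds, response seconds) points for illustration only.
points = [(30.0, 1.2), (650.0, 7.5), (700.0, 940.0), (1300.0, 2.0), (6900.0, 6.0)]
slow = slow_counts_by_window(points)
print(slow)  # window index -> number of slow transactions
```

A window with an unusually high count is a reasonable place to start looking, though it is only a hint and not a diagnosis.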
I once heard a performance testing expert refer to scatter charts in the following way: "With an order of magnitude fewer variables it could be a science, but for now there is a heavy reliance on the human brain to draw relationships based on past experience." That sums up scatter chart analysis nicely. So how do you take that chart and turn it into something useful? You start by developing an understanding of what you’re looking at, and then you manipulate the data until it starts to make sense based on your understanding of the context.
Scatter charts are good for identifying patterns in response times over a whole run. You can display response time graphically to highlight instances of poor performance, and you can identify correlations between response times and resource usage over time. The charts are great for technical stakeholders, but not so great for non-technical stakeholders. (If you’re a non-technical stakeholder, you may want to bail now.) They also tend to be less useful for comparing results over multiple runs.