The Recipe
You can follow many different approaches to testing. You can test the rough functionality first and then selectively refine it. You can select a functional, architectural, or design section and dive into it thoroughly. There is value in taking a more ad hoc or random approach akin to manual exploratory testing. Yet another approach guides your testing by metrics, such as defect densities, complexity, or criticality.
This section details an approach, visualized in Figure 3-1, that I have found useful in driving toward high coverage, independent of how you select the code or functionality to test. The approach favors deep over broad testing. It works well when taking a test-driven approach to new code, but also applies well when reproducing bugs, enhancing existing code, or bringing existing code under test.
Figure 3-1: Visualization of the testing recipe
Test the “Happy Path”
The “happy path” of code or functionality is the primary purpose, the main reason the software exists. If you composed a single sentence to describe what the software does, you would describe the happy path.
Testing the happy path lays the foundation on which the rest of your tests are built. It establishes the context in which all of the variations add further value. In the sense of tests as documentation, it expresses the purpose in an executable form that can capture regressions and evolve with the functionality.
The happy path may require several tests to fully verify depending on the scope of the test. Start with the one or two characteristic purposes for the initial tests. For a unit test, a single test should capture it. For a full stack system test, it may require a suite of tests, but breaking that suite down into functional areas should make the task at hand manageable.
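At the unit level, that single test might look like the following sketch, assuming a hypothetical DocumentStore class as the code under test and JUnit as the test framework:

import static org.junit.Assert.assertEquals;

import org.junit.Test;

// Sketch of a happy path test. DocumentStore is a hypothetical class under
// test whose single characteristic purpose is to save and retrieve content.
public class DocumentStoreHappyPathTest {
    @Test
    public void savesAndRetrievesADocument() {
        DocumentStore store = new DocumentStore();

        store.save("report.txt", "quarterly numbers");

        assertEquals("quarterly numbers", store.load("report.txt"));
    }
}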
Test the Alternate Paths
Once you have established that the primary functionality works as intended, you can tackle the useful variations of normal behavior. For example, if the primary functionality was to save a file, special accommodations for network file systems might be a good alternate path. At a unit-test level, you might make sure that an event-processing loop functions properly when no events are queued.
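That empty-queue case might be captured with a sketch like the following, where EventQueue and EventLoop are hypothetical names and JUnit is assumed:

import static org.junit.Assert.assertEquals;

import org.junit.Test;

// Sketch of an alternate path test. EventQueue and EventLoop are
// hypothetical: the loop drains the queue and reports how many events
// it processed.
public class EventLoopAlternatePathTest {
    @Test
    public void returnsWithoutWorkWhenNoEventsAreQueued() {
        EventLoop loop = new EventLoop(new EventQueue());

        int processed = loop.processPending();

        // The loop should return cleanly rather than block or throw.
        assertEquals(0, processed);
    }
}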
At this point, your coverage targets should guide your thoroughness. If you are doing coverage-driven unit testing, then you will want to test exhaustively. You will most likely direct yourself by a sense of functional coverage for full stack system tests. Subsystem integration tests will strive for a more local definition of completeness.
Test the Error Paths
Many people stop before testing the error handling of their software. Unfortunately, much of the perception of software quality is forged not by whether the software fails, because it eventually will, but by how it handles those failures. The world is full of unexpected occurrences. Even if your software runs on stand-alone, hardened computers, it will eventually fail. Power fluctuations, magnetic interference, and component failure are just a few of the many things that can happen, often in a cascading chain.
Error-handling verification ensures that your responses to the deviant variations in your environment are deliberate rather than accidental. Deliberate error handling should give your user the experience you desire and, hopefully, the one they desire as well.
Many organizations skip or skimp on error-path testing because of the difficulties involved. Generally, inducing errors in larger scopes is harder than in smaller ones: simulating network errors at the class level is much easier than for a full stack application. Making error handling an architectural concern, with clearly defined guidelines for how components participate in that framework, spells out the intent to verify at the lower levels and allows you to focus on the correctness of the local behaviors.
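As an illustration, the following sketch simulates a network failure at the class level by handing the code under test a stubbed connection. FileSaver is a hypothetical class under test, the Connection interface is declared locally only to keep the example self-contained, and JUnit is assumed:

import static org.junit.Assert.assertFalse;

import java.io.IOException;

import org.junit.Test;

// Sketch of an error path test. In real code, Connection and FileSaver
// (the hypothetical class under test) would live in production code.
public class FileSaverErrorPathTest {
    interface Connection {
        void write(byte[] data) throws IOException;
    }

    private static class FailingConnection implements Connection {
        @Override
        public void write(byte[] data) throws IOException {
            throw new IOException("simulated network failure");
        }
    }

    @Test
    public void reportsFailureWhenTheNetworkWriteFails() {
        FileSaver saver = new FileSaver(new FailingConnection());

        // Deliberate error handling: the failure is reported, not swallowed.
        assertFalse(saver.save("report.txt", "contents"));
    }
}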
Test the Data Permutations
Data drives almost all software. When testing user interfaces and public APIs, boundary and validation conditions significantly impact the security and stability of your software. At more programmatic levels, various forms of data-controlled behaviors can comprise non-trivial portions of the functionality. Even in statically typed languages like Java and C#, higher levels of abstraction in your system design naturally decrease the effectiveness of code coverage as a guide for complete testing. Dynamic languages and features like reflection-based execution compound the challenge.
Boundary Conditions
One of the more common forms of data variations in software behavior arises from boundary conditions. Boundary conditions occur for a wide range of reasons. Your happy and alternate path tests verify the behavior within normal input values, but may not test all input values. Boundary condition tests verify how the software behaves
- At the edges of the normal inputs to detect problems like off-by-one errors
- At the edges of the abnormal inputs also to detect off-by-one errors
- Using anticipated variations of abnormal inputs for concerns like security
- Using specifically dysfunctional abnormal inputs, such as those that cause divide-by-zero errors, or inputs that trigger contextually determined limits such as numerical accuracy or representation ranges
You may have tested some boundary conditions when testing error paths. However, looking at the variations from the perspective of boundary conditions can highlight omissions in error-handling logic and drive more thorough test coverage.
Natural or pragmatic value and resource constraints provide a rich vein of boundary conditions. Natural limits occur when using values with a naturally finite set of states. True/false and yes/no are the most trivial of these. Menu picks that ask the user to choose from a limited number of options also provide contextually natural constraints. Pragmatic limits like field lengths yield a rich source of boundary conditions, especially when you manipulate or append to the input data internal to the software. At the resource-constrained or extreme end of the spectrum, you can test limits like memory and file size.
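For example, a 32-character field-length limit suggests tests at the boundary, just beyond it, and at the empty end. The UserNameValidator class and its limit below are hypothetical, and JUnit is assumed:

import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import org.junit.Test;

// Sketch of boundary condition tests against a hypothetical validator that
// accepts names of 1 to 32 characters.
public class UserNameBoundaryTest {
    private final UserNameValidator validator = new UserNameValidator();

    private String nameOfLength(int length) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < length; i++) {
            builder.append('a');
        }
        return builder.toString();
    }

    @Test
    public void acceptsNameAtTheMaximumLength() {
        assertTrue(validator.isValid(nameOfLength(32)));
    }

    @Test
    public void rejectsNameJustOverTheMaximumLength() {
        assertFalse(validator.isValid(nameOfLength(33)));
    }

    @Test
    public void rejectsTheEmptyName() {
        assertFalse(validator.isValid(""));
    }
}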
Numerical and mathematical variations can be thought of as natural or pragmatic but have a broad yet specialized enough affinity to deserve their own treatment and attention. Division-by-zero errors are perhaps the most common mathematical issues in programming, requiring attention regardless of representation format or size. Value limits due to discrete representations continue to factor into consideration, as the migration to wider representations is balanced by the inevitable increase in data volumes. Precision presents a more complicated set of conditions to test, as accuracy issues affect both the code being tested and the test code.
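The following sketch shows both concerns against a hypothetical Statistics class, assuming JUnit: an explicit check that the degenerate division case fails deliberately, and a floating-point assertion that uses a tolerance because the test code is subject to the same precision limits as the code under test.

import static org.junit.Assert.assertEquals;

import org.junit.Test;

// Sketch of numerical boundary tests. Statistics is hypothetical and is
// assumed to throw ArithmeticException rather than divide by zero when
// asked to average an empty sample.
public class StatisticsBoundaryTest {
    @Test(expected = ArithmeticException.class)
    public void averageOfNoSamplesFailsDeliberately() {
        new Statistics().average(new double[0]);
    }

    @Test
    public void averageToleratesFloatingPointRoundoff() {
        double average = new Statistics().average(new double[] {0.1, 0.2, 0.3});

        // Compare within a tolerance rather than for exact equality.
        assertEquals(0.2, average, 1e-9);
    }
}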
Standards- and convention-based formats yield structured and predictable, yet sometimes complex, patterns from which to derive boundary conditions, particularly as they evolve. For example, the syntactic rules of the Domain Name System (DNS) are relatively simple. However, you can find opportunities for startling variations even within this simplicity. Security concerns drive people to attempt to validate domains. Those who choose not to validate them through lookup, whether for good or bad reasons, must make assumptions about the rules of domain names that go beyond the syntactic conventions. I have seen code that assumes that all top-level domains (TLDs) must be two or three characters in length, as was true for most of the original set of TLDs. This ignores the originally allocated single-letter domains used for administrative purposes and does not automatically account for the longer TLDs that have been and will be added, such as .name and .info. Expansion of the DNS syntax to allow non-European character sets adds another wrinkle to validation.
More ad hoc or unstructured sources provide some of the most challenging inputs to predict. Any free-form text field has numerous considerations to validate. The simplest may involve restrictions on or stripping of white space or selection from a limited character set. The more complex can include evaluating inputs to detect SQL injection or cross-site scripting attacks and natural language processing for semantic content.
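A sketch of both ends of that spectrum might look like the following, assuming a hypothetical CommentSanitizer class and JUnit:

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertFalse;

import org.junit.Test;

// Sketch of free-form input tests. CommentSanitizer is hypothetical: it trims
// surrounding white space and rejects markup that could carry a cross-site
// scripting payload.
public class FreeFormCommentTest {
    private final CommentSanitizer sanitizer = new CommentSanitizer();

    @Test
    public void stripsSurroundingWhiteSpace() {
        assertEquals("hello", sanitizer.clean("  hello \t\n"));
    }

    @Test
    public void rejectsEmbeddedMarkup() {
        assertFalse(sanitizer.accepts("<script>alert('gotcha')</script>"));
    }
}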
Data-Driven Execution
Guiding tests by code coverage, particularly at the unit level, works well to test behavioral variations that derive from code structure. However, many constructs provide significant behavioral variations without explicit branches in the code. The so-called Fundamental Theorem of Software Engineering says, “We can solve any problem by introducing an extra level of indirection.”
A common data-driven scenario arises when processing command-line or some remote-invocation interfaces in which a dispatcher uses an Abstract Factory to generate Command pattern [DP] objects for execution, as shown in Listing 3-2. The function of the CommandFactory and each of the available Command implementations should be tested in their own right, but the CommandDispatcher integrates the behaviors to create a larger set of behaviors that cannot be identified through static analysis or evaluated for coverage.
Listing 3-2: A dispatcher using an Abstract Factory in a data-driven way to create Command pattern objects to do the work
class CommandDispatcher {
    private CommandFactory commandFactory;

    public void dispatch(String commandName) {
        Command command = commandFactory.createCommand(commandName);
        command.execute();
    }
}
When testing these constructs at the unit level, we should verify the correctness of the dispatch mechanism. Ideally, the definition of the dispatch targets is dynamic or separate in a manner conducive to independent testing. We should test each of the dispatch targets independently.
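One way to verify the dispatch mechanism from Listing 3-2 in isolation is to hand the dispatcher a stub factory that returns a recording command. The sketch below assumes that Command and CommandFactory are interfaces, that the factory can be injected through a constructor, and that JUnit is available; none of that is shown in the listing.

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

import org.junit.Test;

// Sketch of a unit test for the dispatch mechanism using hand-rolled test
// doubles rather than a mocking framework.
public class CommandDispatcherTest {
    private static class RecordingCommand implements Command {
        boolean executed;

        @Override
        public void execute() {
            executed = true;
        }
    }

    private static class StubFactory implements CommandFactory {
        String requestedName;
        private final Command commandToReturn;

        StubFactory(Command commandToReturn) {
            this.commandToReturn = commandToReturn;
        }

        @Override
        public Command createCommand(String commandName) {
            requestedName = commandName;
            return commandToReturn;
        }
    }

    @Test
    public void dispatchCreatesAndExecutesTheNamedCommand() {
        RecordingCommand command = new RecordingCommand();
        StubFactory factory = new StubFactory(command);
        CommandDispatcher dispatcher = new CommandDispatcher(factory);

        dispatcher.dispatch("save");

        assertEquals("save", factory.requestedName);
        assertTrue(command.executed);
    }
}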
For tests at a larger scope, like system or integration tests, we must test each of the dynamic variations to ensure thorough testing of the software. A dispatch mechanism that works generically at the unit level typically has a well-defined and finite set of possibilities when integrated into a component or system.
Run-Time and Dynamic Binding
Most languages that run in a virtual machine and/or are dynamically bound, such as scripting languages, have a feature called reflection. Reflection provides the ability to inspect the program’s namespace at runtime to discover or verify the existence of elements like classes, functions, methods, variables, attributes, return types, and parameters and, where applicable, invoke them.
The ability to access or invoke arbitrary symbols resembles a built-in form of data-driven execution based on data maintained by the runtime system but with a higher degree of capability and flexibility than most applications will create on their own. The power of reflection has led many teams to discourage or outright ban it from their applications to avoid some justifiably distasteful uses. In languages like Java (Listing 3-3) or Perl, this will not inhibit most applications excessively. Languages like Smalltalk and JavaScript (Listing 3-4) suffer without the use of these features. Even if your team avoids writing reflection-based code, many frameworks, like Java Spring and Quartz, use reflection extensively to enable configuration-based application assembly and dependency injection.
Listing 3-3: Basic dynamic invocation in Java using reflection, omitting error handling and exceptions
import java.lang.reflect.Method;

class Invoker {
    public static void invokeVoidMethodNoArgs(String className, String methodName)
            throws Exception {
        Class<?> clazz = Class.forName(className);
        Object object = clazz.newInstance();
        Method method = clazz.getMethod(methodName);
        method.invoke(object);
    }
}
Listing 3-4: Basic dynamic invocation in JavaScript
function invokeNoArgsNoReturn(object, func) {
  if (object[func] && typeof object[func] === "function") {
    object[func]();
  }
}
Even languages with less capable reflection, such as C and C++, can exhibit some of the dynamic-binding properties of reflection-capable languages through POSIX dynamic library APIs like dlopen(3), as shown in Listing 3-5. This API gives the application the ability to load a shared library dynamically and to invoke functions within it, all by specifying the library and function names as strings, under the constraint that the invocation signature is known.
Listing 3-5: Runtime binding with the POSIX dynamic library API in C without error handling
#include <dlfcn.h>

int main(int argc, char **argv) {
    void *lib;
    void (*func)(void);

    lib = dlopen(argv[1], RTLD_LAZY);            /* library name as a string */
    func = (void (*)(void)) dlsym(lib, argv[2]); /* function name as a string */
    (*func)();
    dlclose(lib);
    return 0;
}
Just as in data-driven execution, tests need to verify that the mechanism for the dynamic invocation works at the unit level and that the assembled pieces work together at the higher levels.
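At the unit level, the mechanism from Listing 3-3 might be verified with a sketch like the following, assuming JUnit. The Target class exists only to give the reflective lookup something to find.

import static org.junit.Assert.assertTrue;

import org.junit.Test;

// Sketch of a unit test for the reflective invocation mechanism. Invoking
// Target.ping() purely by name exercises the lookup, instantiation, and call.
public class InvokerTest {
    public static class Target {
        static boolean pinged;

        public void ping() {
            pinged = true;
        }
    }

    @Test
    public void invokesTheNamedMethodOnTheNamedClass() throws Exception {
        Target.pinged = false;

        Invoker.invokeVoidMethodNoArgs(Target.class.getName(), "ping");

        assertTrue(Target.pinged);
    }
}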
Test the Defects
No matter how much you test your code, there will be defects. If your team does its job well, you will find all of your significant defects before you get to production. Regardless of when and by whom the defect is found, writing a test that duplicates the defect and then passes after the fix helps you know that you have fixed the defect and ensures that the defect remains fixed over time.
I prefer to test each defect, at least at the unit level. Every defect, including those that can be broadly described as integration or interaction problems, traces to one or more defects in a unit of code. Perhaps the caller passes the wrong parameters or invokes functionality in the wrong order. Perhaps the callee does the wrong thing with the arguments or returns the wrong format or value. Maybe the synchronization is handled in a way that allows occasional race conditions. All of these and more can be duplicated and fixed at the unit level.
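For example, a reported defect in leap-year handling might be pinned down at the unit level with a sketch like this, where the DateRules class and the defect number are hypothetical and JUnit is assumed:

import static org.junit.Assert.assertTrue;

import org.junit.Test;

// Sketch of a defect-reproducing test: it fails until the fix is in place
// and then guards against the defect returning.
public class LeapYearDefectTest {
    @Test
    public void defect4711_year2000IsALeapYear() {
        assertTrue(DateRules.isLeapYear(2000));
    }
}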