Modularity in Software Systems
Although the term “module” is used extensively in software engineering, defining what a software module is turns out to be less straightforward than one might expect. The ambiguity stems from the term’s long history: as software engineering evolved, its original meaning was obscured, leading to diverse reinterpretations and the loss of a precise definition.
What makes a software module? Is it a library, a package, an object, a group of objects, or a service? Furthermore, what is a nonmodule software component, and how does it differ from a module?
Some argue that a module embodies a logical boundary, such as a namespace, a package, or an object, while a component signifies a physical boundary, encompassing artifacts such as services and redistributable libraries. However, the juxtaposition of logical and physical boundaries is not accurate. To understand why it’s not accurate, as well as what exactly a software module is, let’s go back in time and examine what was meant by “module” when the term was originally introduced to software design.
Software Modules
In his seminal paper “On the Criteria to Be Used in Decomposing Systems into Modules,” David L. Parnas (1971) succinctly defined a module as “a responsibility assignment” rather than just an arbitrary boundary around statements of a program.
Four years later, in their book Structured Design, Edward Yourdon and Larry L. Constantine (1975) described a module as “a lexically contiguous sequence of program statements, bounded by boundary elements, having an aggregate identifier.” Or, in simpler terms, a module is any collection of executable program statements meeting all of the following criteria (Myers 1979):
- The statements implement self-contained functionality.
- The functionality can be called from any other module.
- The implementation has the potential to be independently compiled.
The self-contained functionality criterion implies that a specific functionality is encapsulated within a module, rather than, for example, being spread across multiple modules. Next, the module makes this functionality accessible to other modules of the system through its public interface. Ultimately, the module’s implementation can potentially be independently compiled. Consequently, according to this definition, the type of a module’s boundary—physical or logical—is not essential. As long as it has the potential of being extracted into an independent unit that can be compiled, it is a module. What is more important than the type of the module’s boundary is the functionality it implements and provides to other modules.
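The three criteria can be illustrated with a minimal sketch. It assumes Python, where "independently compiled" is approximated by compiling a module's source on its own with the built-in `compile()`. The `tax` module, its `tax_for` function, and the 20% rate are hypothetical names and values chosen only for illustration.

```python
import types

# Hypothetical source of a tiny module with self-contained functionality
# (criterion 1): everything needed to compute the tax lives in one place.
TAX_SOURCE = '''
_RATE = 0.2  # implementation detail, not part of the public interface

def tax_for(amount):
    """Public interface: return the tax due for a given amount."""
    return round(amount * _RATE, 2)
'''

# Criterion 3: the implementation can be compiled on its own,
# without the source of any of its consumers.
code = compile(TAX_SOURCE, "tax.py", "exec")
tax = types.ModuleType("tax")
exec(code, tax.__dict__)


# Criterion 2: code outside the module calls the functionality
# through the module's public interface.
def invoice_total(net_amount):
    return net_amount + tax.tax_for(net_amount)
```

Whether the boundary is a file, a class, or a service does not matter for this definition. What matters is that the functionality is encapsulated, callable, and separable.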
This focus on well-defined functionality rather than the type of boundary makes modules ubiquitous across software design. (Micro)services, frameworks, libraries, namespaces, packages, objects, and classes can all be modules. Furthermore, because nowadays a class’s methods can be compiled independently, even individual methods or functions can be considered modules.
That means a service-based system can be modular if its services are designed as effective modules. A service of that system can be modular on its own if, for example, it consists of modular namespaces. Modular objects can form a modular namespace, and the same is true for methods or functions constituting objects. “It’s turtles all the way down,” as illustrated in Figure 4.2. Modules are not flat; modular design is hierarchical.
Figure 4.2 Hierarchical modular design
To reiterate, a module is a boundary encompassing a well-defined functionality, which it exposes for use by other parts of the system. Consequently, a module could represent nearly any type of logical or physical boundary within a software system, be it a service, a namespace, an object, or something else.
Throughout this book, I’ll use the term “module” to signify a boundary enclosing specific functionality. This functionality is exposed to external consumers and either is or has the potential to be independently compiled.
Function, Logic, and Context of Software Modules
We can use the three properties of a module—function, logic, and context—to describe all kinds of the aforementioned software modules.
Function
A software module’s function is the functionality it exposes to its consumers over its public interface. For example:
- A service’s functionality can be exposed through a REST API or asynchronously through publishing and subscribing to messages.
- An object’s function is expressed in its public methods and members.
- The function of a namespace, package, or distributed library consists of the functionality implemented by its members.
- If a distinct method or a function is treated as a module, its name and signature reflect its function.
Logic
A software module’s logic encompasses all the implementation and design decisions that are needed to implement its function. It includes its source code, as well as internal infrastructural components (e.g., databases, message buses) that are not needed for describing the module’s function.
Context
All types of software modules depend on various attributes of their execution environments and/or make assumptions regarding the context in which they operate. For example:
- At a very basic level, a certain runtime environment is needed to execute a module. Moreover, a specific version of the runtime environment may be required.
- A certain level of compute resources, such as CPU, memory, or network bandwidth, may be needed for the module to function properly.
- A module may assume that the calls are pre-authorized instead of performing authorization itself.
Going back to the definition of a module’s context, the main difference between function and context is that the assumptions and requirements tied to the context are not reflected in the module’s public interface—its function.
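To make the three properties concrete, here is a small sketch of a single module annotated with its function, logic, and context. The `ReportArchive` class, the choice of zlib compression, and the minimum Python version are all hypothetical, picked only to give each property something to point at.

```python
import sys
import zlib

# Context: an assumption about the execution environment. It is required
# for the module to work, yet it appears nowhere in the public interface.
# The specific version is a hypothetical requirement.
if sys.version_info < (3, 8):
    raise RuntimeError("ReportArchive assumes Python 3.8 or newer")


class ReportArchive:
    """Function: store and fetch reports by name (the public interface)."""

    def __init__(self) -> None:
        # Logic: keeping compressed blobs in a dictionary is an internal
        # decision that consumers do not need to know about.
        self._blobs = {}

    def store(self, name: str, text: str) -> None:
        # Context: callers are assumed to be pre-authorized, so no
        # authorization check is performed here.
        self._blobs[name] = zlib.compress(text.encode("utf-8"))

    def fetch(self, name: str) -> str:
        return zlib.decompress(self._blobs[name]).decode("utf-8")
```

A consumer sees only `store` and `fetch`. The compression and the storage structure belong to the logic, while the runtime version and the pre-authorization assumption belong to the context.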
Now that you have a solid understanding of what a software module is, let’s delve into the considerations for designing a modular system.
Effective Modules
As noted in the previous sections, an arbitrary decomposition of a system into components won’t make it modular. The hierarchical nature of modules doesn’t make it any easier. Failing to properly design modules at any level in the hierarchy can potentially undermine the whole effort.
Effective design of modules is not trivial, and failures to do so can be spotted all across the history of software engineering. For example, not so long ago, many believed that a microservices-based architecture is the easy solution for designing flexible, evolvable systems. However, without a proper principle guiding the decomposition of a system into microservices (modules), many teams ended up with distributed monoliths—solutions that were much less flexible than the original design. As they say, history tends to repeat itself, and almost exactly the same situation happened when modularity was introduced to software design:
When I came on the scene (in the late 1960s) software development managers had realized that building what they called monolithic systems wasn’t working. They wanted to divide the work to be done into parts (which they called modules) and each part or module would be assigned to a different team or team member. Their hope was that (a) when they put the parts together they would “fit” and the system would work and (b) when they had to make changes, the changes would be confined to a single module. Neither of those things happened. The reason was that they were doing a “bad job” of dividing the work into modules. Those modules had very complex interfaces, and changes almost always affected many modules. —David L. Parnas, personal correspondence to author (May 3, 2023)
Following that experience, Parnas (1971) proposed a principle intended to guide more effective decomposition of systems into modules: information hiding. According to the principle, an effective module is one that hides decisions. If a decision has to be revisited, the change should only affect one module, the one that “hides” it, thus minimizing cascading changes rippling across multiple components of the system.
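The following sketch shows one way a module can hide a decision. It is loosely inspired by the line-storage module of the KWIC index example in Parnas's decomposition paper, but the class names, method names, and representations below are assumptions made for illustration, not a reproduction of his design.

```python
class LineStorage:
    """Hides one decision: how lines are represented in memory."""

    def __init__(self):
        self._lines = []  # hidden decision: each line is a list of words

    def add_line(self, text):
        self._lines.append(text.split())
        return len(self._lines) - 1

    def word(self, line, index):
        return self._lines[line][index]

    def word_count(self, line):
        return len(self._lines[line])


class PackedLineStorage:
    """Same interface, but the decision was revisited: raw strings are kept."""

    def __init__(self):
        self._texts = []

    def add_line(self, text):
        self._texts.append(text)
        return len(self._texts) - 1

    def word(self, line, index):
        return self._texts[line].split()[index]

    def word_count(self, line):
        return len(self._texts[line].split())


def circular_shifts(storage, line):
    """Consumer: depends only on the interface, not on the representation."""
    count = storage.word_count(line)
    words = [storage.word(line, i) for i in range(count)]
    return [" ".join(words[i:] + words[:i]) for i in range(count)]
```

Revisiting the representation decision changes only the storage class. `circular_shifts` works with either one, so the change does not ripple outward.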
In Parnas’s later work (1985, 2003), he equated modules following the information-hiding principle to the concept of abstraction. Let’s see what an abstraction is, what makes an effective abstraction, and how to use this knowledge to craft module boundaries.
Modules as Abstractions
The goal of an abstraction is to represent multiple things equally well. For example, the word “car” is an abstraction. When thinking about a “car,” one does not need to consider a specific make, model, or color. It could be a Tesla Model 3, an SUV, a taxi, or even a Formula 1 race car; it could be red, blue, or silver. These specific details are not necessary to understand the basic concept of a car.
For an abstraction “to work,” it has to eliminate details that are relevant to concrete cases but are not shared by all. Instead, to represent multiple things equally well, it has to focus on aspects shared by all members of a group. Going back to the previous example, the word “car” simplifies our understanding by focusing on the common characteristics of all cars, such as their function of providing transportation and their typical structure, which often includes four wheels, an engine, and a steering wheel.
By focusing only on the details that are shared by a group of entities, an abstraction hides decisions that are likely to change. As a result, the more general an abstraction is, the more stable it is. Put differently, the fewer details an abstraction exposes, the less likely it is to change.
A well-designed module is an abstraction. Its public interface should focus on the functionality provided by the module, while hiding all the details that are not shared by all possible implementations of that functionality. Going back to the example of a repository object in Chapter 3, the interface described in Listing 4.1 focuses on the required functionality, while encapsulating the concrete implementation details.
Listing 4.1 A Module Interface That Focuses on the Functionality It Provides, While Encapsulating Its Implementation Details
interface CustomerRepository {
    Customer Load(CustomerId id);
    void Save(Customer customer);
    Collection<Customer> FindByName(Name name);
    Collection<Customer> FindByPhone(PhoneNumber phone);
}
A concrete implementation of the repository could use a relational database, a document store, or even a polyglot persistence–based implementation that leverages multiple databases. Moreover, this design allows the consumers of the repository to switch from one concrete implementation to another, without being affected by the change.
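As a rough illustration of swapping implementations, the sketch below renders the repository interface in Python and provides two interchangeable implementations. The simplified `Customer` type, the snake_case method names, the table layout, and the `change_phone` consumer are assumptions introduced here, not part of the original listing.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Protocol


@dataclass(frozen=True)
class Customer:
    id: int
    name: str
    phone: str


class CustomerRepository(Protocol):
    """Python rendering of the repository interface (names adapted)."""

    def load(self, customer_id: int) -> Customer: ...
    def save(self, customer: Customer) -> None: ...
    def find_by_name(self, name: str) -> List[Customer]: ...
    def find_by_phone(self, phone: str) -> List[Customer]: ...


class InMemoryCustomerRepository:
    def __init__(self) -> None:
        self._rows = {}

    def load(self, customer_id):
        return self._rows[customer_id]

    def save(self, customer):
        self._rows[customer.id] = customer

    def find_by_name(self, name):
        return [c for c in self._rows.values() if c.name == name]

    def find_by_phone(self, phone):
        return [c for c in self._rows.values() if c.phone == phone]


class SqliteCustomerRepository:
    def __init__(self) -> None:
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)"
        )

    def load(self, customer_id):
        row = self._db.execute(
            "SELECT id, name, phone FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        if row is None:
            raise KeyError(customer_id)
        return Customer(*row)

    def save(self, customer):
        self._db.execute(
            "INSERT OR REPLACE INTO customers (id, name, phone) VALUES (?, ?, ?)",
            (customer.id, customer.name, customer.phone),
        )

    def _find(self, column, value):
        # column is one of two fixed internal names, never user input
        rows = self._db.execute(
            f"SELECT id, name, phone FROM customers WHERE {column} = ? ORDER BY id",
            (value,),
        )
        return [Customer(*row) for row in rows]

    def find_by_name(self, name):
        return self._find("name", name)

    def find_by_phone(self, phone):
        return self._find("phone", phone)


def change_phone(repo: CustomerRepository, customer_id: int, phone: str) -> None:
    """Consumer: written against the interface only."""
    current = repo.load(customer_id)
    repo.save(Customer(current.id, current.name, phone))
```

`change_phone` never learns which storage technology sits behind the interface, so either repository can be passed in without modifying it.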
The notion of effortlessly switching from one database to another has a somewhat questionable reputation within the software engineering community. Such changes aren’t common. That said, there’s a more frequent and crucial need to switch the implementation behind a stable interface. When you alter a module’s implementation without changing its interface, such as when fixing a bug or changing its behavior, you’re essentially replacing its implementation. For example, the kinds of queries used in the FindByName() and FindByPhone() methods can be changed even when retaining the use of the same database. It could be that an index on the name and phone number fields is added to the database schema itself. Or it could be that the data is restructured to better optimize queries. Neither of these changes should impact the client’s use of the module interface.
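A small sketch can show this more common kind of replacement, where storage stays in the same place but lookups are reorganized around an index. The `ScanningDirectory` and `IndexedDirectory` classes and the `phones_for` consumer are hypothetical stand-ins for the repository, reduced to a single query so the example stays short.

```python
from collections import defaultdict


class ScanningDirectory:
    """First implementation: find_by_name scans every record."""

    def __init__(self):
        self._customers = []  # list of (name, phone) tuples

    def save(self, name, phone):
        self._customers.append((name, phone))

    def find_by_name(self, name):
        return [c for c in self._customers if c[0] == name]


class IndexedDirectory:
    """Replacement: same interface, lookups now go through an index by name."""

    def __init__(self):
        self._by_name = defaultdict(list)

    def save(self, name, phone):
        self._by_name[name].append((name, phone))

    def find_by_name(self, name):
        # .get avoids creating empty index entries for unknown names
        return list(self._by_name.get(name, []))


def phones_for(directory, name):
    """Consumer: unaffected by how find_by_name is implemented."""
    return sorted(phone for _, phone in directory.find_by_name(name))
```

The restructuring changes the cost of a lookup, not its contract, so `phones_for` gives the same answers before and after the replacement.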
That said, the possibility of switching an implementation is not the only goal of introducing an abstraction. As Edsger W. Dijkstra (1972) famously put it, “The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise.”
It may seem that using an abstraction introduces vagueness or lack of detail. However, as Dijkstra argues, that’s not the goal. Instead, an abstraction should create a new level of understanding—a “semantic level”—where one can be “absolutely precise.” Balance is needed to reach a proper level of abstraction to convey the correct semantics. Consider this: If you use an abstraction called “vehicle” to represent cars, it might be an overly broad generalization. Ask yourself: Are you actually modeling a range of vehicles, such as motorcycles and buses, necessitating such a wide-ranging abstraction? If the answer is no, then using “car” as your abstraction is more appropriate and precisely conveys the intended meaning.
By focusing on the essentials—functionality of modules—while ignoring extraneous information, abstractions allow us to reason about complex systems without getting lost in the details. A common example of a modular system is a personal computer. We can reason about the interactions of its modules—CPU, motherboard, random-access memory, hard drive, and others—all without understanding the intricate technicalities of each individual component. When troubleshooting a problem, we don’t need to comprehend how a CPU processes instructions or how a hard drive stores data at a microscopic level. Instead, we consider their roles within the larger system: a new semantic level provided by effective abstractions.
Finally, abstractions, like modules, are hierarchical. In software design, “levels of abstraction” are used to refer to different levels of detail when reasoning about systems. Higher levels of abstraction are closer to user-facing functionality, while lower levels are more about components related to low-level implementation details. Different levels of detail require different languages for discussing the functionalities implemented at each level. Those languages, or (as Dijkstra called them) semantic levels, are formed by designing abstractions.
Hierarchical abstractions also serve as further illustration of modularity’s hierarchical nature. Since abstractions adhere to the same design principles at all levels, modular design exhibits not only a hierarchical but also a fractal structure. In upcoming chapters, I will discuss in detail how the same rules govern modular structures at different scales. But for now, let’s revisit the topic of the previous chapters, complexity, and analyze its relationship with modularity.