Rule 8—Design to Split Different Things (Y Axis)
When you put aside the religious debate around service-oriented (SOA) and resource-oriented (ROA) architectures and look deep into their underlying premises, they have at least one thing in common. Both concepts force architects and engineers to think in terms of separation of responsibilities within their architectures. At a high and simple level, they do this through the concepts of verbs (services) and nouns (resources). Rule 8, our second axis of scale, takes the same approach. Put simply, Rule 8 is about scaling through the separation of distinct and different functions and data within a site. The simple approach to Rule 8 tells us to split up our product by nouns, by verbs, or by a combination of both.
Let’s split up our site using the verb approach first. If our site is a relatively simple e-commerce site, we might break it into the necessary verbs of signup, login, search, browse, view, add to cart, and purchase/buy. The data necessary to perform any one of these transactions can vary significantly from the data necessary for the other transactions. For instance, while it might be argued that signup and login need the same data, each also requires some data that is unique and distinct. Signup probably needs to check whether a user’s preferred ID has already been chosen by someone else, whereas login might not need a complete understanding of every other user’s ID. Signup likely needs to write a fair amount of data to some permanent data store, but login is likely a read-intensive application that validates a user’s credentials. Signup may require that the user store a fair amount of personally identifiable information (PII), including credit card numbers, whereas login is unlikely to need access to all of this information at the time a user logs in.
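The verb split described above can be sketched as a small routing table. This is a minimal illustration, not a prescribed implementation: the pool names, host names, data store names, and the convention that the verb is the first URL path segment are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServicePool:
    """One Y axis split: a verb with its own servers and its own data store."""
    name: str
    hosts: tuple
    datastore: str


# Hypothetical pools. Each verb scales independently of the others.
POOLS = {
    "signup": ServicePool("signup", ("signup-1", "signup-2"), "accounts_write_db"),
    "login": ServicePool("login", ("login-1", "login-2", "login-3"), "credentials_read_db"),
    "search": ServicePool("search", ("search-1", "search-2"), "catalog_index"),
    "purchase": ServicePool("purchase", ("purchase-1",), "orders_db"),
}


def route(path: str) -> ServicePool:
    """Pick the owning pool from the first path segment (assumed to be the verb)."""
    verb = path.strip("/").split("/", 1)[0]
    try:
        return POOLS[verb]
    except KeyError:
        raise ValueError(f"no service pool owns verb {verb!r}") from None
```

With this table, `route("/login")` returns the read-heavy login pool and `route("/search/red+shoes")` returns the search pool, so a write-heavy signup store and a read-heavy credentials store can be sized separately.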
The differences and resulting opportunities for this method of scale become even more apparent when we analyze obviously distinct functions like search and login. In the case of login we are mostly concerned with validating the user’s credentials and potentially establishing some notion of session (we’ve chosen the word session rather than state for a reason we explore in Rule 40 in Chapter 10, “Avoid or Distribute State”). Login is concerned with the user and as a result needs to cache and interact with data about that user. Search, on the other hand, is concerned with the hunt for an item and is most concerned with user intent (expressed through a search string, query, or search terms typically typed into a search box) and the items that we have in stock within our catalog. Separating these sets of data allows us to cache more of each within the memory available on our systems and process transactions faster as a result of higher cache hit ratios. Separating this data within our back-end persistence systems (such as a database) allows us to dedicate more “in memory” space to each data set within those systems and respond faster to the clients (application servers) making requests. Both systems respond faster as a result of better utilization of system resources. Clearly we can now scale these systems more easily and with fewer memory constraints. Moreover, the Y axis adds transaction scalability by spreading transactions across systems, much as the X axis of scale does in Rule 7.
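The cache hit ratio argument can be illustrated with a toy simulation. The sketch below is deliberately contrived: it assumes an LRU cache, 100 users and 100 items, a cache that holds 100 entries, and a strictly cyclic access pattern, which is the worst case for LRU. Real workloads are skewed and will show a smaller gap; all names and numbers here are assumptions for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size least-recently-used cache that counts hits and lookups."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.lookups = 0

    def get(self, key):
        self.lookups += 1
        if key in self.entries:
            self.entries.move_to_end(key)      # mark as most recently used
            self.hits += 1
        else:
            self.entries[key] = True           # simulate loading from the store
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict least recently used

    def hit_ratio(self):
        return self.hits / self.lookups if self.lookups else 0.0


def workload(rounds=20, users=100, items=100):
    """Interleave login lookups (user keys) with search lookups (item keys)."""
    for _ in range(rounds):
        for i in range(max(users, items)):
            if i < users:
                yield ("user", i)
            if i < items:
                yield ("item", i)


def run(cache, keys):
    for key in keys:
        cache.get(key)
    return cache.hit_ratio()


requests = list(workload())

# Unsplit: one server's cache sees both kinds of data (200 keys, room for 100).
mixed_ratio = run(LRUCache(100), requests)

# Y axis split: each service's cache sees only its own 100 keys.
login_ratio = run(LRUCache(100), [k for k in requests if k[0] == "user"])
search_ratio = run(LRUCache(100), [k for k in requests if k[0] == "item"])

print(mixed_ratio, login_ratio, search_ratio)  # 0.0 0.95 0.95
```

In this setup the mixed cache never hits, because each key is evicted before it is requested again, while each dedicated cache misses only during its first warm-up pass.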
Hold on! What if we want to merge information about the user and our products, such as in the case of recommending products? Note that we have just added another verb: recommend. This gives us another opportunity to perform a split of our data and our transactions. We might add a recommendation service that asynchronously compares a user’s past purchase behavior with that of users who have similar purchase behaviors. This in turn may populate data in either the login function or the search function for display to the user when he or she interacts with the system. Or it can be a separate synchronous call made from the user’s browser, with the result displayed in an area dedicated to the recommend call.
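As a rough sketch of the asynchronous variant, the function below stands in for a batch job that runs outside the request path and precomputes suggestions that the login or search function could later read. The scoring scheme (weighting each candidate item by how many purchases the two users share) and the sample data are assumptions chosen for brevity, not a method taken from the text.

```python
from collections import Counter


def build_recommendations(purchases, top_n=3):
    """Precompute suggestions from purchase history.

    purchases maps a user ID to the set of item IDs that user has bought.
    Returns a dict mapping each user ID to a ranked list of item IDs.
    """
    recommendations = {}
    for user, bought in purchases.items():
        scores = Counter()
        for other, other_bought in purchases.items():
            if other == user:
                continue
            overlap = len(bought & other_bought)
            if overlap == 0:
                continue  # no shared purchases, so not a similar user
            for item in other_bought - bought:
                scores[item] += overlap
        # Highest score first; break ties alphabetically for stable output.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        recommendations[user] = [item for item, _ in ranked[:top_n]]
    return recommendations


# Hypothetical purchase history.
history = {
    "ann": {"tent", "stove"},
    "bob": {"tent", "stove", "lantern"},
    "cy": {"tent", "boots"},
    "di": {"novel"},
}
precomputed = build_recommendations(history)
print(precomputed["ann"])  # ['lantern', 'boots']
```

Because the job owns its own data and runs on its own schedule, it can be scaled, or even taken offline, without affecting login or search.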
Now how about using nouns to split items? Again, using our e-commerce example, we might identify certain resources upon which we will ultimately take actions (rather than the verbs that represent the actions we take). We may decide that our e-commerce site is made up of a product catalog, product inventory, user account information, marketing information, and so on. Using our noun approach, we may decide to split up our data into these categories and then define a set of high-level primitives, such as create, read, update, and delete, that act on each of these resources.
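A noun split might look like the following sketch, in which every resource gets its own store exposing the same four primitives. The in-memory dictionary is a stand-in for whatever database each resource would really use, and the class name, resource names, and fields are hypothetical.

```python
class ResourceStore:
    """One noun (resource) with its own data and CRUD primitives."""

    def __init__(self, name):
        self.name = name
        self._rows = {}      # stand-in for this resource's own database
        self._next_id = 1

    def create(self, data):
        resource_id = self._next_id
        self._next_id += 1
        self._rows[resource_id] = dict(data)
        return resource_id

    def read(self, resource_id):
        return dict(self._rows[resource_id])  # copy, so callers cannot mutate

    def update(self, resource_id, changes):
        self._rows[resource_id].update(changes)

    def delete(self, resource_id):
        del self._rows[resource_id]


# Each noun is a separate store that can live on separate hardware.
NOUNS = ("catalog", "inventory", "accounts", "marketing")
stores = {noun: ResourceStore(noun) for noun in NOUNS}

sku = stores["catalog"].create({"title": "Two-person tent"})
stock_id = stores["inventory"].create({"sku": sku, "on_hand": 12})
stores["inventory"].update(stock_id, {"on_hand": 11})
```

Note that inventory refers to the catalog entry only by its ID; the two resources share no tables, which is what allows them to be scaled apart.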
While Y axis splits are most useful in scaling data sets, they are also useful in scaling code bases. Because services or resources are now split, the actions we perform and the code necessary to perform them are split up as well. This means that very large engineering teams developing complex systems can become experts in subsets of those systems and don’t need to worry about or become experts on every other part of the system. Teams that own each service can build the interface (such as an API) into their service and own it. Assuming that each team “owns” its own code base, we can cut down on the communication overhead associated with Brooks’ Law. One tenet of Brooks’ Law is that developer productivity is reduced as a result of increasing team sizes.3 The communication effort needed to coordinate a team grows roughly with the square of the number of participants, because every pair of team members is a potential communication path. Therefore, with increasing team size comes decreasing developer productivity as more developer time is spent on coordination. By segmenting teams and enabling ownership, such overhead is decreased. And of course because we have split up our services, we can also scale transactions fairly easily.
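The arithmetic behind the team-size argument is easy to check. The sketch below uses the common pairwise formulation of communication paths, n(n - 1)/2, and compares one large team with the same people split into service-owning teams. The team sizes are hypothetical, and treating each team's interface as a single point of contact with other teams is a simplifying assumption.

```python
def channels(people: int) -> int:
    """Pairwise communication paths among a group: n * (n - 1) / 2."""
    return people * (people - 1) // 2


def split_overhead(team_count: int, team_size: int) -> int:
    """Paths inside each team, plus one path per pair of team interfaces."""
    within = team_count * channels(team_size)
    between = channels(team_count)
    return within + between


single_team = channels(30)            # 30 engineers in one team
split_teams = split_overhead(5, 6)    # the same 30 engineers in 5 teams of 6

print(single_team, split_teams)  # 435 85
```

Under these assumptions, splitting 30 engineers into five teams cuts the number of potential communication paths from 435 to 85.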