Rule: Reference Other Aggregates by Identity
When designing Aggregates, we may desire a compositional structure that allows for traversal through deep object graphs, but that is not the motivation of the pattern. [Evans] states that one Aggregate may hold references to the Root of other Aggregates. However, we must keep in mind that this does not place the referenced Aggregate inside the consistency boundary of the one referencing it. The reference does not cause the formation of just one whole Aggregate. There are still two (or more), as shown in Figure 10.5.
Figure 10.5. There are two Aggregates, not one.
In Java the association would be modeled like this:
public class BacklogItem extends ConcurrencySafeEntity {
    ...
    private Product product;
    ...
}
That is, the BacklogItem holds a direct object association to Product.
In combination with what’s already been discussed and what’s next, this has a few implications:
- The referencing Aggregate (BacklogItem) and the referenced Aggregate (Product) must not both be modified in the same transaction. Only one or the other may be modified in a single transaction.
- If you are modifying multiple instances in a single transaction, it may be a strong indication that your consistency boundaries are wrong. If so, it is possibly a missed modeling opportunity; a concept of your Ubiquitous Language has not yet been discovered although it is waving its hands and shouting at you (see earlier in this chapter).
- If you are attempting to apply the second point, and doing so influences a large-cluster Aggregate with all the previously stated caveats, it may be an indication that you need to use eventual consistency (see later in this chapter) instead of atomic consistency.
If you don’t hold any reference, you can’t modify another Aggregate. So the temptation to modify multiple Aggregates in the same transaction could be squelched by avoiding the situation in the first place. But that is overly limiting since domain models always require some associative connections. What might we do to facilitate necessary associations, protect from transaction misuse or inordinate failure, and allow the model to perform and scale?
Making Aggregates Work Together through Identity References
Prefer references to external Aggregates only by their globally unique identity, not by holding a direct object reference (or “pointer”). This is exemplified in Figure 10.6.
Figure 10.6. The BacklogItem Aggregate, inferring associations outside its boundary with identities
We would refactor the source to:
public class BacklogItem extends ConcurrencySafeEntity {
    ...
    private ProductId productId;
    ...
}
Aggregates with inferred object references are thus automatically smaller because references are never eagerly loaded. The model can perform better because instances require less time to load and take less memory. Using less memory has positive implications for both memory allocation overhead and garbage collection.
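To make the identity reference concrete, the following is a minimal sketch of what a ProductId value object and its holder might look like. The record shape, the blank check, and the summary field are assumptions for illustration (records need Java 16 or later); the ConcurrencySafeEntity base class is left out so the sketch stands alone.

```java
import java.util.Objects;

// Hypothetical value object wrapping the globally unique identity of a Product.
// Two ProductId instances with the same value compare as equal.
record ProductId(String id) {
    ProductId {
        Objects.requireNonNull(id, "id");
        if (id.isBlank()) {
            throw new IllegalArgumentException("Product identity must not be blank");
        }
    }
}

// Simplified BacklogItem: it knows its Product only by identity, so loading
// a BacklogItem never pulls the Product Aggregate into memory.
class BacklogItem {
    private final ProductId productId;
    private String summary;

    BacklogItem(ProductId productId, String summary) {
        this.productId = Objects.requireNonNull(productId, "productId");
        this.summary = Objects.requireNonNull(summary, "summary");
    }

    ProductId productId() {
        return productId;
    }

    String summary() {
        return summary;
    }

    // Behavior changes only this Aggregate's own state.
    void changeSummary(String newSummary) {
        this.summary = Objects.requireNonNull(newSummary, "newSummary");
    }
}
```

Because BacklogItem holds no Product instance, there is no way for its behavior to reach into Product and modify it in the same transaction.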
Model Navigation
Reference by identity doesn’t completely prevent navigation through the model. Some will use a Repository (12) from inside an Aggregate for lookup. This technique is called Disconnected Domain Model, and it’s actually a form of lazy loading. There’s a different recommended approach, however: Use a Repository or Domain Service (7) to look up dependent objects ahead of invoking the Aggregate behavior. A client Application Service may control this, then dispatch to the Aggregate:
public class ProductBacklogItemService ... {
    ...
    @Transactional
    public void assignTeamMemberToTask(
            String aTenantId,
            String aBacklogItemId,
            String aTaskId,
            String aTeamMemberId) {

        BacklogItem backlogItem =
            backlogItemRepository.backlogItemOfId(
                new TenantId(aTenantId),
                new BacklogItemId(aBacklogItemId));

        Team ofTeam =
            teamRepository.teamOfId(
                backlogItem.tenantId(),
                backlogItem.teamId());

        backlogItem.assignTeamMemberToTask(
            new TeamMemberId(aTeamMemberId),
            ofTeam,
            new TaskId(aTaskId));
    }
    ...
}
Having an Application Service resolve dependencies frees the Aggregate from relying on either a Repository or a Domain Service. However, for very complex and domain-specific dependency resolutions, passing a Domain Service into an Aggregate command method can be the best way to go. The Aggregate can then double-dispatch to the Domain Service to resolve references. Again, in whatever way one Aggregate gains access to others, referencing multiple Aggregates in one request does not give license to modify two or more of them.
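The double-dispatch alternative might be sketched as follows. The TeamMembershipService interface, its isMemberOfTeam method, and the volunteers map are hypothetical names introduced for this illustration. The command method signature also differs from the Application Service example, since here the Aggregate receives the service rather than an already resolved Team.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Identity value objects (simplified to records for this sketch).
record TenantId(String id) {}
record TeamId(String id) {}
record TeamMemberId(String id) {}
record TaskId(String id) {}

// Hypothetical Domain Service: answers a domain question that requires
// looking at another Aggregate (Team) on behalf of the caller.
interface TeamMembershipService {
    boolean isMemberOfTeam(TenantId tenantId, TeamId teamId, TeamMemberId memberId);
}

class BacklogItem {
    private final TenantId tenantId;
    private final TeamId teamId;
    private final Map<TaskId, TeamMemberId> volunteers = new HashMap<>();

    BacklogItem(TenantId tenantId, TeamId teamId) {
        this.tenantId = tenantId;
        this.teamId = teamId;
    }

    // Command method: the Aggregate passes its own identities to the service
    // (double-dispatch) and then modifies only its own state.
    void assignTeamMemberToTask(
            TeamMemberId memberId,
            TaskId taskId,
            TeamMembershipService membership) {
        if (!membership.isMemberOfTeam(tenantId, teamId, memberId)) {
            throw new IllegalStateException("Team member does not belong to the team");
        }
        volunteers.put(taskId, memberId);
    }

    Optional<TeamMemberId> volunteerOf(TaskId taskId) {
        return Optional.ofNullable(volunteers.get(taskId));
    }
}
```

The service only reads the Team on the Aggregate's behalf; the one Aggregate modified in the request is still BacklogItem.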
Cowboy Logic
LB: “I’ve got two points of reference when I’m navigating at night. If it smells like beef on the hoof, I’m heading to the herd. If it smells like beef on the grill, I’m heading home.”
Limiting a model to using only reference by identity could make it more difficult to serve clients that assemble and render User Interface (14) views. You may have to use multiple Repositories in a single use case to populate views. If query overhead causes performance issues, it may be worth considering the use of theta joins or CQRS. Hibernate, for example, supports theta joins as a means to assemble a number of referentially associated Aggregate instances in a single join query, which can provide the necessary viewable parts. If CQRS and theta joins are not an option, you may need to strike a balance between inferred and direct object reference.
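As a rough illustration of populating a view from more than one Repository, the sketch below follows the identity held by one Aggregate to look up the other. The BacklogItemView record, the BacklogItemViewAssembler class, and the productOfId method are hypothetical, and the Aggregates are reduced to read-only records purely to keep the example short.

```java
import java.util.Optional;

record ProductId(String id) {}
record BacklogItemId(String id) {}

// Aggregates reduced to the data the view needs (an assumption of this sketch).
record Product(ProductId productId, String name) {}
record BacklogItem(BacklogItemId backlogItemId, ProductId productId, String summary) {}

interface BacklogItemRepository {
    Optional<BacklogItem> backlogItemOfId(BacklogItemId id);
}

interface ProductRepository {
    Optional<Product> productOfId(ProductId id);
}

// Read-only data handed to the user interface.
record BacklogItemView(String summary, String productName) {}

class BacklogItemViewAssembler {
    private final BacklogItemRepository backlogItems;
    private final ProductRepository products;

    BacklogItemViewAssembler(BacklogItemRepository backlogItems, ProductRepository products) {
        this.backlogItems = backlogItems;
        this.products = products;
    }

    // Two lookups: first the BacklogItem, then the Product named by its identity.
    // The result is empty if either Aggregate cannot be found.
    Optional<BacklogItemView> viewOf(BacklogItemId id) {
        return backlogItems.backlogItemOfId(id)
            .flatMap(item -> products.productOfId(item.productId())
                .map(product -> new BacklogItemView(item.summary(), product.name())));
    }
}
```

Each view costs one query per Repository here, which is the overhead that a single join query or a CQRS read model would be brought in to avoid.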
If all this advice seems to lead to a less convenient model, consider the additional benefits it affords. Making Aggregates smaller leads to better-performing models, plus we can add scalability and distribution.
Scalability and Distribution
Since Aggregates don’t use direct references to other Aggregates but reference by identity, their persistent state can be moved around to reach large scale. Almost-infinite scalability is achieved by allowing for continuous repartitioning of Aggregate data storage, as explained by Amazon.com’s Pat Helland in his position paper “Life beyond Distributed Transactions: An Apostate’s Opinion” [Helland]. What we call Aggregate, he calls entity. But what he describes is still an Aggregate by any other name: a unit of composition that has transactional consistency. Some NoSQL persistence mechanisms support the Amazon-inspired distributed storage. These provide much of what [Helland] refers to as the lower, scale-aware layer. When employing a distributed store, or even when using a SQL database with similar motivations, reference by identity plays an important role.
Distribution extends beyond storage. Since there are always multiple Bounded Contexts at play in a given Core Domain initiative, reference by identity allows distributed domain models to have associations from afar. When an Event-Driven approach is in use, message-based Domain Events (8) containing Aggregate identities are sent around the enterprise. Message subscribers in foreign Bounded Contexts use the identities to carry out operations in their own domain models. Reference by identity forms remote associations or partners. Distributed operations are managed by what [Helland] calls two-party activities, but in Publish-Subscribe [Buschmann et al.] or Observer [Gamma et al.] terms it’s multiparty (two or more). Transactions across distributed systems are not atomic. The various systems bring multiple Aggregates into a consistent state eventually.
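The following sketch illustrates the idea of an Event carrying only identities to a subscriber in a foreign Bounded Context. The event name BacklogItemCommitted, the EventBus class, and SprintCapacityTracker are assumptions made for this example. The bus is a toy in-process stand-in that delivers synchronously, whereas real messaging middleware would deliver asynchronously, which is where the eventual consistency comes from.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical Domain Event: it carries identities only, never object references.
record BacklogItemCommitted(String tenantId, String backlogItemId, String sprintId) {}

// Toy in-process publisher standing in for messaging middleware.
class EventBus {
    private final List<Consumer<BacklogItemCommitted>> subscribers = new ArrayList<>();

    void subscribe(Consumer<BacklogItemCommitted> subscriber) {
        subscribers.add(subscriber);
    }

    void publish(BacklogItemCommitted event) {
        subscribers.forEach(subscriber -> subscriber.accept(event));
    }
}

// Subscriber in a foreign Bounded Context. It keeps its own model, keyed by
// the sprint identity received in the event, and updates it in its own transaction.
class SprintCapacityTracker {
    private final Map<String, Integer> committedItemsBySprint = new HashMap<>();

    void when(BacklogItemCommitted event) {
        committedItemsBySprint.merge(event.sprintId(), 1, Integer::sum);
    }

    int committedItemCount(String sprintId) {
        return committedItemsBySprint.getOrDefault(sprintId, 0);
    }
}
```

A usage example: subscribe with `bus.subscribe(tracker::when)`, then publish an event from the originating context. The subscriber never sees a BacklogItem or Sprint object, only the identities it needs to act on its own model.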