Aggregates: one piece of code design
Most tips for code design exist to keep a lid on the exploding complexity of our systems.
I won’t explain all the details of an Aggregate here, but I will say this…
An Aggregate is:
- a concept from Domain Driven Design (DDD).
- a cluster of associated objects that we treat as a unit for the purposes of data changes.
- a mechanism for simplifying the inter-object relationships and complexity.
- a container that enforces the invariants on the data it holds.
- globally identified only by its root.
- opaque, all actions must go through the root of the Aggregate.
Aggregates are hierarchical
Aggregates are an important mechanism to tame the complexity inside a system by simplifying and limiting the traversal and number of relationships between objects.
Aggregates are inherently hierarchical. There’s a ROOT, this root is the global identity of the Aggregate. The Aggregate can hold relationships to other objects. The other objects may be simple value-objects or may be other entities, or other Aggregates.
If other objects want to hold a reference to the Aggregate, that reference must be to the Aggregate’s root.
Aggregates have a root, and hold other Aggregates. You can only access an Aggregate via its root. This makes Aggregates hierarchical.
Our design patterns and tools are hierarchical
Many of the tools and design patterns that we use are hierarchical.
A good example of this is a REST API. A typical SaaS REST API could be design like this:
/workspace/{workspace_id}/members/{member_id}/documents/{document_id}/comments
Our REST API expresses the hierarchical design of our data.
workspaces
containmembers
members
havedocuments
documents
havecomments
This API is for comments
, but it leverages the hierarchy of our data to simplify the actions and operations.
Each of the entities in the API are likely Aggregates, and each holds other Aggregates. Our ROOT Aggregate is workspace
.
We know that if our workspace
is deleted then we should delete all the entities that are contained in that workspace
.
We could flatten our API into:
/comments?document_id={document_id}
This might seem more convenient for the comments API, but by doing so we loose the hierarchy of the data, and the context within which the comment exists.
With this design for our comments API, we’d be tempted to follow a similar design for workspaces
, members
and documents
.
/workspaces/{id}
/members?workspace_id={workspace_id}
/documents?member_id={member_id}
We’re pretending that all our entities (workspaces
, members
, documents
) are of similar value/importance.
We’re pretending they should all live at the root of our hierarchy.
As we start to add requirements, we find that every entity needs to hold references to most other entities above it in the hierarchy.
action | requirement |
---|---|
List documents for workspace | the document must know the parent workspace |
List members for workspace | the member must know the parent workspace |
List documents for member | the document must know the parent member |
List comments for document | the comment must know the parent document |
List comments for member | the comment must know the parent member |
So.. instead of a nice hierarchical data structure of workspace
-> members
-> documents
-> comments
we have:
comments (workspace, member, document)
documents (workspace, members)
workspaces
By flattening our API, we’ve fully inverted the data hierarchy, making all the lower entities know about the upper ones. I wrote about this exact problem in the post: Data modelling in SaaS apps.
Now, when we revisit our example of “delete the workspace”, we find that enforcing the data invariants is much harder. We have to visit every individual entity and delete it given a workspace. And what happens if the workspace is deleted? Who or how are the members, documents, and comments deleted?
By flattening our API we’re also tempted into splitting these entities across different ‘micro’ services. This splitting of data into entity-CRUD services makes enforcing the data invariants, and propagating the operations like delete, much harder.
Designing good Aggregates is high leverage
Leverage is getting a large effect with a small amount of effort. If you’re going to spend a small amount of time/effort; please spend your time designing Aggregates.
Many features, issues, problems you’ll encounter later become easier when you have good Aggregate boundaries, and hierarchical data design.
We’ll find that to use our tools and design patterns well – like REST APIs – we have to design and enforce good Aggregates and boundaries. Successful design compounds to make our jobs easier, and bad design compounds to make our jobs harder. The ‘root’ of good design lies in Aggregates.