Event Sourcing and CQRS

Most systems need to persist data. Let us assume that the business logic of the system is represented and executed by the Domain Model. Application persistence then focuses on persisting domain objects, such as aggregates and entities. Domain Model-oriented persistence is optimized for transactional performance, because the Domain Model executes operations that trigger state transitions according to the logic and rules of the model.

Does this approach create challenges when retrieving data for reporting? Think of reporting as more than just creating a PDF or CSV file. Every API call that runs a database query to fetch data, whether to show it to the user or to return it to a caller, is a form of reporting.

Reporting needs differ from the needs of executing transactional operations. Systems also tend to have a clear imbalance between the number of writes and reads. For example, a user-facing application typically has far fewer writes than reads, while bank back-office operations or vehicle tracking systems have the opposite imbalance, with more writes than reads.

CQRS builds on the observation that the requirements for executing commands (operations that trigger state transitions) and queries (any data retrieval that goes beyond what command execution needs) are very different. This difference encourages developers to use different persistence approaches for commands and queries by segregating them.

When a command is successfully executed, the system transitions to a new state. The command flow in CQRS goes hand in hand with the Task-Based User Interface pattern. A task-based UI makes each operation explicit, like Check Out, Add Item or Cancel Order, so user operations translate easily into commands that the UI sends via the API to the Domain Model.
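A minimal sketch of how explicit UI tasks could map to commands. The command classes, the `Order` aggregate and the `handle` function are hypothetical names chosen for illustration, and the aggregates live in a plain dictionary rather than a real store.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AddItem:
    """Command: the user's intent to add an item to an order."""
    order_id: str
    sku: str
    quantity: int


@dataclass(frozen=True)
class CancelOrder:
    """Command: the user's intent to cancel an order."""
    order_id: str


@dataclass
class Order:
    """Hypothetical aggregate that enforces the business rules."""
    order_id: str
    items: dict = field(default_factory=dict)
    cancelled: bool = False

    def add_item(self, sku: str, quantity: int) -> None:
        if self.cancelled:
            raise ValueError("cannot add items to a cancelled order")
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self.items[sku] = self.items.get(sku, 0) + quantity

    def cancel(self) -> None:
        self.cancelled = True


def handle(command, orders: dict) -> None:
    """Route a command to its aggregate; the state transition happens there."""
    order = orders[command.order_id]
    if isinstance(command, AddItem):
        order.add_item(command.sku, command.quantity)
    elif isinstance(command, CancelOrder):
        order.cancel()
    else:
        raise TypeError(f"unknown command: {command!r}")


# Assumed in-memory stand-in for aggregate persistence.
orders = {"o-1": Order("o-1")}
handle(AddItem("o-1", "book", 2), orders)
handle(CancelOrder("o-1"), orders)
```

Each command names one user intention, so the API layer only has to deserialize the request into a command and pass it on; the business rules stay inside the aggregate.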

When a query is successfully executed, the system reads data from the persistent store and returns it to be displayed to the user, sent to another system or used for any other purpose. Queries do not involve the Domain Model, because queries do not execute any operations and should not contain any business logic. Queries have no side effects and are idempotent: no matter how many times a query is executed, it returns the same result unless the system state has changed in the meantime.

Developers don’t need to access data for queries in the same way they persist domain objects. For example, if the Domain Model persistence uses an ORM framework that distributes the state of domain objects across tables in a relational database, a query can be just an SQL statement that bypasses the ORM and reads the data directly from those tables.
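As an illustration, the sketch below reads an order summary with plain SQL. The `orders` and `order_lines` tables and their columns are assumptions standing in for whatever schema an ORM would produce; SQLite is used only so that the example is self-contained.

```python
import sqlite3

# Assumed schema: the kind of tables an ORM might spread an Order aggregate over.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id TEXT PRIMARY KEY, customer_id TEXT, status TEXT);
    CREATE TABLE order_lines (order_id TEXT, sku TEXT, quantity INTEGER, price REAL);
    INSERT INTO orders VALUES ('o-1', 'c-1', 'paid'), ('o-2', 'c-1', 'open');
    INSERT INTO order_lines VALUES ('o-1', 'book', 2, 10.0),
                                   ('o-1', 'pen', 1, 2.5),
                                   ('o-2', 'lamp', 1, 30.0);
""")


def order_summaries(conn, customer_id):
    """Query side: no domain objects are loaded, only the data the screen needs."""
    rows = conn.execute(
        """
        SELECT o.id, o.status, SUM(l.quantity * l.price) AS total
        FROM orders o
        JOIN order_lines l ON l.order_id = o.id
        WHERE o.customer_id = ?
        GROUP BY o.id, o.status
        ORDER BY o.id
        """,
        (customer_id,),
    ).fetchall()
    return [{"order_id": r[0], "status": r[1], "total": r[2]} for r in rows]


summaries = order_summaries(conn, "c-1")
```

The function returns a pre-composed result shaped for one screen and never touches the Domain Model.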

CQRS suggests that each query targets a specific use case and returns a pre-composed data set that is displayed in its entirety on the screen or on a part of the screen. While bypassing the Domain Model for queries works with state-based persistence, this approach becomes hard or impossible in event-sourced systems, as there is no place where the domain object’s state is stored in its entirety.

Domain entities in event-sourced systems are stored as event streams: from the persistence point of view, each entity is a sequence of events. Domain events in an event store alone do not allow reconstructing the entity state. That requires the logic the entity uses to recreate its own state from events, and this logic lives in the code of the Domain Model.
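The following sketch shows why the events alone are not enough: the current state only appears once the stream is folded through a function that belongs to the Domain Model. The event types, the `OrderState` shape and the `when` function are hypothetical.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class OrderState:
    status: str = "new"
    items: tuple = ()


def when(state: OrderState, event: dict) -> OrderState:
    """Domain logic: how a single event changes the entity state."""
    kind, data = event["type"], event["data"]
    if kind == "OrderCreated":
        return OrderState(status="open")
    if kind == "ItemAdded":
        return OrderState(state.status, state.items + (data["sku"],))
    if kind == "OrderCancelled":
        return OrderState("cancelled", state.items)
    return state  # events this entity does not know are ignored


def load(stream) -> OrderState:
    """Rebuild the current state by folding the whole stream, oldest event first."""
    return reduce(when, stream, OrderState())


stream = [
    {"type": "OrderCreated", "data": {}},
    {"type": "ItemAdded", "data": {"sku": "book"}},
    {"type": "ItemAdded", "data": {"sku": "pen"}},
    {"type": "OrderCancelled", "data": {}},
]
state = load(stream)
```

A query that wanted the current order status would have to run this fold for every entity it touches, which is what makes querying the write side directly impractical.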

One approach is to project events to an alternative store that is suited for easy querying. This store can be a relational or document database, a cache, or any other persistence medium chosen for a specific use case. A projection is a software component that subscribes to the live event feed of the event database. When it receives an event, it projects the information in that event to a query model in a dedicated read database.
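A rough sketch of such a component, assuming an in-memory `EventFeed` as a stand-in for the event database's subscription mechanism and a plain dictionary as the read database. Both, along with the event shapes, are illustrative assumptions rather than any particular product's API.

```python
class EventFeed:
    """Hypothetical live feed: pushes every appended event to its subscribers."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def append(self, event):
        for handler in self._subscribers:
            handler(event)


class OrderSummaryProjection:
    """Projects order events into a query model held in the read database."""

    def __init__(self, read_db: dict):
        self.read_db = read_db

    def __call__(self, event):
        data = event["data"]
        if event["type"] == "OrderCreated":
            self.read_db[data["order_id"]] = {"status": "open", "item_count": 0}
        elif event["type"] == "ItemAdded":
            self.read_db[data["order_id"]]["item_count"] += data["quantity"]
        elif event["type"] == "OrderCancelled":
            self.read_db[data["order_id"]]["status"] = "cancelled"


feed = EventFeed()
read_db = {}
feed.subscribe(OrderSummaryProjection(read_db))

feed.append({"type": "OrderCreated", "data": {"order_id": "o-1"}})
feed.append({"type": "ItemAdded", "data": {"order_id": "o-1", "quantity": 2}})
feed.append({"type": "OrderCancelled", "data": {"order_id": "o-1"}})
```

Queries then read `read_db` directly and never need to replay events.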

Think of a projection as a representation of an object from a different perspective. For example, a 3D object drawn on paper can have isometric and orthographic projections that provide different points of view. Projecting data means representing data differently from how it was originally stored. For example, a relational database view is a stateless projection that doesn’t change the original data.

In CQRS, the query side is a projection. For event-sourced systems in particular, applications need to keep the state of read models in a queryable store, because querying event streams in an ad-hoc fashion is a hard challenge.

Write model scope: Go Deep

While the state transitions of an entity produce events that belong to that single entity, projections can play a larger role than processing the events of a single entity. Projections can assemble and aggregate data from multiple entities, even from entities of different types.

Every time the application serves a query from the query model, the information stored in the read model needs to be up to date. Hence, applications need to establish a real-time connection to the event store, so the projection receives events immediately after they are stored.

When the application that hosts the subscription stops and later starts again, the subscription starts catching up from the first event in the stream again. This defeats the purpose of persisting the read model state in a database. A system that re-projects all the events each time the projection starts might as well keep the read model in memory. Think, however, about how much time it takes to re-apply all the events on every restart.

Instead of re-projecting the whole history again, the application can store the event offset (the position of the event in the stream) after projecting the event. This approach lets the system load the stored checkpoint when the application starts again, so the application can subscribe from the last known position instead of from the start of the stream.
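The checkpoint idea can be sketched as follows. The event log is a plain list whose index plays the role of the stream position, and `checkpoints` is a dictionary standing in for a durable checkpoint store; all names are assumptions made for the example.

```python
def project_from_checkpoint(log, read_model, checkpoints, name="order-status"):
    """Project only the events that come after the stored checkpoint."""
    start = checkpoints.get(name, -1) + 1  # no checkpoint yet: start at position 0
    handled = 0
    for position in range(start, len(log)):
        event = log[position]
        read_model[event["order_id"]] = event["status"]  # update the read model
        checkpoints[name] = position                     # then remember the offset
        handled += 1
    return handled


log = [
    {"order_id": "o-1", "status": "open"},
    {"order_id": "o-1", "status": "paid"},
]
read_model, checkpoints = {}, {}

# First start: nothing stored yet, so both events are projected.
first_run = project_from_checkpoint(log, read_model, checkpoints)

# A new event arrives; after a "restart" only that event is projected.
log.append({"order_id": "o-2", "status": "open"})
second_run = project_from_checkpoint(log, read_model, checkpoints)
```

The second run handles one event instead of three, which is the saving a checkpoint buys on every restart.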

Updating the read model and storing the checkpoint are two separate writes, which raises a two-phase commit issue: the first operation (the read model update) succeeds while the second (storing the checkpoint) fails due to some transient failure. One can overcome this either by wrapping both operations in a database transaction or by making projections idempotent, so that applying the same event twice doesn’t bring the read model to an invalid state.
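Here is one way the transactional option might look when the read model and the checkpoint live in the same database. The table names and the `fail_before_checkpoint` switch are hypothetical; the switch exists only to simulate a transient failure between the two writes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_status (order_id TEXT PRIMARY KEY, status TEXT);
    CREATE TABLE checkpoints (projection TEXT PRIMARY KEY, position INTEGER);
""")


def project(conn, position, event, fail_before_checkpoint=False):
    """Apply one event and store its position in a single transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "INSERT OR REPLACE INTO order_status (order_id, status) VALUES (?, ?)",
            (event["order_id"], event["status"]),
        )
        if fail_before_checkpoint:
            raise RuntimeError("simulated transient failure")
        conn.execute(
            "INSERT OR REPLACE INTO checkpoints (projection, position) "
            "VALUES ('orders', ?)",
            (position,),
        )


project(conn, 0, {"order_id": "o-1", "status": "open"})
try:
    project(conn, 1, {"order_id": "o-1", "status": "paid"},
            fail_before_checkpoint=True)
except RuntimeError:
    pass  # the read model update was rolled back together with the checkpoint
```

Because the update is also written as an upsert, re-applying the same event after a retry leaves the row unchanged, so this particular handler happens to be idempotent as well.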

Be prepared for the projection code to handle any event that comes from the domain model and to build sophisticated read models that serve a variety of needs.

Suppose the application displays a page that contains order details, including payment and shipping information. Let us assume that the domain model handles orders, payments and shipping as different aggregates, each with an independent life cycle.

While a common front-end approach is to call multiple API endpoints to collect data from different parts of the system, event sourcing offers another option: build a read model that represents all the information shown on that page, which removes the need for complex data composition in the UI. To achieve this goal, the projection needs to receive events from different streams.

For each order, the projection creates a read model and projects both order information and payment information to it. This implies that the subscription that feeds the projection with events has to subscribe to a stream that contains all events from entities of different types.
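A small sketch of such a cross-aggregate projection. The stream names, event types and page layout are invented for the example; the only assumption that matters is that payment and shipment events carry the `order_id` they relate to.

```python
def new_page(order_id):
    """Empty read model for one order details page."""
    return {"order_id": order_id, "items": [],
            "payment": "pending", "shipping": "pending"}


def project(pages, event):
    """Fold events from order, payment and shipment streams into one page."""
    data = event["data"]
    page = pages.setdefault(data["order_id"], new_page(data["order_id"]))
    if event["type"] == "ItemAdded":
        page["items"].append(data["sku"])
    elif event["type"] == "PaymentReceived":
        page["payment"] = "paid"
    elif event["type"] == "ShipmentDispatched":
        page["shipping"] = "dispatched"


# Events as they might arrive from a feed that spans all streams.
all_events = [
    {"stream": "order-1", "type": "ItemAdded",
     "data": {"order_id": "1", "sku": "book"}},
    {"stream": "payment-9", "type": "PaymentReceived",
     "data": {"order_id": "1"}},
    {"stream": "order-1", "type": "ItemAdded",
     "data": {"order_id": "1", "sku": "pen"}},
    {"stream": "shipment-4", "type": "ShipmentDispatched",
     "data": {"order_id": "1"}},
]

pages = {}
for event in all_events:
    project(pages, event)
```

The UI can now fetch `pages["1"]` in a single call instead of composing three API responses.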

In Event Store, the concept of individual streams builds on top of a single event sequence, which is called the $all stream. Streams with names starting with $ are considered system streams, but it doesn’t mean you can’t use them.

Each new event gets appended to the global event sequence. The stream name of that event serves as an index, so applications can read a subset of events by using the stream name. However, the stream name doesn’t tell Event Store where to persist the event, and all the events go to the global append-only store. Thanks to this internal persistence structure, Event Store allows subscribing to the global stream of events.

All events in the global event stream are ordered, and any subscription that uses the $all stream gets events in the same order as they were written to Event Store, even when events are appended to different logical streams. That way, projection code can rely on things that must happen in order doing so, as long as the write side of the application behaves correctly.
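To make the relationship between the global sequence and named streams concrete, here is a toy in-memory model. It is not Event Store's implementation or client API; `InMemoryEventStore` and its methods are hypothetical and only mirror the structure described above.

```python
class InMemoryEventStore:
    """Toy model: one global append-only log plus a per-stream index."""

    def __init__(self):
        self._all = []    # the global sequence, analogous to the $all stream
        self._index = {}  # stream name -> positions in the global sequence

    def append(self, stream, event_type, data):
        position = len(self._all)
        self._all.append({"position": position, "stream": stream,
                          "type": event_type, "data": data})
        self._index.setdefault(stream, []).append(position)
        return position

    def read_stream(self, stream):
        """Read a subset of events by using the stream name as an index."""
        return [self._all[p] for p in self._index.get(stream, [])]

    def read_all(self, from_position=0):
        """Read every event in write order, regardless of its stream."""
        return self._all[from_position:]


store = InMemoryEventStore()
store.append("order-1", "OrderPlaced", {})
store.append("payment-9", "PaymentReceived", {})
store.append("order-1", "OrderShipped", {})

order_types = [e["type"] for e in store.read_stream("order-1")]
global_types = [e["type"] for e in store.read_all()]
```

Reading by stream name returns only the order events, while reading the global sequence returns all three events in the order they were appended.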

Read model scope: Go Deep

To access the full power of CQRS with Event Sourcing, developers need to resist the urge to build projections that simply reproduce the current entity state as a read model. This urge originates from persisting domain objects in traditional databases, where one can check the current state of any domain object at any time by looking into the database.

An advantage of an event-sourced system is the ability to create new read models whenever they are required, without impacting anything else. For example, when the read models in MongoDB do not fit the requirements of full-text search, a new projection targeting Elasticsearch, with the limited set of fields that the full-text search function requires, can be developed.

CQRS also makes it possible to build new aggregations or sets of denormalized data and make them available to the UI pre-calculated, instead of running expensive queries every time a user opens the page.

For example, an eCommerce website needs a page that displays all the orders of a single customer. While one can build an API endpoint that runs a query for that purpose, be aware that each new query potentially has side effects on the database, such as the space used by additional indexes or degraded performance. Some databases may not even allow such a query without changing the persistence model. In these scenarios, building a new read model may be the simple and straightforward option.
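A sketch of what such a read model could look like when it is built by replaying the existing history. The event types and fields are assumptions; in a real system the function would be fed by a subscription instead of a list.

```python
from collections import defaultdict


def build_orders_by_customer(history):
    """Replay the event history into a 'customer orders' read model."""
    orders_by_customer = defaultdict(dict)
    owner = {}  # order_id -> customer_id, needed for events without a customer
    for event in history:
        data = event["data"]
        if event["type"] == "OrderPlaced":
            owner[data["order_id"]] = data["customer_id"]
            orders_by_customer[data["customer_id"]][data["order_id"]] = {
                "order_id": data["order_id"],
                "total": data["total"],
                "status": "placed",
            }
        elif event["type"] == "OrderCancelled":
            customer_id = owner[data["order_id"]]
            orders_by_customer[customer_id][data["order_id"]]["status"] = "cancelled"
    # One pre-composed list per customer, ready to be returned by the API.
    return {c: list(o.values()) for c, o in orders_by_customer.items()}


history = [
    {"type": "OrderPlaced",
     "data": {"order_id": "o-1", "customer_id": "c-1", "total": 20}},
    {"type": "OrderPlaced",
     "data": {"order_id": "o-2", "customer_id": "c-2", "total": 5}},
    {"type": "OrderPlaced",
     "data": {"order_id": "o-3", "customer_id": "c-1", "total": 12}},
    {"type": "OrderCancelled", "data": {"order_id": "o-1"}},
]
customer_orders = build_orders_by_customer(history)
```

The page for customer `c-1` becomes a single key lookup, and adding this read model requires no change to the write side or to existing read models.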
