CQRS – my understanding

CQRS stands for Command Query Responsibility Segregation. Put simply, an application uses one model to update information and a different model to read it. While this separation can be valuable, it can also increase complexity and risk.

Monolithic applications usually come with a single database that serves both complex join queries and CRUD operations. In that setup every interaction with the system is treated as storing and retrieving records through one model (a CRUD datastore) that lets applications create, read, update, and delete records. Let us stop treating system interactions only in that way.

If a query has to join many tables (say, more than 10), it can run long enough to hold locks and block other work on the database. On the write side, CRUD operations may demand complex validations and long-running business rules, which can also leave the database locked.

On the read side, applications tend to move away from a simple CRUD datastore when multiple records are collapsed into one, or when virtual records need to be formed by combining data from different places. On the update side, validation rules allow only certain combinations of data to be stored.

CQRS splits the conceptual domain model into separate models for update and display, based on the understanding that in some scenarios using the same conceptual model for commands and queries leads to a complex model that does neither well.

Separate models usually mean different object models, possibly running in different logical systems. They may share the same database or use separate databases. Is the query side’s database becoming a real-time ReportingDatabase? If yes, some mechanism is required to keep the two models or their databases in sync. It is also possible that the two models are not separate object models at all, and that the same objects simply expose different interfaces for the command side and the query side.

CQRS is useful in some places and not in others. Use CQRS only on specific portions of a system (a BoundedContext in DDD) and not on the system as a whole; each Bounded Context needs its own decision on how it should be modeled. Only a minority of cases are suited to the CQRS model. Ask whether there is sufficient overlap between the command and query sides that sharing a model is easy. Do not use CQRS on a domain where it only adds complexity, since that leads to lower productivity and higher risk.

When there is a big disparity between the number of read and write operations in an application, the CQRS pattern can come in handy. This is a boon for high-performance applications, as CQRS separates the load from reads and writes and allows each side to scale independently. CQRS is also a good pattern to use when there is rich and complex business logic.

While CQRS is a good pattern to hold in the toolbox, the pattern may not be easy to use well and it can easily chop off important bits if you mishandle it.

The CQRS pattern does not prescribe any frameworks or tools; it simply states that the domain write model and the domain read model should be separated. For the write side, applications use a command that returns no result. For the read side, applications use a query that has no side effects and returns a result:

  • Commands (write side) are used to “tell” the write side to perform actions that have side effects and return no result. Commands can be processed asynchronously (e.g. via a queue) or synchronously.
  • Queries (read side) are used to fetch data from the read side. Queries always return results and cannot mutate state or have side effects.
  • The read side can implement denormalized materialized views or a NoSQL table; the write side can implement an event store or a simple table, depending on scaling needs.

For example, “Change product status to active” is a command that gets invoked from the UI as a task (hence the name task-based UI), and a query is something like “Get all active products”.
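To make the product example concrete, here is a minimal Python sketch. The names `Product`, `ActivateProduct`, `ProductWriteSide` and `ProductReadSide` are made up for illustration, and both sides share one in-memory dict, which is only one of the possible setups.

```python
from dataclasses import dataclass


@dataclass
class Product:
    product_id: int
    name: str
    status: str = "INACTIVE"


@dataclass(frozen=True)
class ActivateProduct:
    """Command: states an intent to change state and carries no result."""
    product_id: int


class ProductWriteSide:
    def __init__(self, products: dict) -> None:
        self._products = products

    def handle(self, command: ActivateProduct) -> None:
        # A command has a side effect and returns nothing.
        self._products[command.product_id].status = "ACTIVE"


class ProductReadSide:
    def __init__(self, products: dict) -> None:
        self._products = products

    def get_active_products(self) -> list:
        # A query returns data and does not mutate anything.
        return [p.name for p in self._products.values() if p.status == "ACTIVE"]


# Hypothetical sample data shared by both sides.
store = {1: Product(1, "Kettle"), 2: Product(2, "Toaster")}
write_side = ProductWriteSide(store)
read_side = ProductReadSide(store)

write_side.handle(ActivateProduct(product_id=1))
active = read_side.get_active_products()
```

The point of the sketch is the shape of the two methods: `handle` returns `None`, and `get_active_products` only reads.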

It is worth understanding the CAP theorem, which forces trade-offs on a CQRS design. The CAP theorem states that, for a distributed data store, it is impossible to provide more than two of the following guarantees at the same time:

  • Consistency — Every read receives the most recent write or an error
  • Availability — Every request receives a (non-error) response, without the guarantee that it contains the most recent write
  • Partition tolerance — The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes

Depending on what gets scaled in the CQRS system, Consistency or Availability (or both) needs to be sacrificed to achieve better scalability. Make sure both business users and engineers understand that these trade-offs can affect user experience.

Be ready to use separate classes for the read and write sides, because eventually you may want to modify each side independently. It is advisable to implement them separately from the start.

Queries are not per object; they may be on a per-view basis. For example, on an ecommerce website with customer accounts, the application wants to display:

  • a list of accounts
  • account details with confidential info (e.g. credit card details)
  • account details without confidential info

In the traditional CRUD store approach, the above requirement would lead to three queries against the same model, one for every view (leaving aside things like lazy loading). Each view requires some pieces of data and has no use for others.
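With CQRS, each of the three views can get its own small read model instead. The following is only a sketch: the row layout, the class names and the dummy card numbers are assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical stored rows; the card numbers are dummy values.
ACCOUNT_ROWS = [
    {"account_number": "123456", "status": "ACCOUNT_ACTIVE",
     "default_account": True, "card_number": "0000-0000-0000-0000"},
    {"account_number": "654321", "status": "ACCOUNT_SUSPENDED",
     "default_account": False, "card_number": "1111-1111-1111-1111"},
]


@dataclass(frozen=True)
class AccountListItem:
    account_number: str
    status: str


@dataclass(frozen=True)
class AccountDetails:
    account_number: str
    status: str
    default_account: bool


@dataclass(frozen=True)
class AccountDetailsWithCard(AccountDetails):
    card_number: str


def _find(rows: list, account_number: str) -> dict:
    for row in rows:
        if row["account_number"] == account_number:
            return row
    raise KeyError(account_number)


def list_accounts(rows: list) -> list:
    # View 1: only what the list screen needs.
    return [AccountListItem(r["account_number"], r["status"]) for r in rows]


def get_details(rows: list, account_number: str) -> AccountDetails:
    # View 2: details without confidential info.
    r = _find(rows, account_number)
    return AccountDetails(r["account_number"], r["status"], r["default_account"])


def get_details_with_card(rows: list, account_number: str) -> AccountDetailsWithCard:
    # View 3: details including confidential info.
    r = _find(rows, account_number)
    return AccountDetailsWithCard(
        r["account_number"], r["status"], r["default_account"], r["card_number"]
    )
```

Because the non-confidential view has no `card_number` field at all, it cannot leak that data by accident.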

Commands are not per object either. There may be a command for every behavior, like OpenAccount, CloseAccount, MergeAccounts, etc.
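One way to picture "a command for every behavior" is a handler that dispatches on the command type. This is a rough sketch, not a prescribed design: the status value `ACCOUNT_CLOSED` and the rule that a merge simply closes the source account are assumptions of mine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenAccount:
    account_number: str


@dataclass(frozen=True)
class CloseAccount:
    account_number: str


@dataclass(frozen=True)
class MergeAccounts:
    source_number: str
    target_number: str


class AccountCommandHandler:
    def __init__(self) -> None:
        self.status_by_account = {}
        # One handler method per behavior, keyed by command type.
        self._handlers = {
            OpenAccount: self._open,
            CloseAccount: self._close,
            MergeAccounts: self._merge,
        }

    def handle(self, command) -> None:
        self._handlers[type(command)](command)

    def _require(self, account_number: str) -> None:
        if account_number not in self.status_by_account:
            raise KeyError(account_number)

    def _open(self, command: OpenAccount) -> None:
        if command.account_number in self.status_by_account:
            raise ValueError("account already exists")
        self.status_by_account[command.account_number] = "ACCOUNT_ACTIVE"

    def _close(self, command: CloseAccount) -> None:
        self._require(command.account_number)
        self.status_by_account[command.account_number] = "ACCOUNT_CLOSED"

    def _merge(self, command: MergeAccounts) -> None:
        self._require(command.source_number)
        self._require(command.target_number)
        # Assumption for this sketch: merging closes the source account.
        self.status_by_account[command.source_number] = "ACCOUNT_CLOSED"


handler = AccountCommandHandler()
handler.handle(OpenAccount("123456"))
handler.handle(OpenAccount("654321"))
handler.handle(MergeAccounts(source_number="654321", target_number="123456"))
```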

Here is an example of the domain model object representing an account. When a service queries for an account, this is the model it expects in return:

Example 1. Account aggregate

{
  "createdAt": 1481351048967,
  "lastModified": 1481351049385,
  "userId": 1,
  "accountNumber": "123456",
  "defaultAccount": true,
  "status": "ACCOUNT_ACTIVE"
}

What if the service wants to update the status to ACCOUNT_SUSPENDED? Normally this could be a simple update to the domain object for the status field. Now, what happens when an application uses a domain event to update the status instead? Since a domain object is structurally different from an event, we will need an API that accepts a different model as a command. Here is a domain event that transitions account state from ACCOUNT_ACTIVE to ACCOUNT_SUSPENDED.

Example 2. Account event

{
  "createdAt": 1481353397395,
  "lastModified": 1481353397395,
  "type": "ACCOUNT_SUSPENDED",
  "accountNumber": "123456"
}

To process the domain event and apply the update to the query model, we must have an API to accept the command. The command will contain the model of the domain event and use it to process the update to the account’s query model.
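As a rough illustration of that last step, the sketch below applies the event from Example 2 to the aggregate from Example 1. The function name `apply_event` is hypothetical, and two things are assumptions on my part: that the event `type` doubles as the new `status`, and that `lastModified` is taken from the event's `createdAt`.

```python
import json

# The two documents from Example 1 and Example 2.
account = json.loads("""
{
  "createdAt": 1481351048967,
  "lastModified": 1481351049385,
  "userId": 1,
  "accountNumber": "123456",
  "defaultAccount": true,
  "status": "ACCOUNT_ACTIVE"
}
""")

event = json.loads("""
{
  "createdAt": 1481353397395,
  "lastModified": 1481353397395,
  "type": "ACCOUNT_SUSPENDED",
  "accountNumber": "123456"
}
""")


def apply_event(query_model: dict, domain_event: dict) -> dict:
    """Return a new query model with the event applied (hypothetical helper)."""
    if domain_event["accountNumber"] != query_model["accountNumber"]:
        raise ValueError("event does not belong to this account")
    updated = dict(query_model)  # leave the original untouched
    updated["status"] = domain_event["type"]             # assumption: type == new status
    updated["lastModified"] = domain_event["createdAt"]  # assumption
    return updated


suspended = apply_event(account, event)
```

The command API accepts the event-shaped model, and the query model stays in the shape readers expect.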

If client write requests are high, the write side can be scaled using a message queue. Commands are sent to the queue and clients get an immediate response back. The question to ask before placing a command on the queue is whether the client and the application have jointly validated the command as well as they can, to minimize the chance of an error later during processing.

Acknowledgement of a command means that the command has been saved into a queue; it does not mean that the command has been processed. For example, when one buys something on a retail site, a command is issued, validated, and a response is sent back. The item is marked as sold, but the actual transaction to deduct the money is executed later. Effectively, this increases scalability at a loss of availability, in the sense that the outcome of the command is not available right away.
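The difference between "acknowledged" and "processed" can be sketched with the standard library `queue` module standing in for a real message queue. `PlaceOrder`, `submit` and `process_pending` are hypothetical names, and a real system would run the processing in a separate worker.

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaceOrder:
    order_id: int
    amount: float


command_queue = queue.Queue()
processed_orders = []


def submit(command: PlaceOrder) -> str:
    # Validate as much as possible before queueing, to reduce late failures.
    if command.amount <= 0:
        raise ValueError("amount must be positive")
    command_queue.put(command)
    # The client only learns that the command was accepted, not that it ran.
    return "ACCEPTED"


def process_pending() -> None:
    # Stand-in for a background worker draining the queue.
    while True:
        try:
            command = command_queue.get_nowait()
        except queue.Empty:
            break
        processed_orders.append(command.order_id)


ack = submit(PlaceOrder(order_id=1, amount=9.99))
```

At this point `ack` is already in the client's hands while `processed_orders` is still empty; the order only shows up there after `process_pending()` has run.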

Scaling the read side means introducing eventual consistency. In practice, this is usually done by creating denormalized read models that are populated from the write side. To achieve this, two common approaches are a polling agent and the publisher/subscriber pattern.

A polling agent is a component that polls the event store (ES) for changes. If new events are detected, all subscribing projections (read models) are updated; hence this is called the pull model.

Polling agents give more control and can also guarantee in-order, at-most-once delivery when the agent runs as a single-threaded service hosted on one machine at a time (it has to read and send events in order). This also implies that a polling agent is less scalable.
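A minimal sketch of the pull model, assuming an in-memory list as the event store and a checkpoint kept in the agent. All class names here are illustrative, and the `ACCOUNT_ACTIVE` event is assumed by analogy with the `ACCOUNT_SUSPENDED` event shown earlier.

```python
class EventStore:
    """In-memory stand-in for an append-only event store."""

    def __init__(self) -> None:
        self._events = []

    def append(self, event: dict) -> None:
        self._events.append(event)

    def read_from(self, position: int) -> list:
        return self._events[position:]


class AccountStatusProjection:
    """Read model: latest status per account number."""

    def __init__(self) -> None:
        self.status_by_account = {}

    def apply(self, event: dict) -> None:
        self.status_by_account[event["accountNumber"]] = event["type"]


class PollingAgent:
    def __init__(self, store: EventStore, projections: list) -> None:
        self._store = store
        self._projections = projections
        self._checkpoint = 0  # index of the next event to deliver

    def poll_once(self) -> int:
        # Single-threaded and in order: each event is delivered, then the
        # checkpoint advances so the same event is not delivered again.
        new_events = self._store.read_from(self._checkpoint)
        for event in new_events:
            for projection in self._projections:
                projection.apply(event)
            self._checkpoint += 1
        return len(new_events)


store = EventStore()
projection = AccountStatusProjection()
agent = PollingAgent(store, [projection])

store.append({"type": "ACCOUNT_ACTIVE", "accountNumber": "123456"})
store.append({"type": "ACCOUNT_SUSPENDED", "accountNumber": "123456"})
delivered = agent.poll_once()
```

Calling `poll_once` again without new events delivers nothing, because the checkpoint has moved past everything already seen. In a real service the checkpoint would need to be persisted.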

The publisher/subscriber pattern can be used with a queue (such as a service bus) where our projections subscribe to certain events. Events are published as soon as they arrive in the write model. In this case everything is handled by the queue; hence this is called the push model. Applications typically use an off-the-shelf queue rather than building one. This approach offers less control over event distribution and places more of the calling on the write side, which has to publish each event.
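For comparison, here is a sketch of the push model with a tiny in-memory bus standing in for an off-the-shelf queue. `EventBus` and its methods are hypothetical and do not mirror any particular product's API; `ACCOUNT_ARCHIVED` is an invented event type used only to show an event with no subscriber.

```python
from collections import defaultdict


class EventBus:
    """In-memory stand-in for a queue or service bus."""

    def __init__(self) -> None:
        self._handlers = defaultdict(list)

    def subscribe(self, event_type: str, handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event: dict) -> None:
        # Push: every subscriber of this event type is called right away.
        for handler in self._handlers.get(event["type"], []):
            handler(event)


class AccountStatusProjection:
    """Read model: latest status per account number."""

    def __init__(self) -> None:
        self.status_by_account = {}

    def on_status_event(self, event: dict) -> None:
        self.status_by_account[event["accountNumber"]] = event["type"]


bus = EventBus()
projection = AccountStatusProjection()
bus.subscribe("ACCOUNT_SUSPENDED", projection.on_status_event)

# The write side publishes as soon as the event is stored.
bus.publish({"type": "ACCOUNT_SUSPENDED", "accountNumber": "123456"})
# No projection subscribed to this (invented) type, so nothing happens.
bus.publish({"type": "ACCOUNT_ARCHIVED", "accountNumber": "999999"})
```

Unlike the polling agent, the projection here has no say in when or in what order events arrive; that is decided by the bus.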
