Friday, May 10, 2013

Command Query Responsibility Segregation [ CQRS ] - An Introduction

Bertrand Meyer introduced the idea of CQS [Command Query Separation] in his famous book Object-Oriented Software Construction. He discussed the responsibilities of an object's methods: each method should be either a command or a query, never both. As the principle is often summarized, asking a question should not change the answer; it should just give us the answer back. If you remember, a few months back we discussed the side-effect-free, purely functional approach to designing methods. CQS helps us design our types and their behaviors. We divide our methods into two categories: those which modify state (commands) and those which return data (queries). Queries are free of side effects. Martin Fowler has also discussed the approach and its limitations in certain scenarios. Still, it is generally regarded as a principle and is widely applicable.
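
To make the distinction concrete, here is a minimal sketch of a type whose methods follow CQS. The Account class and its method names are made up for illustration; they are not taken from Meyer's book.

```python
class Account:
    """Toy account whose methods follow CQS: each is a command or a query, never both."""

    def __init__(self, balance: int = 0) -> None:
        self._balance = balance

    def deposit(self, amount: int) -> None:
        """Command: changes state and returns nothing."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._balance += amount

    def balance(self) -> int:
        """Query: returns data and leaves state untouched."""
        return self._balance


account = Account()
account.deposit(50)
# Asking the same question twice does not change the answer.
first, second = account.balance(), account.balance()
```

A method such as "deposit and return the new balance" would mix both roles, which is exactly what the principle asks us to avoid.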

CQRS is about taking CQS to the next level so that it encompasses the whole domain. It is the abbreviation of Command Query Responsibility Segregation. Some people call it a pattern, others call it an architectural style. I think we can leave the classification for some other time, but there are some scholarly discussions [Distributed Object Computing Group - Washington University] which can help us decide. Since it doesn't have a documented solution, I am more inclined towards classifying it as an architectural style.

When to use CQRS?
The name was coined by Udi Dahan and Greg Young. CQRS is applicable only to collaborative domains where a large set of users work on a small set of data. The users interact with each other only through this small set of data. The system gives no immediate feedback, as a command is not processed immediately. We need durable handling of these commands: they must be persisted so that they survive system failures. A command shouldn't be discarded until it has been completely processed. There must also be a mechanism for handling duplicate commands.
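
The requirements on command handling can be sketched roughly as below. This is only an illustration under my own assumptions: CommandStore, Command and the "ReserveSeat" command are hypothetical names, and the in-memory dictionary merely stands in for real durable storage such as a database or a persistent queue.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Command:
    command_id: str   # unique id, used to detect duplicates
    name: str
    payload: dict


@dataclass
class CommandStore:
    """Stand-in for durable storage; a real system would persist this state."""
    pending: dict = field(default_factory=dict)
    processed_ids: set = field(default_factory=set)

    def accept(self, command: Command) -> bool:
        """Store the command before acknowledging it; reject duplicates by id."""
        if command.command_id in self.pending or command.command_id in self.processed_ids:
            return False
        self.pending[command.command_id] = command
        return True

    def process_all(self, handler) -> int:
        """Run the handler on each pending command; drop a command only after it succeeds."""
        done = 0
        for command_id in list(self.pending):
            handler(self.pending[command_id])  # if this raises, the command stays pending
            self.processed_ids.add(command_id)
            del self.pending[command_id]
            done += 1
        return done


store = CommandStore()
cmd = Command(str(uuid.uuid4()), "ReserveSeat", {"seat": "12A"})
accepted_first = store.accept(cmd)
accepted_again = store.accept(cmd)  # the same command submitted twice
handled = []
processed = store.process_all(handled.append)
```

Note that accept only acknowledges the command; the actual work happens later in process_all, which is why the user gets no immediate feedback about the outcome.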

Like any other architectural style, CQRS is not a silver bullet. Adopting CQRS deprives users of immediate feedback, so it should be applied very carefully. Dividing a system into commands and queries lets the two sides scale independently. It also improves maintainability.

The concepts of CQRS are simple, but the change in mindset is the most difficult part. So it is always good to come from a DDD mindset. This opens up your mind to designing in terms of the domain as a whole. It should also help you understand a lot of the material put together by CQRS gurus, who reference DDD as a prerequisite. You can also use CQRS at the component level instead of across the whole domain, so it need not be a top-level architecture.

Why to Segregate Commands and Query?
I have just referred to the Interface Segregation Principle [ISP] as an example of segregation. We can very well apply the Single Responsibility Principle [SRP] too, treating commands and queries as separate responsibilities. See how well SOLID fits in every design and architecture discussion?? :) Let's take a step-by-step approach to understanding CQRS, similar to our discussion about Windows Identity Foundation where we argued why it is better to design security around claims.

As we discussed above, CQRS is different from CQS, but the definitions of commands and queries stay the same. Commands change the state of the system, and queries read the state of the system without causing any side effects.

Significance of Data
Data plays a significant role in an enterprise's everyday business needs. Modern organizations use software systems to ease data management. Software systems are designed to satisfy the business needs of our customers. Organizations which can efficiently create the necessary data and are able to process it become successful, as they can make the right decisions at the right time. Yes, knowledge is no longer just a virtue; it is power now.

These systems have a client-facing front-end. You can read MVVM Survival Guide to study further how to design client applications using the MVVM architectural style. The front-end makes calls to a back-end system. For simple applications this back-end might simply be a Database Management System. In this case, client applications make the necessary calls in the supported query language [SQL]. They might also use DBMS-specific database programming features, including stored procedures and functions. The choice of DBMS is really important, as data is of the highest priority in any system.



Enterprise Operations are the key
These systems work perfectly for simple applications involving CRUD [Create, Read, Update, Destroy] operations. But organizations have complex business processes which require the interaction of multiple systems. These business processes have complex workflow and security requirements which don't quite fit this client / DBMS based approach. Enterprise applications are designed so that authorized users can perform operations. It is better to see these systems as a collection of operations and use cases. Back in my procedural programming days I remember saying that a program is a set of functions. It is for these operations that we design our applications, so our design must focus on them and be built around them. Remember seeing a litany of applications back in the earlier days of Information Technology? The moment we get our heads away from the CRUD approach and start thinking about enterprise operations, we start adding real value to the enterprise application infrastructure. It is a paradigm shift and involves a lot of unlearning, which is the most difficult part. This is the core of domain driven design.



Classification of Operations
Building on the idea of command and query separation, we can easily divide enterprise operations into commands and queries. They can be provided to client applications as separate command and query services. Modelling our application as a set of commands enables a task-based implementation. In DDD language, this is the Shared Kernel approach to context mapping, if we consider commands and queries as separate bounded contexts. Both teams must agree on any change to this shared model. There should be very strict continuous-integration tests to ensure that. Any breaking change required by one side should be planned so that the other team can make the necessary adjustments to accommodate it.



Mostly we have different requirements for command and query models. Command models are less forgiving, with strict validation requirements. Query models, on the other hand, are more denormalized. For reporting purposes, they generally also have built-in caching mechanisms, which are not very common in command models.

Command and Query Interfaces for Same Model
[Fowler] has discussed the simplest implementation: providing separate command and query interfaces. The model types implement both of these interfaces. You might have seen the same business objects library shared between different projects and applications; that is an example of this implementation. I have seen this often enough, and it works in most situations. This style is more useful with Dependency Injection, where the concrete command services depend only on the command interfaces and the query services depend only on the query interfaces. The actual model types are injected.
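
As a rough sketch of this style, the snippet below uses Python's typing.Protocol to play the role of the two interfaces and plain constructor arguments to play the role of dependency injection. All the type names (CustomerCommands, CustomerQueries, Customer, RenameService, ProfileService) are invented for this example and are not from Fowler's article.

```python
from typing import Protocol


class CustomerCommands(Protocol):
    """Command interface: operations that change state."""
    def rename(self, new_name: str) -> None: ...


class CustomerQueries(Protocol):
    """Query interface: operations that only read state."""
    def display_name(self) -> str: ...


class Customer:
    """One model type that satisfies both interfaces."""

    def __init__(self, name: str) -> None:
        self._name = name

    def rename(self, new_name: str) -> None:
        if not new_name.strip():
            raise ValueError("name must not be empty")
        self._name = new_name.strip()

    def display_name(self) -> str:
        return self._name


class RenameService:
    """Command service: depends only on the command interface."""

    def __init__(self, customer: CustomerCommands) -> None:
        self._customer = customer

    def execute(self, new_name: str) -> None:
        self._customer.rename(new_name)


class ProfileService:
    """Query service: depends only on the query interface."""

    def __init__(self, customer: CustomerQueries) -> None:
        self._customer = customer

    def header(self) -> str:
        return f"Customer: {self._customer.display_name()}"


customer = Customer("Ada")              # the actual model type is injected into both services
RenameService(customer).execute("Ada Lovelace")
header = ProfileService(customer).header()
```

Because each service sees only its own interface, a type checker can flag a query service that tries to call rename, even though the injected object supports it.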



Separate Models for Commands and Queries
As we discussed above, although it might initially seem that we need the same model for both, studying the process in more detail mostly suggests that different domain experts are only using similar terminology. They are actually referring to different concepts. To design these different concepts we can build altogether different models. This is the same approach as Separate Ways in Domain Driven Design. It is also possible for both models to inherit from a Shared Kernel based model for the overlapping concepts and then develop their specific implementations on top of that. Doing so saves us much of the difficulty of planning for changes in a shared kernel. Any change in the domain model can be planned separately by both teams. Such changes should result in updates to both models so that together they provide a correct and complete view of the system.
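
Here is a minimal sketch of what two separate models might look like, assuming a hypothetical ordering domain. Order is the command-side model, OrderSummary is the query-side model, and project is an illustrative function that refreshes the read side after a change; none of these names come from the CQRS literature.

```python
from dataclasses import dataclass, field


@dataclass
class OrderLine:
    sku: str
    quantity: int
    unit_price_cents: int


@dataclass
class Order:
    """Command-side model: enforces rules, knows nothing about how it is displayed."""
    order_id: str
    lines: list = field(default_factory=list)

    def add_line(self, sku: str, quantity: int, unit_price_cents: int) -> None:
        if quantity <= 0 or unit_price_cents < 0:
            raise ValueError("invalid order line")
        self.lines.append(OrderLine(sku, quantity, unit_price_cents))


@dataclass(frozen=True)
class OrderSummary:
    """Query-side model: flat, precomputed and read-only."""
    order_id: str
    item_count: int
    total_cents: int


def project(order: Order) -> OrderSummary:
    """Rebuild the read model after the write model changes."""
    return OrderSummary(
        order_id=order.order_id,
        item_count=sum(line.quantity for line in order.lines),
        total_cents=sum(line.quantity * line.unit_price_cents for line in order.lines),
    )


read_store = {}                          # stand-in for the query side's own storage
order = Order("o-1")
order.add_line("book", 2, 1500)
order.add_line("pen", 1, 250)
read_store[order.order_id] = project(order)
```

The two types share nothing but the order id, so each team can evolve its model independently as long as the projection is kept up to date.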



Scaling Command and Query Services
Depending on the particular system we are trying to model, commands and queries might have different scaling requirements. Command operations may be the more frequent ones, or it may be the other way round. To handle this properly, we can deploy command and query services separately. If it is a monitoring application, then we will definitely be getting more status updates (commands) than queries to determine the state of the things being monitored. The separate deployments can very well use the same model if that fits the requirement. In this case, since they are separate applications, we need to find a way to share the model between them. Historically, such libraries have been shared through a source control repository, and the applications pick up the most recent version of the library when they are built. This sharing can be automated with NuGet packages as well.



Or they can use completely different models, like Separate Ways bounded contexts. In our monitoring example, the query service would mostly serve management information system reports. It might also be a complex event processing system designed specifically to ensure immediate action, for example in the intensive care units of a hospital, where these events require an immediate response.



Separate Reporting Databases
The data generated by commands is more transactional in nature. We might not need the same granularity for query services. Queries often require more aggregated data, which can be denormalized. It is possible to create a separate database for reporting purposes where we hold pre-processed data for data warehousing requirements. The process can run overnight in batches. This is generally read-only data with a shorter historical view for faster queries. The data can be organized as an OLAP cube for easier slicing and dicing.
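
The batch step can be sketched as follows, using two in-memory SQLite databases from Python's standard library to stand in for the transactional and reporting stores. The table names, columns and sample readings are all assumptions made up for the monitoring example; a real setup would use separate database servers and a scheduled job.

```python
import sqlite3

# Transactional store written by commands (in-memory for this sketch).
tx = sqlite3.connect(":memory:")
tx.execute("CREATE TABLE readings (sensor TEXT, day TEXT, value REAL)")
tx.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        ("s1", "2013-05-09", 10.0),
        ("s1", "2013-05-09", 20.0),
        ("s2", "2013-05-09", 5.0),
        ("s1", "2013-05-10", 30.0),
    ],
)

# Separate reporting store read by queries.
report = sqlite3.connect(":memory:")
report.execute(
    "CREATE TABLE daily_summary ("
    "sensor TEXT, day TEXT, readings INTEGER, avg_value REAL, "
    "PRIMARY KEY (sensor, day))"
)


def run_batch(source: sqlite3.Connection, target: sqlite3.Connection) -> None:
    """Aggregate fine-grained rows into one denormalized row per sensor per day."""
    rows = source.execute(
        "SELECT sensor, day, COUNT(*), AVG(value) FROM readings GROUP BY sensor, day"
    ).fetchall()
    # INSERT OR REPLACE keeps the job safe to re-run for the same day.
    target.executemany("INSERT OR REPLACE INTO daily_summary VALUES (?, ?, ?, ?)", rows)
    target.commit()


run_batch(tx, report)
summary = report.execute(
    "SELECT sensor, day, readings, avg_value FROM daily_summary ORDER BY sensor, day"
).fetchall()
```

Queries then read only daily_summary, so they never compete with the commands writing to readings.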



Related Patterns
There are other related patterns which are generally discussed in the same texts as CQRS. We will try to discuss them in a future post. They are as follows:
  • Event Sourcing
  • Process Manager
  • Eventual Consistency
Further Readings
http://cqrsjourney.github.com
http://pundit.cloudapp.net
http://www.cqrsinfo.com/category/programming/
http://martinfowler.com/bliki/CQRS.html
http://cqrs.files.wordpress.com/2010/11/cqrs_documents.pdf
