Monday, May 20, 2013

Writable Property Assertions using AutoFixture.Idioms

AutoFixture.Idioms lets us write unit tests that tend to follow common templates. Mark Seemann calls this idiomatic unit testing. Previously we discussed testing guard clauses using guard assertions. In this post we are going to discuss testing writable properties. Verifying property assignments makes up a big part of our unit tests. AutoFixture.Idioms provides WritablePropertyAssertion to test all such properties in a single unit test. This minimizes the unit testing effort and keeps the unit tests less brittle.

You need to add the NuGet packages for xUnit, AutoFixture and AutoFixture.Idioms as described in the previous post. Let's add a Student class to our project. It has two writable properties, StudentId and StudentName.

You might be thinking about writing a separate test for each writable property in Student. This gets harder as the number of such properties increases. With AutoFixture.Idioms, we need just one unit test to verify that the properties are assigned correctly and return the assigned value afterwards. Just see how simple the unit test has become.
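A minimal sketch of such a test is shown here. The property types (int and string) are assumptions, since the post only names the properties, and the namespaces are those of AutoFixture 4.x; the 3.x packages available when this post was written used Ploeh.AutoFixture and Ploeh.AutoFixture.Idioms instead.

```csharp
using AutoFixture;          // Ploeh.AutoFixture in the 3.x packages
using AutoFixture.Idioms;   // Ploeh.AutoFixture.Idioms in the 3.x packages
using Xunit;

// Assumed shape of the class under test; the property types are a guess.
public class Student
{
    public int StudentId { get; set; }
    public string StudentName { get; set; }
}

public class StudentTests
{
    [Fact]
    public void WritablePropertiesReturnTheAssignedValue()
    {
        var fixture = new Fixture();
        var assertion = new WritablePropertyAssertion(fixture);

        // One call covers every public property of Student:
        // each one is assigned a generated value and then read back.
        assertion.Verify(typeof(Student).GetProperties());
    }
}
```

Adding a third writable property to Student would not require any change to this test, which is where the reduced brittleness comes from.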

Let's assume we have a defect in our code which causes a property not to be assigned properly. Here StudentId always returns a fixed value, no matter what value is assigned to it. This is definitely a defect. Let's update the code as follows:
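One way such a defect could look is sketched here. The fixed value -1 and the RoundTripCheck helper are illustrative assumptions; the helper only mimics by hand the assign-then-read-back check that a writable property assertion automates.

```csharp
using System;

public class Student
{
    // Defect: the setter discards the value and the getter always returns -1.
    public int StudentId
    {
        get { return -1; }
        set { }
    }

    public string StudentName { get; set; }
}

// Hypothetical helper, not part of AutoFixture.
public static class RoundTripCheck
{
    // Assigns a value and reports whether the property returns it afterwards.
    public static bool StudentIdRoundTrips(int value)
    {
        var student = new Student();
        student.StudentId = value;
        return student.StudentId == value;
    }
}
```

With this version of Student, the round trip fails for any value other than -1.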

The unit test should catch such defects and report them. When we run the unit test now, it does fail with the following details:

Sunday, May 19, 2013

Read-Only Collections in .NET Framework 4.5

In this post we are going to discuss some new interfaces implemented by ReadOnlyCollection<T> in .NET Framework 4.5 and how they can help us. If you look at the definition of ReadOnlyCollection<T> in the framework, you will notice these additional interfaces being implemented. We covered ReadOnlyCollection<T> in our discussion of purely functional data structures. The updated definition of ReadOnlyCollection<T> in .NET Framework 4.5 is as follows:
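Rather than retyping the declaration, the implemented interfaces can be listed with reflection. This is a small sketch; the ReadOnlyCollectionInterfaces helper is a hypothetical name introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

public static class ReadOnlyCollectionInterfaces
{
    // Names of all interfaces implemented by ReadOnlyCollection<int>.
    public static IEnumerable<string> Names()
    {
        return typeof(ReadOnlyCollection<int>)
            .GetInterfaces()
            .Select(i => i.Name)
            .OrderBy(n => n);
    }

    // True when ReadOnlyCollection<int> implements the given interface type.
    public static bool Implements(Type interfaceType)
    {
        return typeof(ReadOnlyCollection<int>).GetInterfaces().Contains(interfaceType);
    }
}
```

On .NET Framework 4.5 or later, Implements(typeof(IReadOnlyList<int>)) and Implements(typeof(IReadOnlyCollection<int>)) both return true.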



You might have seen ReadOnlyCollection<T> used as the return type of methods. It provides a read-only wrapper around an IList<T>. The wrapper is not thread safe if the underlying collection is being modified at the same time. It does make it impossible for outside code to tamper with the collection by adding or removing elements, as these operations throw exceptions. You can also use AsReadOnly() provided by List<T> to get the same effect.
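A short sketch of that behavior follows; ReadOnlyWrapperDemo is a hypothetical helper name used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class ReadOnlyWrapperDemo
{
    // Returns true when adding through the read-only wrapper throws.
    public static bool AddThroughWrapperThrows()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // Equivalent to new ReadOnlyCollection<int>(numbers).
        ReadOnlyCollection<int> readOnly = numbers.AsReadOnly();

        try
        {
            // Add is implemented explicitly, so it is only reachable through the interface.
            ((IList<int>)readOnly).Add(4);
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }
}
```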

In order to return a read-only view of the list, we need to create a new instance of ReadOnlyCollection<Student>. This makes sure that the collection cannot be updated from outside. To fulfill a similar requirement in .NET Framework 4.5, we can simply use IReadOnlyCollection<T> as the return type of the method. The user of this method can still cast the result back to List<T> and update the list, and no exception is thrown for such operations, but the intent of the method is very clear: the collection is returned only for consumption.

Basically List<T> has an updated definition in .NET Framework 4.5. It implements some additional interfaces, one of which is IReadOnlyCollection<T>. Is it semantically correct to implement such an interface when the type allows adding and removing items? Don't worry about this: we are talking about interface implementation, not inheritance. Implementing an interface only describes what a type "can do". IReadOnlyCollection<T> promises that the items can be read; it does not promise that they never change. List<T> can be used in all places where a reference of the interface type is expected, including method parameters and return types. Hence the Liskov Substitution Principle still holds.



Here Student is a simple class with two properties, StudentId and StudentName. The class exposes observational immutability. The definition of Student can be as follows:
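A possible shape for Student, together with the two ways of returning the list discussed above, is sketched here. The constructor, the property types, the StudentRepository class and its method names are all assumptions made for illustration.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Observationally immutable: values are set once in the constructor.
public class Student
{
    public Student(int studentId, string studentName)
    {
        StudentId = studentId;
        StudentName = studentName;
    }

    public int StudentId { get; private set; }
    public string StudentName { get; private set; }
}

// Hypothetical owner of the student list.
public class StudentRepository
{
    private readonly List<Student> students = new List<Student>
    {
        new Student(1, "Student A"),
        new Student(2, "Student B")
    };

    // Wrapper approach: callers cannot add or remove items.
    public ReadOnlyCollection<Student> GetStudentsWrapped()
    {
        return new ReadOnlyCollection<Student>(students);
    }

    // .NET 4.5 approach: no wrapper, the return type states the intent.
    // A caller can still cast the result back to List<Student>.
    public IReadOnlyCollection<Student> GetStudents()
    {
        return students;
    }
}
```

The second method avoids the extra allocation, at the price of relying on callers to respect the declared return type.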

Other Related Interfaces
In the above discussion, we can notice that List<T> also implements IReadOnlyList<T>. .NET Framework 4.5 adds a few related interfaces, including IReadOnlyList<T> and IReadOnlyDictionary<TKey, TValue>. The former adds index-based access to IReadOnlyCollection<T>, and the latter can be used to provide a read-only view of a dictionary.
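Both interfaces can be used directly as views over the existing collection types, as this sketch shows. ReadOnlyViews and its sample data are hypothetical.

```csharp
using System.Collections.Generic;

public static class ReadOnlyViews
{
    // IReadOnlyList<T> adds an indexer on top of IReadOnlyCollection<T>.
    public static string SecondItem()
    {
        IReadOnlyList<string> names = new List<string> { "Ann", "Bob", "Cid" };
        return names[1];
    }

    // IReadOnlyDictionary<TKey, TValue> exposes lookups but no Add or Remove.
    public static int LookupAge(string name)
    {
        IReadOnlyDictionary<string, int> ages = new Dictionary<string, int>
        {
            { "Ann", 30 },
            { "Bob", 25 }
        };

        int age;
        return ages.TryGetValue(name, out age) ? age : -1;
    }
}
```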



You can verify that the definition of Dictionary<TKey, TValue> has also been updated to support IReadOnlyDictionary<TKey, TValue>.



Covariance & New Interface Types
IReadOnlyCollection<T> and IReadOnlyList<T> are covariant in their type parameter. As you might remember, covariance and contravariance for generic interfaces were introduced in .NET Framework 4.0. We know that, through polymorphism, an instance of a type can be assigned to a reference of its base type or of an interface it implements. Covariance extends this to collections: a collection of a derived type can be assigned to a reference typed as a collection of the base type. It is declared with the out keyword on the generic type parameter.

In the above code, GetStudents() is supposed to return a collection of Student, but it returns a collection of GraduateStudent. Here GraduateStudent is a sub-type of Student. This is especially useful for API implementations where a derived class needs to return a collection of a sub-type of the declared element type. This generally comes as a surprise while implementing child classes. The definition of GraduateStudent can be as follows:
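A minimal sketch of the idea is shown here, assuming simple property-bag classes. ThesisTitle and the StudentSource class are hypothetical additions for illustration.

```csharp
using System.Collections.Generic;

public class Student
{
    public int StudentId { get; set; }
    public string StudentName { get; set; }
}

public class GraduateStudent : Student
{
    public string ThesisTitle { get; set; }   // hypothetical extra member
}

public static class StudentSource
{
    public static IReadOnlyCollection<Student> GetStudents()
    {
        var graduates = new List<GraduateStudent>
        {
            new GraduateStudent { StudentId = 1, StudentName = "Ann", ThesisTitle = "Untitled" }
        };

        // Compiles because IReadOnlyCollection<out T> is covariant.
        // With a return type of List<Student> or IList<Student> this line would not compile.
        return graduates;
    }
}
```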


Friday, May 17, 2013

Arrows, Guard Clauses & Assertions using AutoFixture.Idioms

Procedural languages introduced functions to make a block of code reusable. Object-oriented programming languages inherited the idea and allowed object behaviors to be defined in terms of methods. These methods are called: after doing some housekeeping work, the runtime jumps to the callee and executes its code, and when the callee is done, control returns to the caller. This call scenario enables the use of try/catch and thread synchronization constructs around any subsequent calls. In this scenario, control always has to come back to the caller after the callee has executed. Programming languages also support other ways of executing a piece of code. We say that a catch block is executed, not called, because control does not go back to the point where the exception was thrown; likewise, exceptions are thrown, not called. One kind of execution can also trigger another, e.g. events are fired, which causes all subscribed event handlers to be called.

To pass control back to the caller, the callee can explicitly use a return statement. A method with a void return type has an implicit return at its end if it has no explicit return statement. For a method with a non-void return type, the return statement must be explicit.

Single Entry / Exit Points
Structured programming suggests that we should have only one return path in a function. This leads to a single return statement per method. Return statements are often referred to as exit points. Based on this idea, a method should always have a single entry point and a single exit point. Single entry and exit points make the code more readable: it is easier to follow the code and to determine the intent and responsibility of the method.

Many developers consider multiple return statements bad coding style and a coding risk. They also make refactoring difficult, especially when we need to extract some code into a new method from an existing method with convoluted code and multiple return statements. It is also a maintenance nightmare to touch such code to fix an issue. Additionally, it is not easy to unit test such code and cover all cases without the help of a code coverage tool, used alongside your unit tests, to avoid false positives.

Based on this, it seems apparent that multiple return statements are really evil and that we should avoid them when writing our code. Bob Martin recommends the Boy Scout Rule for code changes [you should leave the code cleaner than you found it]. So changing all methods with multiple return statements into single-return ones can be part of our cleanup exercise. This should cause sanity to prevail. It also gives us one place to add any necessary cleanup code, just before the only return statement in the method.

Arrow antipattern
Applying the single exit point rule religiously leads us into another problem. To avoid multiple return statements we tend to nest multiple if statements. This gets worse if we have multiple parameters and need to check the input arguments against some preconditions. This situation is generally referred to as the Arrow antipattern. The code appears as follows:
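An illustrative sketch of the shape is shown here. The Register method and its parameters are hypothetical; only the nesting matters.

```csharp
public static class ArrowExample
{
    // Single exit point, paid for with one level of nesting per precondition.
    public static string Register(string userName, string password, string email)
    {
        string result;

        if (userName != null)
        {
            if (password != null)
            {
                if (email != null)
                {
                    result = "Registered " + userName;
                }
                else
                {
                    result = "Email is missing";
                }
            }
            else
            {
                result = "Password is missing";
            }
        }
        else
        {
            result = "User name is missing";
        }

        return result;
    }
}
```

Each additional parameter adds another level of indentation and another else branch far away from the condition it belongs to.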



The Arrow antipattern reduces readability because the code keeps drifting towards the right margin. It also increases the complexity of the code by creating many code flow paths, which makes unit testing hell. This page discusses the Arrow antipattern and possible refactorings to fix it [http://www.codinghorror.com/blog/2006/01/flattening-arrow-code.html]

Guard Clauses
Fowler recommends using guard clauses for all special cases. His recommendation applies to methods whose conditional behavior does not make the normal path of execution clear. Guard clauses can also be used for checking the preconditions of a method. A failed precondition generally results in an exception.
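A sketch of precondition checks written as guard clauses is shown here, using a hypothetical Register method. Each special case exits early, which leaves the normal path unindented at the end.

```csharp
using System;

public static class GuardClauseExample
{
    public static string Register(string userName, string password, string email)
    {
        // Guard clauses: one flat check per precondition.
        if (userName == null) throw new ArgumentNullException("userName");
        if (password == null) throw new ArgumentNullException("password");
        if (email == null) throw new ArgumentNullException("email");

        // Normal path of execution.
        return "Registered " + userName;
    }
}
```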

Constructors and Dependency Injection
Object-oriented languages provide constructors for instantiating a type. They are part of a class's metadata and cannot be invoked like a regular method. Mark Seemann suggests that constructor injection should be the preferred choice in most cases. This requires all dependencies to be specified as constructor arguments.
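Constructor injection with a guard clause per dependency might look like the following sketch. AuthenticationService and IAuthenticationService are named later in this post; the two dependency interfaces, IUserStore and IPasswordHasher, are hypothetical stand-ins.

```csharp
using System;

public interface IUserStore { }            // hypothetical dependency
public interface IPasswordHasher { }       // hypothetical dependency
public interface IAuthenticationService { }

public class AuthenticationService : IAuthenticationService
{
    private readonly IUserStore userStore;
    private readonly IPasswordHasher passwordHasher;

    public AuthenticationService(IUserStore userStore, IPasswordHasher passwordHasher)
    {
        // One guard clause per dependency: a missing one fails construction.
        if (userStore == null) throw new ArgumentNullException("userStore");
        if (passwordHasher == null) throw new ArgumentNullException("passwordHasher");

        this.userStore = userStore;
        this.passwordHasher = passwordHasher;
    }
}
```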

Checking dependencies in guard clauses is necessary because these dependencies are preconditions for the object's inner workings. We need to check every one of them, because missing any of them should fail the instantiation of the type under consideration. To verify these guard clauses, we need unit tests. In the above case, we need two unit tests to verify these dependencies if we follow the One Assert per Test rule. More dependencies mean more tests.

I have seen people get more excited and test combinations of these dependencies, but that seems too much, as there is generally a separate guard clause for each dependency. In the code below, we are testing the above construction logic: setting any constructor parameter to null should result in an exception. These tests are also very brittle, as introducing a new dependency requires updating all of them.
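Hand-written construction tests of this kind could be sketched as follows. The dependency interfaces IUserStore and IPasswordHasher are hypothetical, and the class under test is repeated in minimal form so that the listing stands on its own.

```csharp
using System;
using Rhino.Mocks;
using Xunit;

public interface IUserStore { }        // hypothetical dependency
public interface IPasswordHasher { }   // hypothetical dependency

public class AuthenticationService
{
    public AuthenticationService(IUserStore userStore, IPasswordHasher passwordHasher)
    {
        if (userStore == null) throw new ArgumentNullException("userStore");
        if (passwordHasher == null) throw new ArgumentNullException("passwordHasher");
    }
}

public class AuthenticationServiceConstructionTests
{
    [Fact]
    public void ConstructorThrowsWhenUserStoreIsNull()
    {
        var passwordHasher = MockRepository.GenerateStub<IPasswordHasher>();

        Assert.Throws<ArgumentNullException>(
            () => new AuthenticationService(null, passwordHasher));
    }

    [Fact]
    public void ConstructorThrowsWhenPasswordHasherIsNull()
    {
        var userStore = MockRepository.GenerateStub<IUserStore>();

        Assert.Throws<ArgumentNullException>(
            () => new AuthenticationService(userStore, null));
    }
}
```

Adding a third constructor parameter would break both tests at compile time and call for a third test, which is the brittleness described above.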

Here we are using xUnit for unit testing. We are using Rhino Mocks for mocking the dependencies.

Code is a liability
Unit tests should follow the DRY [Don't Repeat Yourself] principle. Their names should be DAMP (Descriptive And Meaningful Phrases). Introducing any new type involves this arduous work of writing construction tests. We do not introduce a type for its fancy construction but for the logic it implements, so the effort put into writing these tests takes valuable time away from testing the logic.

Mark Seemann has provided AutoFixture.Idioms for unit tests that tend to follow such common templates. It depends on the AutoFixture NuGet package. Let's install these packages into our test project. We have also installed the xUnit package as our unit testing library.



AutoFixture.Idioms provides GuardClauseAssertion to verify that an exception is thrown if a dependency is missing. In the following code we assert the guard clauses for all the dependencies of all the constructors.

But to test these guard clauses, it needs to pass the required dependencies as constructor arguments, and the dependencies cannot be instantiated because they are interface types. Running the above test fails with the following message:



AutoFixture provides additional extension packages to support mocking libraries including Moq, FakeItEasy and Rhino Mocks. Let's install the package which uses Rhino Mocks; it is called AutoFixture.AutoRhinoMocks. Obviously this package has an additional dependency on Rhino Mocks.



After installing the above package, we can add a customization to the Fixture so that it uses Rhino Mocks to mock the required dependencies. The test fails if a guard clause is missing from any constructor of the specified type. This approach reduces the unit testing effort spent purely on defensive code, so we can concentrate on the real application logic.
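A sketch of such a test is shown here, assuming AutoFixture 4.x namespaces (the 3.x packages used the Ploeh. prefix) and two hypothetical dependency interfaces, IUserStore and IPasswordHasher.

```csharp
using System;
using AutoFixture;                  // Ploeh.AutoFixture in 3.x
using AutoFixture.AutoRhinoMock;    // Ploeh.AutoFixture.AutoRhinoMock in 3.x
using AutoFixture.Idioms;           // Ploeh.AutoFixture.Idioms in 3.x
using Xunit;

public interface IUserStore { }        // hypothetical dependency
public interface IPasswordHasher { }   // hypothetical dependency

public class AuthenticationService
{
    public AuthenticationService(IUserStore userStore, IPasswordHasher passwordHasher)
    {
        if (userStore == null) throw new ArgumentNullException("userStore");
        if (passwordHasher == null) throw new ArgumentNullException("passwordHasher");
    }
}

public class AuthenticationServiceGuardClauseTests
{
    [Fact]
    public void AllConstructorsHaveGuardClauses()
    {
        // The customization lets the fixture supply Rhino Mocks instances for interface types.
        var fixture = new Fixture().Customize(new AutoRhinoMockCustomization());
        var assertion = new GuardClauseAssertion(fixture);

        // Passes null for each parameter in turn and expects an exception each time.
        assertion.Verify(typeof(AuthenticationService).GetConstructors());
    }
}
```

A new constructor parameter is picked up automatically, so this single test does not need to change when dependencies are added.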

Testing Guard Clauses for Methods
Methods can also have guard clauses to check for dependencies. We can use AutoFixture.Idioms to assert these clauses as well. Let's add an additional behavior to the IAuthenticationService interface to support adding an authentication policy. We need to add the same method to the AuthenticationService implementation. The method has just one parameter, of type IAuthenticationPolicy.
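The same assertion can be pointed at a single method, as in this sketch. The method name AddPolicy is an assumption, since the post does not name it, and the service is reduced to a parameterless constructor to keep the listing short.

```csharp
using System;
using System.Collections.Generic;
using AutoFixture;                  // Ploeh.AutoFixture in 3.x
using AutoFixture.AutoRhinoMock;    // Ploeh.AutoFixture.AutoRhinoMock in 3.x
using AutoFixture.Idioms;           // Ploeh.AutoFixture.Idioms in 3.x
using Xunit;

public interface IAuthenticationPolicy { }

public interface IAuthenticationService
{
    void AddPolicy(IAuthenticationPolicy policy);   // hypothetical method name
}

public class AuthenticationService : IAuthenticationService
{
    private readonly List<IAuthenticationPolicy> policies = new List<IAuthenticationPolicy>();

    public void AddPolicy(IAuthenticationPolicy policy)
    {
        // Removing this guard clause makes the test below fail.
        if (policy == null) throw new ArgumentNullException("policy");

        policies.Add(policy);
    }
}

public class AddPolicyGuardClauseTests
{
    [Fact]
    public void AddPolicyHasGuardClause()
    {
        var fixture = new Fixture().Customize(new AutoRhinoMockCustomization());
        var assertion = new GuardClauseAssertion(fixture);

        assertion.Verify(typeof(AuthenticationService).GetMethod("AddPolicy"));
    }
}
```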

If the guard clause that null-checks the provided argument is missing, the test fails. It results in the following in your unit testing session. Here we are using ReSharper's support for xUnit to run the tests and show the results.



Adding a guard clause that throws an exception satisfies the expectation, and the test passes. It is shown as green in the testing session as follows:



In order to run the above tests, we needed to add a few NuGet packages. The names and dependencies of these packages can be checked using Package Visualizer.



Why is this a great achievement?
Just being able to run this test is a big achievement for the software design and development world. Let me tell you what is happening here. This test combines a test runner, an assertion library and mocking support, each from a different provider. We are using xUnit to run our tests. AutoFixture.Idioms helps us assert the guard clause checks. To create mocks, AutoFixture uses Rhino Mocks. Together they help us achieve this miracle. Isn't this great?


Friday, May 10, 2013

Command Query Responsibility Segregation [ CQRS ] - An Introduction

Bertrand Meyer introduced the idea of CQS [Command Query Separation] in his famous book Object-Oriented Software Construction. He discussed the responsibilities of an object's methods. Each method should be either a command or a query, not both. As described here, asking a question should not change the answer; it should just give us the answer back. If you remember, a few months back we discussed a side-effect-free, purely functional approach to designing methods. CQS helps us design our types and their behaviors. We divide our methods into two categories: those which modify state (commands) and those which return data (queries). Queries are free of side effects. Martin Fowler has also discussed the approach and its limitations in certain scenarios. But it is generally considered a principle and is widely applicable.
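At the level of a single type, the split can be sketched like this; the Account class is a hypothetical example and is not taken from Meyer's book.

```csharp
using System;

public class Account
{
    private decimal balance;

    // Command: changes state and returns nothing.
    public void Deposit(decimal amount)
    {
        if (amount <= 0) throw new ArgumentOutOfRangeException("amount");
        balance += amount;
    }

    // Query: returns data and has no side effects,
    // so asking twice gives the same answer.
    public decimal GetBalance()
    {
        return balance;
    }
}
```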

CQRS is about taking CQS to the next level so that it encompasses the whole domain. It is the abbreviation of Command Query Responsibility Segregation. Some people call it a pattern, others call it an architectural style. I think we can leave the classification for some other time, but there are some scholarly discussions [Distributed Object Computing Group - Washington University] which can help us classify it. Since it doesn't have a documented solution, I am more inclined towards classifying it as an architectural style.

When to use CQRS?
The name was coined by Udi Dahan and Greg Young. CQRS is applicable only to collaborative domains where a large set of users works on a small set of data. The users interact with each other only through this small set of data. The system gives no immediate feedback, as a command may not be processed immediately. We need durable handling of these commands, persisting them so that they survive system failures. A command shouldn't be discarded until it has been completely processed. There must also be a mechanism for handling duplicate commands.
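One possible way to handle duplicate commands is to give every command a unique ID and skip IDs that have already been processed. The sketch below is only an assumption about how that could look: the command type, the handler and the in-memory set are hypothetical, and a real system would keep the processed IDs in durable storage.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical command carrying its own unique identifier.
public class RenameStudentCommand
{
    public Guid CommandId { get; set; }
    public string NewName { get; set; }
}

public class RenameStudentHandler
{
    // In-memory only; durable handling would need a persistent store.
    private readonly HashSet<Guid> processedIds = new HashSet<Guid>();

    public int TimesApplied { get; private set; }

    // Returns true when the command was applied, false when it was a duplicate.
    public bool Handle(RenameStudentCommand command)
    {
        if (command == null) throw new ArgumentNullException("command");

        // HashSet<T>.Add returns false when the ID is already present.
        if (!processedIds.Add(command.CommandId))
        {
            return false;
        }

        TimesApplied++;
        return true;
    }
}
```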

Like any other architectural style, CQRS is not a silver bullet. Adopting CQRS can deprive users of immediate feedback, so it should be applied very carefully. Dividing a system into commands and queries makes it highly scalable. It also improves maintainability.

The concepts of CQRS are simple, but the change in mindset is the most difficult part. So it is always good to come from a DDD mindset. This opens up your mind to designing in terms of the domain as a whole. It should also help you understand a lot of the material put together by CQRS gurus, who reference DDD as a prerequisite. You can also use CQRS at the component level instead of across the whole domain, so it need not be a top-level architecture.

Why Segregate Commands and Queries?
I have just referred to the Interface Segregation Principle [ISP] as an example of segregation. We can equally well apply the Single Responsibility Principle [SRP], viewing commands and queries as separate responsibilities. See how well SOLID fits into every design and architecture discussion? :) Let's take a step-by-step approach to understanding CQRS, similar to our discussion of Windows Identity Foundation, where we argued why it is better to design security around claims.

As discussed above, CQRS is different from CQS, but the definitions of commands and queries stay the same. Commands change the state of the system, and queries read the state of the system without causing any side effects.

Significance of Data
Data plays a significant role in an enterprise's everyday business needs. Modern organizations use software systems to ease data management. Software systems are designed to satisfy the business needs of our customers. Organizations which can efficiently create the data they need, and have the ability to process it, become successful because they are able to make the right decisions at the right time. Yes, knowledge is no longer just a virtue; it is power now.

These systems have a client-facing front-end. You can read MVVM Survival Guide to study the design of client applications using the MVVM architectural style. The front-end makes calls to a back-end system. For simple applications this back-end system might simply be a database management system [DBMS]. In this case, client applications make the necessary calls in the supported query language [SQL]. They might also use DBMS-specific database programming features, including stored procedures and functions. The choice of DBMS is really important, as data is of the highest priority in any system.



Enterprise Operations are the key
These systems work perfectly for simple applications involving CRUD [Create, Read, Update, Destroy] operations. But organizations have complex business processes which require the interaction of multiple systems. These business processes have complex workflow and security requirements which don't quite fit this client / DBMS approach. Enterprise applications are designed for authorized users to perform operations. It is better to see these systems as a collection of operations and use cases. Back in my procedural programming days, I remember saying that a program is a set of functions. It is for these operations that we design our applications, so our design must focus on them and be built around them. Remember seeing a litany of applications back in the earlier days of Information Technology? The moment we get our heads away from the CRUD approach and start thinking about enterprise operations, we start adding real value to the enterprise application infrastructure. It is a paradigm shift and involves a lot of unlearning, which is the most difficult part. This is the core of domain-driven design.



Classification of Operations
Building on the idea of command and query separation, we can easily divide enterprise operations into commands and queries. They can be provided to client applications as separate command and query services. Dividing our application into a set of commands enables us to provide a task-based implementation. In DDD language, this is the Shared Kernel approach to context mapping, if we consider commands and queries as separate bounded contexts. Both teams must agree on any changes to this shared model. There should be very strict continuous-integration tests to ensure that. Any breaking change required by one side should be planned so that the other team can make the necessary adjustments to accommodate it.



Mostly we have different requirements for command and query models. Command models are less forgiving, having strict validation requirements. Query models, on the other hand, are more denormalized. For reporting purposes, these models generally also have built-in caching mechanisms, which are not very common in command models.

Command and Query Interfaces for Same Model
[Fowler] has discussed the simplest implementation: providing separate command and query interfaces, with the model types implementing both. You might have seen the same business objects library shared between different projects and applications. They are examples of this implementation. I have seen this often enough, and it works in most situations. This style is most useful with dependency injection, where the concrete command services depend on the command interfaces and the query services depend on the query interfaces. The actual model types are injected.
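A minimal sketch of this arrangement is shown here. All type and member names are hypothetical and are not taken from Fowler's article.

```csharp
using System.Collections.Generic;

// Query-side view of the model: reads only.
public interface IStudentQueries
{
    string GetName(int studentId);
}

// Command-side view of the model: state changes only.
public interface IStudentCommands
{
    void Rename(int studentId, string newName);
}

// One model type implements both interfaces.
public class StudentModel : IStudentQueries, IStudentCommands
{
    private readonly Dictionary<int, string> names = new Dictionary<int, string>();

    public string GetName(int studentId)
    {
        string name;
        return names.TryGetValue(studentId, out name) ? name : null;
    }

    public void Rename(int studentId, string newName)
    {
        names[studentId] = newName;
    }
}

// Each service depends only on the interface it needs; the model is injected.
public class StudentCommandService
{
    private readonly IStudentCommands commands;

    public StudentCommandService(IStudentCommands commands) { this.commands = commands; }

    public void RenameStudent(int studentId, string newName)
    {
        commands.Rename(studentId, newName);
    }
}

public class StudentQueryService
{
    private readonly IStudentQueries queries;

    public StudentQueryService(IStudentQueries queries) { this.queries = queries; }

    public string FindStudentName(int studentId)
    {
        return queries.GetName(studentId);
    }
}
```

Because the query service only sees IStudentQueries, it cannot change the model even though it shares the same instance with the command service.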



Separate Models for Commands and Queries
As we discussed above, although initially it might seem that we need the same model for both, studying the process in more detail often reveals that different domain experts are only using similar terminology; they are actually referring to different concepts. To design these different concepts we can build altogether different models. This is the same approach as Separate Ways in domain-driven design. It is also possible that both models inherit from a Shared Kernel model for the overlapping concepts and then develop their specific implementations on top of it. Doing that would save us from the difficulties of planning for changes in the shared kernel. Any changes in the domain model can be planned separately by both teams. These changes should result in model updates that provide a correct and complete view of the system.



Scaling Command and Query Services
Depending on the particular system we are trying to model, there might be different scaling requirements for commands and queries. Command operations may be more frequent than queries, or the other way round. To handle this properly, we can deploy command and query services separately. If it is a monitoring application, we will definitely be getting more status updates (commands) than queries about the state of the things being monitored. The separate deployments can very well use the same model if that fits the requirement. In this case, since they are separate applications, we need to find a way to share the model between them. Historically, such libraries were shared through a source control repository. While building these applications, we can use the most recent version of the library. This sharing can be automated with NuGet packages as well.



Or they can use completely different models, like Separate Ways bounded contexts. In our example of a monitoring application, the query service would mostly serve management information system reports. It might also feed a complex event processing system designed specifically to ensure immediate action, as in a hospital's intensive care units, which require an immediate response to these events.



Separate Reporting Databases
The data generated by commands is more transactional in nature. We might not need the same granularity for query services. Queries often require more aggregated data, which can be denormalized. It is possible to create a separate database for reporting purposes, where we hold pre-processed data for data warehousing requirements. The process can run overnight in batches. This is generally read-only data with a shorter historical view for faster queries. The data can be in the form of an OLAP cube for easier slicing and dicing.



Related Patterns
There are other related patterns which are generally discussed in the same texts as CQRS. We will try to discuss them in a future post. They are as follows:
  • Event Sourcing
  • Process Manager
  • Eventual Consistency
Further Readings
http://cqrsjourney.github.com
http://pundit.cloudapp.net
http://www.cqrsinfo.com/category/programming/
http://martinfowler.com/bliki/CQRS.html
http://cqrs.files.wordpress.com/2010/11/cqrs_documents.pdf

Visual Studio Magazine Article On PCL

I have recently collected and summarized my thoughts into an article on Visual Studio Magazine. You can find the article here: