Wednesday, May 29, 2013

Processing Pipelines with TPL Dataflow

Pipelining is a well-known design pattern. It is used when a stream of data elements is processed through a series of pre-determined steps, where the output of one step serves as the input of the next. These steps might also be conditionally linked. The concept is similar to an assembly line in a factory, where raw material is continuously fed in and goes through a series of steps towards a finished product. It must be remembered that pipelining has no effect on the processing time of a single element (responsiveness), but it can dramatically improve the throughput of the overall system.

Let's consider a simple dataflow pipeline where a data element is passed through square, offset and doubling stages. The output of each stage is used as the input of the following stage. A list of elements is passed through these stages, producing an output list of processed elements.



In order to abstract the implementation details, let us introduce a processor abstraction. We will provide different implementations of this processor and discuss the details of each.

Linear Processor
This is the simplest implementation of the processing pipeline. It divides the processing steps into separate methods. We apply the same series of operations to all elements in a loop and keep adding the resulting elements to a separate collection. When the loop finishes, we have our desired results in the collection, which is then returned.
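The following is a minimal sketch of what such a linear processor could look like. The IProcessor interface, the int element type, the helper names and the offset amount are all assumptions for illustration, not code from the original post.

using System.Collections.Generic;

public interface IProcessor
{
    IEnumerable<int> Process(IEnumerable<int> input);
}

public class LinearProcessor : IProcessor
{
    public IEnumerable<int> Process(IEnumerable<int> input)
    {
        var results = new List<int>();
        foreach (var element in input)
        {
            // Each element passes through all three stages in sequence.
            results.Add(Double(Offset(Square(element))));
        }
        return results;
    }

    private static int Square(int value) { return value * value; }
    private static int Offset(int value) { return value + 10; }
    private static int Double(int value) { return value * 2; }
}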

Although this is the simplest and easiest of all dataflow pipeline implementations, it doesn't make use of any parallelism. The time to process all elements in the pipeline is linearly proportional to the size of the input collection. So if we assign x, y and z time units to the square, offset and double operations respectively, then processing n elements requires n(x + y + z) time units.

For more complex steps, we might want to refactor the code by extracting the implementation of these operations into separate types. This would make unit testing the steps a lot easier.

Parallel Loop Processor
Based on the recommendations from Patterns & Practices about the Parallel Loop pattern, it is suitable when we need to apply independent operations to a list of elements. This fits our requirement, where we need to apply a series of independent operations (Square, Offset and Double) to a list of elements. For simplicity's sake, we are not discussing cancellation and exception handling for these loops. Here we have also made use of System.Collections.Concurrent.ConcurrentBag<T>. This type was introduced in .NET Framework 4.0 and is one of the implementations of IProducerConsumerCollection<T>, also introduced in the same framework version. ConcurrentBag<T> introduces shared state, but it is a thread-safe type.
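A minimal sketch of a parallel loop processor, reusing the hypothetical IProcessor interface from the linear sketch and inlining the three stages, could look like this:

using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public class ParallelLoopProcessor : IProcessor
{
    public IEnumerable<int> Process(IEnumerable<int> input)
    {
        // ConcurrentBag<T> is thread safe, but it does not preserve the input order.
        var results = new ConcurrentBag<int>();

        Parallel.ForEach(input, element =>
        {
            var squared = element * element;   // square
            var offset = squared + 10;         // offset (amount assumed)
            results.Add(offset * 2);           // double
        });

        return results;
    }
}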

The only caveat is actually a feature of the Parallel Loop pattern: we cannot determine the order in which elements are processed. We can verify that by looking at the data returned from the Process() method. In this case, the requirement seems to be that the order of the results should match the input collection, so the result from this processor would be logically wrong.

Pipeline Processor
As we discussed above, we cannot use the Parallel Loop pattern when the order of elements in the result matters. The Patterns & Practices team recommends another pattern for such requirements, called Pipelining. The pattern is used in situations where the elements of a collection pass through a series of steps, where the output of one step serves as the input to the next. This pattern improves throughput on multi-core machines by distributing the steps across multiple cores.



Pipelining requires us to update the processing steps to use producer / consumer based collections. Each step produces elements which are then consumed by the next step in the pipeline. Producer / consumer based collections make each step independent of the immediately preceding and following steps. The time to push a list of elements through a pipeline is dominated by the slowest step in the pipeline.

In the following example, we have used BlockingCollection<T>, which wraps an IProducerConsumerCollection<T> and was also introduced in .NET Framework 4.0. It provides an ideal implementation for producer / consumer scenarios. Consumers can wait on the collection using its GetConsumingEnumerable() method; as elements are added to the collection, they are enumerated by the consumers. The collection also supports notifying consumers that production is complete, after which no more elements can be added; once the existing elements have been enumerated, the enumeration ends. It is a very common bug to forget calling GetConsumingEnumerable() and instead iterate the collection directly. Even I made the same mistake when writing the code below. In that case, the loop finishes as soon as there are no elements currently in the collection.

When we finish processing all elements of the input collection, we can notify consumers by calling the CompleteAdding() method of BlockingCollection<T>. Consumers waiting on the collection finish processing the remaining elements and complete their iteration. In a processing pipeline, each step needs to call the same method on its output collection, since the next step is waiting for elements in that collection. In this way, the completion signal is passed along the processing pipeline. We can then return the resulting collection to the caller, but first we need to wait until the last step in the pipeline finishes. Since all the steps are implemented as tasks, we can simply call Wait() on the last task in the pipeline.
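A minimal sketch of such a pipeline processor, again reusing the hypothetical IProcessor interface and the assumed stage arithmetic, could look like this:

using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class PipelineProcessor : IProcessor
{
    public IEnumerable<int> Process(IEnumerable<int> input)
    {
        var squared = new BlockingCollection<int>();
        var offsetted = new BlockingCollection<int>();
        var doubled = new BlockingCollection<int>();

        Task.Run(() =>
        {
            foreach (var element in input) squared.Add(element * element);
            squared.CompleteAdding();                    // signal the next stage
        });

        Task.Run(() =>
        {
            foreach (var element in squared.GetConsumingEnumerable()) offsetted.Add(element + 10);
            offsetted.CompleteAdding();
        });

        var lastStage = Task.Run(() =>
        {
            foreach (var element in offsetted.GetConsumingEnumerable()) doubled.Add(element * 2);
            doubled.CompleteAdding();
        });

        lastStage.Wait();                                // wait for the last stage to finish
        return doubled.ToList();
    }
}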

Dataflow Block Processor
Microsoft introduced TPL Dataflow to make it easier to construct dataflow pipelines and networks. It hasn't been released as part of the .NET Framework yet, but you can get it as a NuGet package.



After installing the package you should see the following reference being added to the assembly references section.



There are three main interfaces in the TPL Dataflow library: ISourceBlock<TOutput>, ITargetBlock<TInput> and IPropagatorBlock<TInput, TOutput>. All blocks implement IDataflowBlock. This becomes very useful when chaining different blocks to construct a processing pipeline. Any source block can be linked to another dataflow block. When multiple target blocks are linked, a message is routed to whichever target accepts it first, which matches the producer / consumer scenario. The library also provides BroadcastBlock<T>, which offers a message to all linked targets.
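As a rough sketch (the block types and messages here are purely illustrative, not from the original post), the difference between first-come routing and broadcasting looks like this:

using System;
using System.Threading.Tasks.Dataflow;

public static class LinkingDemo
{
    public static void Run()
    {
        // A BufferBlock offers each message to its linked targets in order;
        // only one target receives any given message.
        var buffer = new BufferBlock<int>();
        buffer.LinkTo(new ActionBlock<int>(x => Console.WriteLine("Worker A got " + x)));
        buffer.LinkTo(new ActionBlock<int>(x => Console.WriteLine("Worker B got " + x)));

        // A BroadcastBlock offers a copy of each message to every linked target.
        var broadcast = new BroadcastBlock<int>(x => x);
        broadcast.LinkTo(new ActionBlock<int>(x => Console.WriteLine("Subscriber A got " + x)));
        broadcast.LinkTo(new ActionBlock<int>(x => Console.WriteLine("Subscriber B got " + x)));
    }
}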



From a coding perspective, the Dataflow block based approach is a middle ground between the Linear and Pipeline processors discussed above. We don't need to create separate tasks manually or use a BlockingCollection, so it is nearly as simple as the linear processor. It supports the producer / consumer scenario through the blocks' internal input and output queues, which makes pipelining possible. We can implement our dataflow pipeline as in the following diagram:



We need a consolidating block at the end to keep adding the results to a collection; when the processing pipeline finishes execution, it simply returns the consolidated results. Source and propagator blocks are a lot like observable collections, where the next element is pushed to the recipient without the recipient needing to ask for it.
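A minimal sketch of such a processor, reusing the hypothetical IProcessor interface from earlier and assuming TransformBlock stages, an ActionBlock for consolidation and completion propagation through DataflowLinkOptions, could look like this:

using System.Collections.Generic;
using System.Threading.Tasks.Dataflow;

public class DataflowBlockProcessor : IProcessor
{
    private readonly TransformBlock<int, int> squareBlock;
    private readonly ActionBlock<int> consolidateBlock;
    private readonly List<int> results = new List<int>();

    public DataflowBlockProcessor()
    {
        // Construct the pipeline: square -> offset -> double -> consolidate.
        squareBlock = new TransformBlock<int, int>(x => x * x);
        var offsetBlock = new TransformBlock<int, int>(x => x + 10);
        var doubleBlock = new TransformBlock<int, int>(x => x * 2);

        // ActionBlock processes one message at a time by default, so adding to the list is safe here.
        consolidateBlock = new ActionBlock<int>(x => results.Add(x));

        var linkOptions = new DataflowLinkOptions { PropagateCompletion = true };
        squareBlock.LinkTo(offsetBlock, linkOptions);
        offsetBlock.LinkTo(doubleBlock, linkOptions);
        doubleBlock.LinkTo(consolidateBlock, linkOptions);
    }

    public IEnumerable<int> Process(IEnumerable<int> input)
    {
        foreach (var element in input)
        {
            squareBlock.Post(element);
        }

        // Signal completion and let it propagate through the linked blocks.
        squareBlock.Complete();
        consolidateBlock.Completion.Wait();
        return results;
    }
}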

In the above code, we are constructing the whole pipeline in the constructor of the processor. If the data flow blocks are pure in nature, we can even do that in a static constructor keeping the blocks as class members. As depicted in the previous image, we have created and linked four blocks to square, offset, double and consolidate the results.

We also need to chain the completion signal. As the first block receives the completion signal, it is propagated down the dataflow pipeline. Calling Complete() on a dataflow block means that no more messages are expected. After processing the elements already in its queue, the block passes the signal on to the next block in the pipeline (if configured to do so). At the end, we wait for the last block in the pipeline to finish processing by using IDataflowBlock.Completion, and then return the results.

Dataflow Blocks also support handling exceptions and cancellation during the block execution. Let's keep that for a later discussion.

Download

Monday, May 20, 2013

Writable Property Assertions using AutoFixture.Idioms

AutoFixture.Idioms allows us to write unit tests that tend to follow common templates; Mark Seemann calls this idiomatic unit testing. Previously we discussed testing guard clauses using guard assertions. In this post we are going to discuss testing writable properties. Unit testing property assignments is a big part of our unit tests. AutoFixture.Idioms provides WritablePropertyAssertion to test all such properties in a single unit test. This minimizes the unit testing effort and keeps the unit tests less brittle.

You would need to add the NuGet packages for xUnit, AutoFixture and AutoFixture.Idioms as described in the previous post. Let's add a Student class to our project. It has two writable properties, StudentId and StudentName.
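The class could be as simple as the following sketch (the property types are assumptions):

public class Student
{
    public int StudentId { get; set; }
    public string StudentName { get; set; }
}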

You might be thinking of writing a separate test for each writable property of Student. That becomes more tedious as the number of such properties grows. With AutoFixture, we just need one unit test to verify that the properties are correctly assigned and reflect the assigned value afterwards. Just see how simple the unit test has become.
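A minimal sketch of such a test, using WritablePropertyAssertion with an xUnit [Fact], could look like this (the test name is an assumption):

using Ploeh.AutoFixture;
using Ploeh.AutoFixture.Idioms;
using Xunit;

public class StudentTests
{
    [Fact]
    public void WritableProperties_RoundTripAssignedValues()
    {
        var fixture = new Fixture();
        var assertion = new WritablePropertyAssertion(fixture);

        // Verifies that every writable property of Student returns the value assigned to it.
        assertion.Verify(typeof(Student));
    }
}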

Let's assume we have a defect in our code which causes a property not to be assigned properly. Here StudentId always returns a fixed value no matter what value is assigned to it. This is definitely a defect. Let's update the code as follows:

The unit test should catch such defects and report them. When we run the unit test now, it does fail with the following details:

Sunday, May 19, 2013

Read only Collections in .net framework 4.5

In this post we are going to discuss some new interfaces implemented by ReadOnlyCollection<T> in .NET Framework 4.5, and how they can help us. If you look at the definition of ReadOnlyCollection<T> in the framework, you will notice these additional interfaces being implemented. We discussed ReadOnlyCollection<T> in our earlier discussion about purely functional data structures. The updated definition of ReadOnlyCollection<T> in .NET Framework 4.5 is as follows:



You might have seen ReadOnlyCollection<T> as the return type of methods. It is used to provide a read-only wrapper around an IList<T>. Although it is not thread safe if the underlying collection is being continuously updated, it makes it impossible for the outside scope to tamper with the collection by adding or removing elements, as these operations cause exceptions to be thrown. You can also use AsReadOnly(), provided by List<T>, to the same effect.

In order to return a read-only view of the list, we need to create a new instance of ReadOnlyCollection<Student>. This makes sure that no external update can be made to the collection. To fulfill a similar requirement in .NET Framework 4.5, we just need to use IReadOnlyCollection<Student> as the return type of the method. Although the caller of this method could cast the result back to the generic List type and update it, the intent of the method is very clear: the collection is returned for consumption only. There would, however, be no exceptions for such operations.
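A minimal sketch of the two approaches, assuming a hypothetical StudentRepository holding a List<Student> and the Student class shown later in the post, could look like this:

using System.Collections.Generic;
using System.Collections.ObjectModel;

public class StudentRepository
{
    private readonly List<Student> students = new List<Student>();

    // .NET 4.0 style: wrap the list so that Add / Remove on the returned view throw.
    public ReadOnlyCollection<Student> GetStudentsWrapped()
    {
        return new ReadOnlyCollection<Student>(students);
    }

    // .NET 4.5 style: express the read-only intent through the return type.
    // A cast back to List<Student> would still allow updates, but the intent is clear.
    public IReadOnlyCollection<Student> GetStudents()
    {
        return students;
    }
}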

Basically, List<T> has an updated definition in .NET Framework 4.5: it implements some additional interfaces, one of which is IReadOnlyCollection<T>. Is it semantically correct to implement such an interface when the type allows adding and removing items? Don't worry; it is interface implementation we are talking about here, not inheritance. Implementing an interface just states what a type "can do". Let's not confuse interface implementation with inheritance. List<T> can be used in all places where a reference of the interface type is used, including method parameters and return types, hence the Liskov Substitution Principle still holds true.



Here Student is a simple class with two properties, StudentId and StudentName; the class exposes observational immutability. The definition of Student can be as follows:

Other Related Interfaces
In the above discussion, we can notice that List<T> also implements IReadOnlyList<T>. .NET Framework 4.5 also includes some other interfaces, including IReadOnlyList<T> and IReadOnlyDictionary<TKey, TValue>. The former adds indexer support to IReadOnlyCollection<T>, and the latter can be used to provide a read-only view of a dictionary.



You can verify that the definition of Dictionary<TKey, TValue> has also been updated to implement IReadOnlyDictionary<TKey, TValue>.



Covariance & New Interface Types
IReadOnlyCollection<T> and IReadOnlyList<T> are covariant collection interfaces. As you might remember, covariance and contravariance support for generic interfaces was introduced in .NET Framework 4.0. We know that a type's instance can be assigned to a reference of its base type or an implemented interface type, based on polymorphism. Covariance additionally allows a collection of a derived type to be assigned to a reference to a collection of its base type. This is declared with the out keyword on the generic parameter.

In the above code, GetStudents() is supposed to return a collection of Student, but it actually returns one of GraduateStudent, where GraduateStudent is a sub-type of Student. This is especially useful for API implementations where a sub-type needs to return a collection of a more derived type argument. It generally comes as a surprise while implementing child classes. The definition of GraduateStudent can be as follows:
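A minimal sketch of how this could look, assuming Student has a parameterless constructor and GraduateStudent simply extends it, is as follows:

using System.Collections.Generic;

public class GraduateStudent : Student
{
    // Assumed: inherits StudentId and StudentName from Student.
}

public class GraduateSchool
{
    // Covariance: IReadOnlyList<out T> allows a List<GraduateStudent>
    // to be returned where an IReadOnlyList<Student> is expected.
    public IReadOnlyList<Student> GetStudents()
    {
        return new List<GraduateStudent> { new GraduateStudent(), new GraduateStudent() };
    }
}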

Download Code

Friday, May 17, 2013

Arrows, Guard Clauses & Assertions using AutoFixture.Idioms

Procedural languages introduced functions to provide re-usability for a block of code. Object oriented programming languages inherited the idea and allowed object behaviors to be defined in terms of methods. These methods are called: after doing some housekeeping work, the runtime jumps to the callee and executes its code, and when it finishes execution, it brings control back to the caller. This call model enables the use of try / catch and thread synchronization constructs around any subsequent calls; control has to come back to the caller after executing the callee. Programming languages also support other ways of executing a piece of code. We say that a catch block is executed, not called, because control does not go back to the point where the exception was thrown; likewise, exceptions are thrown. There are other scenarios as well, e.g. events are fired, which causes all subscribed event handlers to be called.

In order to pass control back to the caller, the callee can explicitly use a return statement. There is an implicit return at the end of a method with a void return type that has no explicit return statement. For a method with a non-void return type, the return statement must be explicit.

Single Entry / Exit Points
Structured programming suggests that we should have only one return path in a function. This leads towards a single return statement per method; return statements are also often referred to as exit points. Based on this idea, a method should always have a single entry and a single exit point. Single entry and exit points add readability, making it very easy to follow through the method and determine its intent and responsibility.

Some developers claim multiple return statements to be bad coding style and a coding risk. They also make it very difficult to refactor the code, especially when we need to extract some code into a new method from an existing convoluted method with multiple return statements. It is also a maintenance nightmare to touch such code to fix any upcoming issue. Additionally, it is not easy to unit test such code and cover all cases without using a code coverage tool alongside your unit tests, in order to avoid false positives.

Based on this, it seems apparent that multiple return statements are really evil and we should avoid them when writing our code. Bob Martin recommends the Boy Scout principle for code changes [you should leave the code cleaner than you found it], so this can be part of our cleanup exercise: change all methods with multiple return statements to single ones. This should cause sanity to prevail. It also allows us to add any necessary cleanup code before the only return statement in the method.

Arrow antipattern
Applying the single-exit rule religiously leads us into another problem. In order to avoid multiple return statements, we tend to add multiple if statements in a nested fashion. This gets worse if we have multiple parameters and need to check the input arguments for preconditions. This situation is generally referred to as the arrow antipattern. The code appears as follows:
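As a rough illustration (the method, its parameters and the parsing logic are invented here, not taken from the original post), arrow-shaped code looks like this:

public static class ArrowExample
{
    public static int ParseAndOffset(string input, int offset)
    {
        var result = 0;
        if (input != null)
        {
            if (input.Length > 0)
            {
                int value;
                if (int.TryParse(input, out value))
                {
                    // The real work ends up pushed further and further to the right.
                    result = value + offset;
                }
            }
        }
        return result;
    }
}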



The arrow antipattern reduces readability because of the ever-increasing indentation. It also increases the complexity of the code by introducing various code flow paths, which makes unit testing a hell. This page discusses the arrow antipattern and possible refactorings to fix it: http://www.codinghorror.com/blog/2006/01/flattening-arrow-code.html

Guard Clauses
Fowler recommends using guard clauses for all special cases. His recommendation applies to conditional behavior in a method which doesn't have a very clear normal path of execution. Guard clauses can also be used to check preconditions for a method; a failed precondition generally results in an exception.
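Flattening the invented example above with guard clauses could look like this:

using System;

public static class GuardClauseExample
{
    public static int ParseAndOffset(string input, int offset)
    {
        // Guard clauses: handle the special cases first and exit early.
        if (input == null)
            throw new ArgumentNullException("input");
        if (input.Length == 0)
            throw new ArgumentException("input must not be empty", "input");

        int value;
        if (!int.TryParse(input, out value))
            throw new ArgumentException("input must be numeric", "input");

        // The normal path now reads without any nesting.
        return value + offset;
    }
}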

Constructors and Dependency Injection
Object oriented languages provide constructors for instantiating a type. They are part of the class' metadata and cannot be invoked like a regular method. Mark Seemann suggests constructor injection as the preferred choice in most cases. This requires all dependencies to be specified as constructor arguments.
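As a sketch of what is being discussed, consider an AuthenticationService (the service type referred to later in this post) with two injected dependencies; the dependency interfaces IMembershipProvider and ILogger are invented for illustration:

using System;

public interface IAuthenticationService { }
public interface IMembershipProvider { }
public interface ILogger { }

public class AuthenticationService : IAuthenticationService
{
    private readonly IMembershipProvider membershipProvider;
    private readonly ILogger logger;

    public AuthenticationService(IMembershipProvider membershipProvider, ILogger logger)
    {
        // Guard clauses: the dependencies are preconditions for the object's inner working.
        if (membershipProvider == null)
            throw new ArgumentNullException("membershipProvider");
        if (logger == null)
            throw new ArgumentNullException("logger");

        this.membershipProvider = membershipProvider;
        this.logger = logger;
    }
}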

Testing the dependencies in guard clauses is necessary, as these dependencies are preconditions for the object's inner working. We need to check all of these dependencies, because missing any of them should fail the instantiation of the type under consideration. In order to verify these guard clauses, we need unit tests. In the above case, we need two unit tests to verify the dependencies if we follow the one-assert-per-test rule. Increasing the number of dependencies would need even more tests.

I have seen people getting more excited and checking combinations of these dependencies, but that seems too much, as there is generally a unique guard clause per dependency. In the code below, we are trying to test the above construction logic: setting any constructor parameter to null should result in an exception. These tests are also very brittle, as introducing a new dependency requires updating all of them.

Here we are using xUnit for unit testing and Rhino Mocks for mocking the dependencies.
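With the hypothetical dependencies sketched above, such hand-written tests could look like the following:

using System;
using Rhino.Mocks;
using Xunit;

public class AuthenticationServiceConstructorTests
{
    [Fact]
    public void Constructor_WithNullMembershipProvider_Throws()
    {
        var logger = MockRepository.GenerateStub<ILogger>();
        Assert.Throws<ArgumentNullException>(() => new AuthenticationService(null, logger));
    }

    [Fact]
    public void Constructor_WithNullLogger_Throws()
    {
        var membershipProvider = MockRepository.GenerateStub<IMembershipProvider>();
        Assert.Throws<ArgumentNullException>(() => new AuthenticationService(membershipProvider, null));
    }
}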

Code is a liability
Unit tests should follow the DRY [Don't Repeat Yourself] principle of Extreme Programming, and their names should follow DAMP (Descriptive And Meaningful Phrases). Introducing any new type involves this arduous work of constructor tests. Since we don't introduce a type because it has a fancy constructor, but for the logic it implements, the effort of writing these tests takes useful time away from testing that logic.

Mark Seemann has provided AutoFixture.Idioms for unit tests which tend to follow such common templates. It has a dependency on the AutoFixture NuGet package. Let's install these packages into our test project. We have also installed the xUnit package as the unit testing library.



AutoFixture.Idioms provides GuardClauseAssertion to verify that an exception is thrown if a dependency is missing. In the following code we are trying to assert the guard clauses for all the dependencies of all the constructors.
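A sketch of such a test, against the hypothetical AuthenticationService from earlier, could look like this:

using Ploeh.AutoFixture;
using Ploeh.AutoFixture.Idioms;
using Xunit;

public class AuthenticationServiceGuardTests
{
    [Fact]
    public void Constructors_GuardAgainstNullDependencies()
    {
        var fixture = new Fixture();
        var assertion = new GuardClauseAssertion(fixture);

        // Verifies that every constructor throws when any of its arguments is null.
        assertion.Verify(typeof(AuthenticationService).GetConstructors());
    }
}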

But to test these guard clauses, it needs to pass the required dependencies as constructor arguments, and the dependencies cannot be instantiated because they are interface types. Running the above test fails with the following message:



AutoFixture provides additional extension packages to support mocking libraries including Moq, FakeItEasy and Rhino Mocks. Let's install the package which uses Rhino Mocks; obviously this package has an additional dependency on Rhino Mocks. This package is called AutoFixture.AutoRhinoMocks.



After installing the above package, we can add a customization to the Fixture so that Rhino Mocks is used for mocking the required dependencies. The test now fails only if a guard clause is missing from any constructor of the specified type. This approach reduces the effort of unit testing purely defensive code, and we can concentrate on the real application logic.
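The only change to the earlier sketch is the customization; assuming the AutoRhinoMockCustomization shipped with the package, the test could look like this:

using Ploeh.AutoFixture;
using Ploeh.AutoFixture.AutoRhinoMock;
using Ploeh.AutoFixture.Idioms;
using Xunit;

public class AuthenticationServiceGuardTests
{
    [Fact]
    public void Constructors_GuardAgainstNullDependencies()
    {
        // Rhino Mocks now supplies fake instances for the interface-typed dependencies.
        var fixture = new Fixture().Customize(new AutoRhinoMockCustomization());
        var assertion = new GuardClauseAssertion(fixture);

        assertion.Verify(typeof(AuthenticationService).GetConstructors());
    }
}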

Testing Guard Clauses for Methods
Methods can also have guard clauses to check their arguments, and we can use AutoFixture.Idioms to assert these clauses as well. Let's add an additional behavior to the IAuthenticationService interface to support adding an authentication policy, and add the same method to the AuthenticationService implementation. The method has just one parameter, of type IAuthenticationPolicy.
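A sketch of the corresponding test could look like the following; the method name AddAuthenticationPolicy is an assumption, and its implementation is only indicated in the comment:

// Inside AuthenticationService (and declared on IAuthenticationService):
//
//     public void AddAuthenticationPolicy(IAuthenticationPolicy policy)
//     {
//         if (policy == null)
//             throw new ArgumentNullException("policy");
//         ...
//     }

using Ploeh.AutoFixture;
using Ploeh.AutoFixture.AutoRhinoMock;
using Ploeh.AutoFixture.Idioms;
using Xunit;

public interface IAuthenticationPolicy { }

public class AuthenticationServiceMethodGuardTests
{
    [Fact]
    public void AddAuthenticationPolicy_GuardsItsArgument()
    {
        var fixture = new Fixture().Customize(new AutoRhinoMockCustomization());
        var assertion = new GuardClauseAssertion(fixture);

        // Verify just the newly added method rather than the whole type.
        assertion.Verify(typeof(AuthenticationService).GetMethod("AddAuthenticationPolicy"));
    }
}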

If we are missing the guard clause for null-checking the provided argument, the test fails. It shows up as follows in the unit testing session; here we are using the Resharper support for xUnit to run the tests and show the results.



Adding a guard clause which results in an exception satisfies the expectation, and the test passes. It is shown as green in the testing session as follows:



In order to run the above tests, we needed to add a few NuGet packages. The names and dependencies of these packages can be checked using the Package Visualizer.



Why is this a great achievement?
Just being able to run this test is a big achievement for the software design and development world. Let me tell you what is happening here. This test combines a test runner, an assertion library and mocking support, each from a different provider. We are using xUnit to run our tests, AutoFixture.Idioms is helping us assert the guard clause checks, and in order to create mocks, AutoFixture is using Rhino Mocks. So we have the test runner, assertion and auto-mocking libraries from different providers, and together they help us achieve this. Isn't that great?

Download Code

Friday, May 10, 2013

Command Query Responsibility Segregation [ CQRS ] - An Introduction

Bertrand Meyer introduced the idea of CQS [Command Query Separation] in his famous book Object Oriented Software Construction. He discussed the responsibilities of the methods of an object: each method should either be a command or a query, not both. As described here, asking a question should not change the answer; it should just give us the answer back. If you remember, a few months back we discussed the side-effect free, purely functional approach for designing methods. CQS helps us design our types and their behaviors. We divide our methods into two categories: ones which modify state (commands) and others which give data back (queries). Queries are free of side effects. Martin Fowler has also discussed the approach and its limitations for certain scenarios, but it is generally considered a principle and is widely applicable.
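As a small illustration (a hypothetical account type, not from the original post), a CQS-friendly design keeps state-changing commands apart from side-effect free queries:

public class Account
{
    private decimal balance;

    // Command: changes state, returns nothing.
    public void Deposit(decimal amount)
    {
        balance += amount;
    }

    // Query: returns data, causes no side effects.
    public decimal GetBalance()
    {
        return balance;
    }
}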

CQRS takes CQS to the next level and applies it to the whole domain. It is the abbreviation of Command Query Responsibility Segregation. Some people call it a pattern, others call it an architectural style. I think we can leave the classification for some other time, but there are some scholarly discussions [Distributed Object Computing Group - Washington University] which can help us identify it as one. Based on the point that it doesn't have a documented solution, I am more inclined towards classifying it as an architectural style.

When to use CQRS?
The name was coined by Udi Dahan and Greg Young. CQRS is applicable to collaborative domains where a large set of users works on a small set of data; the users interact with each other only through this small set of data. There is no immediate feedback from the system, as a command isn't necessarily processed right away. We need durable handling of these commands, supporting persistence to accommodate system failures; a command shouldn't be discarded until it has been completely processed. There must also be a mechanism for handling duplicate commands.

Like any other architectural style, CQRS is not a silver bullet. Adopting CQRS deprives users of immediate feedback, so it should be applied very carefully. Dividing the system into commands and queries makes it highly scalable and also improves maintainability.

The concepts of CQRS are simple, but the change in mindset is the most difficult part. It always helps to come from a DDD mindset; this opens your mind to designing in terms of the domain as a whole. It should also help you understand a lot of the text and material put together by CQRS gurus, where DDD is referenced as a pre-requisite. You can also use CQRS at the component level instead of the whole domain, so it doesn't have to be a top-level architecture.

Why Segregate Commands and Queries?
I have just referred to the Interface Segregation Principle [ISP] as an example of segregation. We could equally use the Single Responsibility Principle [SRP], viewing commands and queries as separate responsibilities. See how well SOLID fits into every design and architecture discussion?? :) Let's take a step by step approach to understanding CQRS, similar to our discussion about Windows Identity Foundation where we argued why it is better to design security around claims.

As we discussed above, CQRS is different from CQS, but the definitions of commands and queries stay the same: commands change the state of the system, and queries read the state of the system without causing any side effects.

Significance of Data
Data plays a significant role in an enterprise's everyday business needs. Modern organizations use software systems to make data management easier. Software systems are designed to satisfy the business needs of our customers. The organizations which can efficiently create the data they need, and have the ability to process it, become successful, as they are able to make the right decisions at the right time. Yes, knowledge is no longer just a virtue, it is power now.

These systems have a client-facing front-end. You can read the MVVM Survival Guide to further study designing client applications using the MVVM architectural style. The front-end makes calls to a back-end system. For simple applications, this back-end might just be a database management system; in that case, client applications make the necessary calls in the supported query language [SQL]. They might also use DBMS-specific database programming features, including stored procedures and functions. The choice of DBMS is really important, as data has the highest priority in any system.



Enterprise Operations are the key
These systems work perfectly for simple applications involving CRUD [Create, Read, Update, Delete] operations. But organizations have complex business processes which require the interaction of multiple systems. These business processes have complex workflow and security requirements which don't quite fit this client / DBMS based approach. Enterprise applications are designed for performing operations by authorized users, so it is better to see these systems as a collection of operations and use cases. Back in my procedural programming days I remember saying that a program is a set of functions. It is these operations for which we design our applications, so our design must be focused on and built around them. Remember seeing a litany of applications back in the earlier days of Information Technology? The moment we get our heads away from the CRUD approach and start thinking about enterprise operations, we start adding real value to the enterprise application infrastructure. It is a paradigm shift and involves a lot of unlearning, which is the most difficult part. This is the core of domain driven design.



Classification of Operations
Building on the idea of command and query separation, we can easily divide enterprise operations into commands and queries. They can be provided to client applications as separate command and query services. Dividing our application into a set of commands enables us to provide a task-based implementation. In DDD language, this is the Shared Kernel approach to context mapping, if we consider commands and queries as separate bounded contexts. Both teams must agree on any changes to this shared model, and there should be very strict continuous integration based tests to ensure that. Any breaking change required by one side should be planned so that the other team can make the necessary adjustments to accommodate it.



Mostly we have different requirements for command and query models. Command models are less forgiving, with strict validation requirements. Query models, on the other hand, are more denormalized; for reporting purposes they also generally have built-in caching mechanisms, which is not very common for command models.

Command and Query Interfaces for Same Model
[Fowler] has discussed the simplest implementation: providing separate command and query interfaces, with the model types implementing both. You might have seen the same business objects library shared between different projects and applications; those are examples of this implementation. I have seen this often enough, and it works in most situations. This style is even more useful with Dependency Injection, where the concrete command services depend on the command interfaces, the query services depend on the query interfaces, and the actual model types are injected.
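A minimal sketch of this style, with invented order-related names, might look like the following, where a single model type implements both a command-side and a query-side interface:

public interface IOrderCommands
{
    void PlaceOrder(int productId, int quantity);
    void CancelOrder(int orderId);
}

public interface IOrderQueries
{
    decimal GetOrderTotal(int orderId);
}

// The same model type implements both interfaces. Command services are
// injected with IOrderCommands, query services with IOrderQueries.
public class OrderModel : IOrderCommands, IOrderQueries
{
    public void PlaceOrder(int productId, int quantity) { /* change state */ }
    public void CancelOrder(int orderId) { /* change state */ }
    public decimal GetOrderTotal(int orderId) { return 0m; /* read state only */ }
}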



Separate Models for Commands and Queries
As we discussed above, although it might initially seem that we need the same model for both, studying the process in more detail usually suggests that different domain experts are merely using similar terminology while actually referring to different concepts. In order to capture these different concepts, we can build altogether different models. This is the same approach as Separate Ways in Domain Driven Design. It is also possible for both models to inherit from a Shared Kernel based model for the overlapping concepts and then develop their specific implementations on top of that. Doing that would save us from the difficulties of planning changes in the shared kernel; any changes in the domain model can be planned separately by both teams. These changes should result in model updates that keep providing a correct and complete view of the system.



Scaling Command and Query Services
Based on the particular system we are trying to model, there might be different scaling requirements for commands and queries; it is possible that command operations are more frequent, or the other way around. In order to handle them properly, we can deploy the command and query services separately. If it is a monitoring application, we will definitely get more status updates (commands) than queries to determine the state of the things being monitored. The separate deployments can very well use the same model if that fits the requirement. In this case, since they are separate applications, we need to find a way to share the model between them. Historically, such libraries are shared through a source control repository, and each application is built against the most recent version of the library. This sharing can also be automated with NuGet packages.



Or they can use completely different models, like Separate Ways bounded contexts. In our example of a monitoring application, the query service would mostly produce management information system reports, while a complex event processing system might be designed specifically to ensure immediate action, for example in the intensive care units of a hospital which require an immediate response to these events.



Separate Reporting Databases
The data generated by commands is more transactional in nature. We might not need the same granularity for the query services; queries often require more aggregated data, which can be denormalized. It is possible to create a separate database for reporting purposes, where we hold pre-processed data for data warehousing requirements. The process can run overnight in batches. This is generally read-only data, with a limited historical view for faster queries. The data can be kept in OLAP cubes for easier slicing and dicing.



Related Patterns
There are other related patterns which are generally discussed in the same texts as CQRS. We will try to discuss them in a future post. They are as follows:
  • Event Sourcing
  • Process Manager
  • Eventual Consistency
Further Readings
http://cqrsjourney.github.com
http://pundit.cloudapp.net
http://www.cqrsinfo.com/category/programming/
http://martinfowler.com/bliki/CQRS.html
http://cqrs.files.wordpress.com/2010/11/cqrs_documents.pdf

Visual Studio Magazine Article On PCL

I have recently collected and summarized my thoughts into an article on Visual Studio Magazine. You can find the article here:

Tuesday, May 7, 2013

dotPeek peeks Nuget packages now!

JetBrains announced a new plugin for dotPeek today. It provides support for peeking into NuGet packages without adding them to a project. It allows us to source packages from the chosen NuGet repository; we can search for the appropriate package and open it in the Assembly Explorer.



It is also possible to peek into all the NuGet packages referenced in a packages.config file.



As we discussed in our previous post, the Assembly Explorer has already been added to the Resharper 8 EAP, so there might be future support for the same feature in Resharper 8. After installation, the plugin should be available in the installed plugins list. It can be enabled or disabled using the options here:



You don't need to download the dotPeek 1.1 EAP, as the plugin is also supported by the RTM version of the tool. You can find further details about the plugin, along with download instructions, on the JetBrains blog here:

http://blogs.jetbrains.com/dotnet/2013/05/peeking-into-nuget-packages-with-dotpeek/

Thursday, May 2, 2013

Resharper 8 EAP - New Features

Technology tools and components providers have learnt that involving the community from the very early stages of the development process is critical for a successful product. Microsoft now publishes Community Technology Preview (CTP) versions in order to bring the community on board as features are being added; based on community feedback, the features are tweaked and improved. There is also User Voice, where you can propose the features you want and vote on features entered by other community members. It provides a place for stakeholders to give ideas about new features or improvements to existing ones. Example User Voice pages can be found for Visual Studio and Windows Phone. You can also use the Connect page to submit bugs and suggestions for existing Microsoft products. This includes us, as developers, in the complete product life cycle. Agile helps!!!

JetBrains [the provider of Resharper] has a similar program to include developers as features are being added to their products. They have named it the Early Access Program [EAP]. I don't know when they started this, but I used the first EAP version of dotPeek; many of my friends migrated to it when RedGate put a price tag on Reflector.

Resharper 8 is still in the EAP. There is a new build every 10-15 days with additional, improved or fixed features. You can download it and play with it; it makes the transition easier for you when the tool goes into production, and if there is a defect, your feedback helps JetBrains fix it. In this post I want to discuss the exciting new features that I can see myself depending on a lot when the product is out. The nightly builds of the Resharper 8 EAP can be downloaded from here:



Once downloaded, the EAP version can be installed on the machine with administrator rights. Supported Visual Studio versions which couldn't be found on the machine are just grayed out. Since I only have Visual Studio 2012 on my machine, you can see that it only gives me the option to select that Visual Studio version.



Resharper Extension Manager for Resharper Plugins
Resharper 8 introduces a new mechanism for releasing and managing plugins through the Extension Manager. This is similar to the Extensions and Updates feature in Visual Studio. Since Resharper is itself a Visual Studio extension, the name Extension Manager can be confusing for some; people might think that Resharper has provided a new way to manage the extensions downloaded from Visual Studio or a custom Extensions & Updates gallery. The Extension Manager actually manages Resharper plugins. You can find it in the Resharper menu.



It has a very similar view to the Visual Studio Extensions & Updates dialog. There are three tabs: you can search an online or custom plugin gallery, see the list of installed plugins and uninstall them, and, if there is an update to an already installed plugin, find it in the Updates tab. These tabs are similar to those of the Visual Studio Extensions & Updates dialog.



The extension galleries can be configured through Resharper's options. If you select Settings in the above dialog, it just opens up the Options dialog with the Extension Manager settings selected. You can also open the same dialog from Resharper -> Options in the main menu. Here we can add, update or remove an already configured gallery. We can also disable a gallery and enable it again later if we want.



As discussed above, the new Extension Manager can be used to install and uninstall a plugin. The feature also supports disabling a plugin; that configuration is in the Plugins tab of the same Options dialog. You can also find the other details about a plugin here, including its version number.



Please remember that these plugins are provided by the community and can have their own license terms. You can also create your own plugin and host it on the JetBrains plugin gallery. There are currently only two stable plugins there. I don't know how pricing would work here. I really liked that these plugins can have their own dependencies.



Assembly Explorer
It is very natural for a product to at least provide the required features that a competitor provides; products then add extra useful features to gain a competitive edge. You can find the same pattern in all markets, including smart phones. Since RedGate's Reflector integrates with Visual Studio, JetBrains also needed to include its assembly exploring tool in Visual Studio. dotPeek is JetBrains' standalone tool for exploring an assembly, and the same features are added in the Resharper 8 EAP in the form of the Assembly Explorer. You can find the tool in Resharper's main menu.



The Assembly Explorer is provided as another tool window, which can be handled like any other tabbed window in Visual Studio; this should greatly help in multi-monitor scenarios. Here we can see an assembly's references to determine its dependencies. We can also see the list of namespaces and the types contained in those namespaces, and drill down the tree to get the details of the members of these types. It is pretty much the same as the Object Browser; the difference is that the Object Browser has a few extra panes providing details and documentation for a type's members.



Code navigation is supported by double clicking an item in the Assembly Explorer. There are different options provided to support this. One of the options is to use the dotPeek decompiler engine, a reverse-engineering utility from JetBrains. It also supports using the Code Definition window.



An assembly opened in the Assembly Explorer can be exported to a project; the option is provided in the context menu for an assembly in the Assembly Explorer. It seems that the same feature is being added to the dotPeek 1.1 EAP, so the limitations are the same: it can only be exported to a C# project.



You can provide the following settings for the new project. Once a project has been exported, it allows you to reuse and open the already exported project. You can also create a new solution file and open the project in Visual Studio.



The list of assemblies currently loaded forms a session. This can be saved so the same session can be loaded again in the future; the list is exported as a dotPeek assembly list (.dpl) file.



I think this is a great feature, but it could be improved by providing support in additional contexts. A user should be able to open the Assembly Explorer wherever an assembly can be selected in Visual Studio, including a project's assembly references and the Object Browser; an extra context menu item could be provided there to see the details of the assembly's contents. Like the Object Browser, it could also show code documentation. It doesn't have to add it in a separate pane; as a starting point, showing the documentation in a tooltip should work. I can understand that introducing a new window gives JetBrains more control, but these features might have been integrated with the Object Browser as enhancements, in order to build on developers' existing knowledge and experience.

Command Line Tools
Resharper 8 is expected to come with a few command line tools. You might be wondering, when it is already integrated with Visual Studio, why we need these additional command line tools. Basically, these tools are not intended to be used on a developer's machine; they are provided for Continuous Integration scenarios on build machines. They provide the same checks as we have on developers' machines, adding an extra line of defense through pre-commit checks to avoid unacceptable check-ins. The tools can be downloaded from the link provided above as a compressed file; they are not part of the installation package for Resharper 8. I am planning a separate post to discuss the command line tools in Resharper.



Supporting xUnit based Unit Tests
xUnit is yet another unit testing framework. You can install it from the NuGet Package Manager in your Visual Studio test project.



Visual Studio 2012 doesn't have any built-in support for running xUnit based unit tests. Running xUnit based unit tests results in the following in the Output window.



Resharper doesn't have built-in support for xUnit either. Running the same tests with Resharper results in the following:



But you can add this support by using an EAP plugin for Resharper. Just search for the following in Resharper's Extension Manager and install it. I needed to re-install my Visual Studio to actually start using it.



Let's see how we can use it now. Let's assume we have the simplest calculator in the world. The calculator has just one method, Add, which takes two operands, adds them and returns the result.
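The class could look like the following sketch (int operands are an assumption):

public class Calculator
{
    public int Add(int first, int second)
    {
        return first + second;
    }
}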

Now we need to test this with xUnit. We can add xUnit tests by decorating the test methods with the Fact attribute as follows:
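A minimal sketch of such a test, assuming the Calculator sketched above, could look like this:

using Xunit;

public class CalculatorTests
{
    [Fact]
    public void Add_ReturnsSumOfOperands()
    {
        var calculator = new Calculator();

        var result = calculator.Add(2, 3);

        Assert.Equal(5, result);
    }
}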

Now just run it using Resharper. You will notice that it runs now, and the output is shown as follows:



Resharper 8 EAP has also introduced action indicators to streamline gutter marks and bulbs. You can find them in the editor:



We can always go back to the settings based on earlier versions of Resharper in the options dialog.



Note: This is not an exhaustive list of new features in the Resharper 8 EAP; there are various other improvements. To see them all, you can visit the JetBrains blog. These are the features which caught my immediate eye and got me excited. Let's continue to explore!!!

Download Code