Tuesday, May 29, 2012

C# Generics & Arithmetic Operators

In this post we will discuss using arithmetic operators with C# generics. C# performs static type checking for operators. If the compiler cannot determine the actual types an operator is being applied to, it reports an error stating that the operator cannot be applied to operands of that type. With generics, the type argument is not known when the generic code is compiled, so the compiler cannot bind the relevant operator to the expression. In the following, we apply the arithmetic operator '+' to instances of a type argument, and you can see that the compiler doesn't like it.


In the following discussion, we will try to find a solution that lets us use arithmetic operators in this case.

Let's introduce a generic interface ICalculator<T>. It behaves like a simple calculator and exposes Add, Subtract, Multiply and Divide on the arguments. The type of argument is the same as the type argument of the generic interface.


The simplest implementation of the interface can be as follows:


This is definitely not production-quality code, as there is no error checking, but it minimizes the noise and keeps the focus on the topic under discussion. Now let's try to build the code. As expected, the compiler rejects these arithmetic operators when their operands are typed as type parameters (error CS0019: Operator '+' cannot be applied to operands of type 'T' and 'T'). And the compiler is right: it cannot statically check these operators because the actual type is not known at compile time.


As discussed in the previous post, generics support various constraints. Based on these constraints, the compiler allows the type arguments to be used in certain ways. The list of constraints can be found here: [http://msdn.microsoft.com/en-us/library/d5x73970%28v=vs.100%29]. There is no constraint with which we could specify that "T" is a numeric type. If there were, the compiler would have no issue, as the arithmetic operators can be used with numeric types. The other problem is that even if we could specify such a constraint, there is no super type specific to the numeric types that all of them derive from. They are value types, not reference types, so they do not support inheritance.

One option would be to avoid generics altogether and provide the Calculator implementation using concrete types. But it would be really cumbersome to implement a separate type for each numeric type. Please don't hit me for saying that :)

Basically, we are not focusing on the main issue when we try to find the solution this way. Static type checking for operators is the main issue; that is why we cannot use these operators with these operands. So a solution can be provided by using "dynamic". Using dynamic defers the binding of the operator to runtime: an expression tree is created for the expression, and it is resolved at runtime. Please refer to this post where we have discussed how the DLR uses expression trees [http://www.shujaat.net/2012/05/expression-trees-part-i.html]. Let's update the Calculator implementation as follows:
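The original listing is not available here, so the following is only a minimal sketch of what a dynamic-based implementation could look like. The member names follow the Add, Subtract, Multiply and Divide operations described earlier; the exact shape of the interface and the class is an assumption.

```csharp
using System;

// Assumed shape of the calculator interface described in the post.
public interface ICalculator<T>
{
    T Add(T a, T b);
    T Subtract(T a, T b);
    T Multiply(T a, T b);
    T Divide(T a, T b);
}

// Casting one operand to dynamic defers operator binding to runtime,
// so the compiler no longer needs to know the concrete type of T.
public class Calculator<T> : ICalculator<T>
{
    public T Add(T a, T b) { return (dynamic)a + b; }
    public T Subtract(T a, T b) { return (dynamic)a - b; }
    public T Multiply(T a, T b) { return (dynamic)a * b; }
    public T Divide(T a, T b) { return (dynamic)a / b; }
}
```

With this sketch, a call such as new Calculator<int>().Add(2, 3) is bound to the int addition operator when it executes. The result of each operation is converted back to T at runtime, so the sketch is only expected to work for types whose operators return the operand type, such as int, double and decimal.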


Now the code should build successfully. We can use the calculator as follows:


Since there is no type checking at compile time, we can run into issues like the following when the runtime cannot find the operator overload for a type used as the type argument.
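To make the failure mode concrete, here is a small sketch. DynamicMath and its two methods are hypothetical helpers written for this illustration, not part of the original code. When no suitable operator exists for the runtime types, the C# runtime binder throws RuntimeBinderException.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

// Hypothetical helper used only for this illustration.
public static class DynamicMath
{
    // The '+' operator is bound at runtime against the actual operand types.
    public static T Add<T>(T a, T b)
    {
        return (dynamic)a + b;
    }

    // Returns false instead of throwing when the runtime binder finds no '+' operator.
    public static bool TryAdd<T>(T a, T b, out T result)
    {
        try
        {
            result = Add(a, b);
            return true;
        }
        catch (RuntimeBinderException)
        {
            result = default(T);
            return false;
        }
    }
}
```

For example, TryAdd(2, 3, out sum) succeeds, whereas TryAdd(new object(), new object(), out o) fails because System.Object defines no '+' operator.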


But this is a general issue with the dynamic feature, which can be addressed through various checks around its use. There is already enough material on this.




Monday, May 28, 2012

Entity Framework Code First DbContext - Validation

I wrote this article last month for HeadSpring but couldn't announce it here. This is about validation using Entity Framework DbContext API. I have discussed various options to validate the entities and when it makes sense to use a particular option. You can find the article here:

http://www.headspring.com/2012/04/entity-framework-code-first-dbcontext-validation

Sunday, May 27, 2012

C# Generics Constructor Constraint & Guarding Arguments

.NET Framework 2.0 introduced several great features that made life a lot easier for us developers. One such feature is generics. The feature resembles C++ templates, but type substitution happens at runtime, whereas C++ templates are instantiated at compile time. There are other differences too. This post is not an introduction to the feature; it is about one of its limitations and a possible workaround. A great introduction to the feature can be found here:

http://msdn.microsoft.com/en-us/library/512aeb7t

Let's introduce a type which can be used to guard the parameters of a method. We will add specialized methods that verify whether an argument fulfills a certain constraint. If it does not, an exception is generally thrown. We will also add the flexibility of additional methods that just check the value rather than throwing an exception.
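The original listing is missing here, so this is only a sketch of how such a guard might look. The names ArgumentGuard and Verify appear in the post; the Check method name and the exact signatures are assumptions.

```csharp
using System;

// Sketch of a guard whose exception type is chosen by the caller via the type argument.
public static class ArgumentGuard<TException>
    where TException : ArgumentException, new()
{
    // Assumed name: runs the caller-supplied check and only reports the outcome.
    public static bool Check(Func<bool> verification)
    {
        return verification();
    }

    // Runs the caller-supplied check and throws TException when it fails.
    public static void Verify(Func<bool> verification)
    {
        if (!verification())
        {
            throw new TException();
        }
    }
}
```

A caller could then write ArgumentGuard<ArgumentNullException>.Verify(() => a != null); at the top of a method taking a parameter named a.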


We can use this Guard type as follows:

Developers who go back and forth between Java and C# either miss or appreciate the absence of a throws clause on C# methods. They miss it because there is no compile-time support for checking whether a possible exception is handled gracefully. They appreciate it because there are no handcuffs forcing them to handle exceptions they don't want to. Its absence also helps versionability and scalability. Anders Hejlsberg discusses in detail in this interview why they decided not to provide this feature in C#.

http://www.artima.com/intv/handcuffs.html

Now let's come back to the topic and discuss what the above code does. As discussed, it provides two methods. The actual verification code is provided by the client code, because the client knows best what to verify and how. We just need to execute the code it provides. The only constraint on the verification code is that it must return a boolean. If verification fails, an exception is thrown. The type of the exception is based on the type argument used with ArgumentGuard.

We are definitely the happiest developers in the world, because we have an implementation that is both simple and general. The client code specifies not only what to verify and how, but also which exception should result if the check fails. It's just like going to Subway and choosing your options. This is exactly what they say: tell me how you want your sandwich. The only thing is that they don't let us make it ourselves, and I don't forgive them for this mean attitude :).

The problem starts when one of the client developers shows up at your desk and asks for support for a particular exception message. That shouldn't be a big deal: we can just use the ArgumentException constructor that takes a message parameter. This might be the first thought that comes to mind.


Well, I have bad news if you haven't spotted the issue with the above code. The compiler doesn't like this and pukes the following on us.


MSDN has details about the constraints possible with generic types [http://msdn.microsoft.com/en-us/library/d5x73970%28v=vs.100%29]. The only constructor constraint is new(), which guarantees just a public parameterless constructor. So we cannot create the exception instance using one of the ArgumentException constructors that take a message parameter. The developer mindset pushes us to bypass this limitation with the following approach:


This approach works most of the time, since we sidestep the constructor constraint by using the legal parameterless constructor, and the type constraint on the type argument lets us set properties afterwards. But this is not possible in this case either, as the Message property is read-only... Damn!!!


Am I still alive? A non-dead person always has some options, even in the most desperate situation. Since we seem to be stuck here, we might ask ourselves this question. But actually we haven't run out of options yet. One option is to drop the idea of using generics for this feature altogether. Why should we use a language feature that doesn't support something we actually need? Since there are only a few child classes of ArgumentException, this shouldn't be a big deal. We can make the client's life easier by naming the guard types so that they appear together in IntelliSense, which helps the client decide which ArgumentGuard to use.

An evil developer still has some arrows to shoot. Let me say this and then we will use it: "We can set the value of a read-only field through reflection". Feels lighter once you let it out, right? So what if we set Message using reflection? Just sneak it past the crappy purists like myself and you should be fine :(


Thank God that it doesn't work. Stupid me: Message is not a field but a property with no setter at all. It's not that there is a setter and it is private; there is none whatsoever. We can use dotPeek to peek at the definition of the type. It overrides the Message property from its parent type but still provides no setter. Had there been a private setter, it would have worked like a charm.


So we can't set Message because it has just a getter and no setter. Can we use the backing field of this property? In the base class of the hierarchy, System.Exception, it is eventually backed by the _message field. On the way down to ArgumentException's children, Message is overridden a few times, but if we don't change anything else then the code below should work most of the time.
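The original listing is missing, so here is a sketch of the backing-field approach under stated assumptions: the two-parameter Verify signature is hypothetical, and the code relies on System.Exception keeping a non-public instance field named _message, which is an implementation detail that may differ between runtimes.

```csharp
using System;
using System.Reflection;

public static class ArgumentGuard<TException>
    where TException : ArgumentException, new()
{
    // Hypothetical overload: creates TException and injects the message via reflection.
    public static void Verify(Func<bool> verification, string message)
    {
        if (verification())
        {
            return;
        }

        TException exception = new TException();

        // _message is non-public, so both Instance and NonPublic flags are required.
        FieldInfo messageField = typeof(Exception).GetField(
            "_message", BindingFlags.Instance | BindingFlags.NonPublic);

        if (messageField == null)
        {
            throw new InvalidOperationException("This runtime has no Exception._message field.");
        }

        messageField.SetValue(exception, message);
        throw exception;
    }
}
```

The null check on the field makes the dependency on runtime internals explicit instead of failing with a NullReferenceException.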


Since _message is an internal field in the mscorlib assembly, we need to use the relevant BindingFlags to access it. Now we can update the client code as follows:


In the above code, the verification criterion demands that a must not be null. Since the type argument is ArgumentNullException, we expect an ArgumentNullException with the appropriate message set. Let's call the method with a null argument and test whether it works as expected.


Now when we run it, we do get the appropriate exception with exactly the message we expected.


Hmm, this is working as expected. We are able to use generics for the argument guard, and we are also able to provide the expected exception message. But can you really call it a good design? Poking our noses into others' private and internal members is never a recommended approach. The API owner is free to change the internal implementation as much and as often as they like. As long as they don't change the public behavior of their types, they are not the ones to blame if something in our code breaks just because they changed their internal implementation.

But is there really a way out? Aren't we stuck? We can't set the Message property (it has no setter) and we cannot use a parameterized constructor (the generic constructor constraint allows only new()). It's time to take a step back and ask whether we really need to instantiate the exception in the Verify method. We just need to make sure that the exception is thrown with the appropriate message. What if we turn the tables and ask the caller to provide the exception instantiation code? We would then throw the exception exactly as the caller constructs it. We don't need to set any property, because it is now the caller's responsibility to provide an appropriate object with all properties set. We execute the caller's code only if required, so no exception object is allocated unless the check fails. This code gives us the relevant exception object and we throw it. We can still constrain the type of the exception using the same type argument. Let's provide an overload of the same Verify method.
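A sketch of such an overload follows. The parameter name exceptionFactory and the exact signatures are assumptions, since the original listing is not available; the idea of passing in the instantiation code is the one described above.

```csharp
using System;

public static class ArgumentGuard<TException>
    where TException : ArgumentException, new()
{
    // Existing behaviour: throws a default-constructed TException when the check fails.
    public static void Verify(Func<bool> verification)
    {
        if (!verification())
        {
            throw new TException();
        }
    }

    // New overload: the caller supplies the code that builds the exception.
    // The factory is invoked only when the check fails.
    public static void Verify(Func<bool> verification, Func<TException> exceptionFactory)
    {
        if (!verification())
        {
            throw exceptionFactory();
        }
    }
}
```

A call could then look like ArgumentGuard<ArgumentNullException>.Verify(() => a != null, () => new ArgumentNullException("a", "a must not be null")); where the lambda may use any public constructor of the exception type, so no reflection is needed.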


Now we can use the above guard argument helper as follows:


Now let's run the code again and see how it works.


Zindabad!!!


Tuesday, May 22, 2012

Expression Trees - Part I

Expression trees represent code as a data structure instead of as executable code. We have used expression trees in the discussion of INotifyPropertyChanged but never got a chance to discuss the concept itself.

The feature was introduced in the .NET Framework to support meta-programming, i.e. to be able to treat code as data. The feature enables us to analyze code at runtime. It also allows the creation of code at runtime. In the INotifyPropertyChanged example, we used expression trees to generate new code based on the expression passed as an argument. This involved code analysis and generation, which was only possible through the use of expression trees.

An expression tree works by creating a semantic model of the code. This semantic model can be used by compilation tools to generate runnable code.

What does it enable me?
Expression trees enable various features, both in the framework itself and for developers. They can be used for:
  1. Dynamic modification of executable code.
  2. The execution of LINQ query.
  3. Creation of dynamic queries.
Expression Trees In .Net Framework
The expression tree types were introduced in .NET 3.5 and were further enhanced in .NET 4.0. The types can be found in the System.Linq.Expressions namespace in the System.Core assembly. The class hierarchy of the types can be presented as follows:


Amongst the classes in the hierarchy, there are three sealed classes:
  1. BlockExpression
  2. GotoExpression
  3. Expression<TDelegate>
How to create Expression Trees?
Expression trees can be created using the following:
  1. Using a lambda expression
  2. Manual creation using the available API
We will be discussing both of them in this post.

Assignment of Lambda Expression
A lambda expression can be assigned to an expression tree or a delegate. It cannot be assigned to an implicitly typed variable (at least before C# 10, which infers a delegate type for such lambdas). Basically, the C# compiler emits an expression tree or a delegate based on the type of the destination variable. If we use an implicitly typed variable, the compiler cannot decide between the two, hence the error.
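As a small illustration of the point above, the same lambda can be assigned to either target type. The class and field names below are made up for this sketch.

```csharp
using System;
using System.Linq.Expressions;

public static class LambdaAssignment
{
    // Target type is a delegate: the compiler emits executable code.
    public static readonly Func<int, int> AsDelegate = x => x * 2;

    // Target type is an expression tree: the compiler emits a data
    // structure describing the same lambda instead of executable code.
    public static readonly Expression<Func<int, int>> AsExpression = x => x * 2;
}
```

AsDelegate can be invoked directly, while AsExpression can be inspected: its Body is a BinaryExpression whose NodeType is ExpressionType.Multiply, and its Parameters collection holds the single parameter x.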


We can also base our expression tree on an Action delegate if the expression does not return any data. In the following example, we create an expression tree which just prints a string on the console. It doesn't return any data, so we use the Action delegate here. We compile the expression into a delegate by calling Compile, and then we can use it like a regular delegate.
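A sketch of such an Action-based expression tree follows; the class name and the printed text are placeholders, not taken from the original listing.

```csharp
using System;
using System.Linq.Expressions;

public static class ActionExpressionDemo
{
    // The lambda returns nothing, so the tree is based on Action.
    public static readonly Expression<Action> Print =
        () => Console.WriteLine("Hello from an expression tree");

    public static void Run()
    {
        // Compile turns the tree into a delegate that can be invoked as usual.
        Action print = Print.Compile();
        print();
    }
}
```

The body of the tree is a MethodCallExpression describing the call to Console.WriteLine, and nothing is written to the console until the compiled delegate is invoked.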


When a lambda expression / anonymous function is assigned to a delegate, the delegate references executable code which can be called. On the other hand, assigning it to an expression tree creates a representation of the code of the lambda expression (anonymous function). This cannot be executed directly, but it can be used to find out what the executable code would look like. We can also tweak the code by playing with the nodes of the expression tree, or execute other code based on those nodes. That is exactly what we did in the example referred to in the first paragraph. In that example, we determine the property (member expression) in the expression tree and then raise the PropertyChanged event for that property. Doing this relieved us from using magic strings in the setters of our models and view models.

Conversion between Expression Tree & Delegate
An expression tree can be converted into a delegate by compiling it. It must be remembered that the reverse is not true, i.e. a delegate cannot be converted to an expression tree.
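The one-way nature of the conversion can be sketched as follows, with illustrative names. The commented-out line shows the direction that the compiler rejects.

```csharp
using System;
using System.Linq.Expressions;

public static class ConversionDemo
{
    public static readonly Expression<Func<int, int, int>> AddExpression = (a, b) => a + b;

    // Expression tree to delegate: Compile generates executable code.
    public static readonly Func<int, int, int> Add = AddExpression.Compile();

    // Delegate to expression tree: there is no such conversion, so this does not compile.
    // public static readonly Expression<Func<int, int, int>> Back = Add;
}
```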


If a conversion exists from an anonymous function to a delegate type D, a conversion also exists to the expression tree type Expression<D>. Based on the same principle, the compiler converts a lambda expression passed as an argument to the type of the parameter, which may be either an expression tree type or a delegate type.

Building Expression Trees:
Expression trees are a lot easier to build if we think bottom-up. We can start building from the leaves and work our way towards the top of the tree. It also helps to make a rough sketch of the tree structure first and then build the expression tree from it.


Let's first build a tree to represent the expression in the above algorithm.
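The algorithm referred to above is not reproduced here, so the following sketch uses a stand-in expression, (2 + 3) * 4, purely to show the bottom-up order: leaves first, then the inner node, then the root.

```csharp
using System;
using System.Linq.Expressions;

public static class ManualTree
{
    // Builds the tree for the stand-in expression (2 + 3) * 4.
    public static Expression<Func<int>> Build()
    {
        // Leaves.
        ConstantExpression two = Expression.Constant(2);
        ConstantExpression three = Expression.Constant(3);
        ConstantExpression four = Expression.Constant(4);

        // Inner node, then the root.
        BinaryExpression sum = Expression.Add(two, three);
        BinaryExpression product = Expression.Multiply(sum, four);

        // Wrap the root in a lambda so that it can be compiled.
        return Expression.Lambda<Func<int>>(product);
    }
}
```

Calling ManualTree.Build().Compile()() evaluates the expression and returns 20.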


If we use a lambda expression instead, we can simply assign the lambda to a variable of an Expression type. We can then compile it into a delegate and use the delegate to compute a result.


Here the lambda expression returns an integer based on the evaluation of an expression. The lambda expression is assigned to a variable of an Expression type. This is then compiled into a delegate as described above, and the delegate computes and returns the result. This is a rather confusing use of expression trees and is not recommended. Why in the world would we create an expression tree when we just need executable code? We could instead have assigned the lambda expression to a Func delegate and used that for computing, letting the compiler create the anonymous function for the lambda expression. We would certainly never use this in production code. But in order to understand the feature, we will continue to use this approach to check the correctness of our expressions.

How to introduce variables:
More often than not, we need expressions involving variables. In expression trees, variables are represented by parameter expressions. ParameterExpression has no public constructor, so we obtain instances through Expression's factory methods, e.g. Expression.Variable, which is also what the documentation recommends.
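As a sketch of Expression.Variable in use (the original listing is missing, so the block below is an invented example), a variable x is declared in a block, assigned the value 5, and then used in x * 2.

```csharp
using System;
using System.Linq.Expressions;

public static class VariableDemo
{
    public static Func<int> Build()
    {
        // Declares a local variable named "x" of type int.
        ParameterExpression x = Expression.Variable(typeof(int), "x");

        // { int x; x = 5; x * 2 } : the value of a block is its last expression.
        BlockExpression body = Expression.Block(
            new[] { x },
            Expression.Assign(x, Expression.Constant(5)),
            Expression.Multiply(x, Expression.Constant(2)));

        return Expression.Lambda<Func<int>>(body).Compile();
    }
}
```

Invoking the delegate returned by VariableDemo.Build() yields 10.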


We can also use Expression.Parameter to create a ParameterExpression. In the following example, the example above is recreated by just changing Expression.Variable to Expression.Parameter.
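A sketch of this variant, using an invented block in which x is assigned 5 and then multiplied by 2, could look like this. Both factory methods return a ParameterExpression.

```csharp
using System;
using System.Linq.Expressions;

public static class ParameterDemo
{
    public static Func<int> Build()
    {
        // Same kind of node as before, created through Expression.Parameter.
        ParameterExpression x = Expression.Parameter(typeof(int), "x");

        // { int x; x = 5; x * 2 }
        BlockExpression body = Expression.Block(
            new[] { x },
            Expression.Assign(x, Expression.Constant(5)),
            Expression.Multiply(x, Expression.Constant(2)));

        return Expression.Lambda<Func<int>>(body).Compile();
    }
}
```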

This is the same as the previous example except that we used Expression.Parameter instead of Expression.Variable.

Debugging Expression Trees
Visual Studio has some support for debugging expression trees. We can see what the structure of an expression tree looks like while debugging. Just hover over an expression tree variable and use the Text Visualizer to see its DebugView.


The debug view shows the expression tree in a particular format, in which different node types are represented differently. The details of the format can be found here [http://msdn.microsoft.com/en-us/library/ee725345%28VS.100%29.aspx]


We can also use Html Visualizer to see the expression tree in the same format as the Text Visualizer.


Instead of hovering over the expression variable, we can use quick watch to see the same expression tree.


We can also use Immediate Window to see the same result.


Expression Trees & Dynamic Language Runtime
.Net CLR makes use of the Dynamic Language Runtime [DLR] to support dynamic languages and the dynamic features of a language. The runtime uses expression trees to represent dynamic code and passes the expression tree to the appropriate runtime binder. Which binder is selected depends on the dynamic code: it can be the C# runtime binder, the IronPython binder, or the binder for legacy COM objects.


The runtime binder maps the request to the target's object call structure.


While binding, it might not find the expected properties / operations on the target object, which results in exceptions. Now you know why your .NET 4.0 projects reference the Microsoft.CSharp assembly: it contains the C# runtime binder.

The use of expression trees by the DLR happens behind the scenes in the runtime. All you need to care about is the exception that could result.

Limitations:
It must be remembered that statement lambdas cannot be converted to expression trees by the compiler. These are lambdas whose body is a code block [in C# the body is enclosed in braces: {lambda_statement}], which might return some value.
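The limitation can be sketched as follows, with illustrative names. The commented-out line is the form the compiler rejects; the same statement lambda is still fine as a delegate.

```csharp
using System;
using System.Linq.Expressions;

public static class LambdaLimitation
{
    // Expression lambda: convertible to an expression tree.
    public static readonly Expression<Func<int, int>> Increment = x => x + 1;

    // Statement lambda: fine as a delegate.
    public static readonly Func<int, int> IncrementDelegate = x => { return x + 1; };

    // Statement lambda as an expression tree: rejected by the compiler.
    // public static readonly Expression<Func<int, int>> Bad = x => { return x + 1; };
}
```

When a block body is really needed inside an expression tree, it has to be assembled manually through the factory methods, for example Expression.Block.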