Muhammad Shujaat Siddiqi: functional programming

Showing posts with label functional programming. Show all posts

Thursday, June 20, 2013

Partial Applications using Currying & Function Compositions

C# introduced Func and Action delegates in .net framework 3.5. They are part of C# 2 language specification. It also introduced expression trees and lambda expression in the same framework version. It's amazing how technologies evolve and new technologies are built on top of the existing ones. They were built on top of Generics introduce in .net framework 2.0. In this post we are going to discuss how we can use these constructs to introduce currying, partial applications and functional composition.

Previously we have discussed how we can use dynamic call site to avoid compile time check for arithmetic operations in a generic type. You can find the discussion here [http://www.shujaat.net/2012/05/c-generics-arithmetic-operators.html]. We will be assuming that calculator has been provided to us as a third party library and we will be deriving more meaningful operations (functions) from the existing methods using functional approaches. Referential transparency is the essence of functional programming providing side-effect free operations. We discussed about purely functional approach here [http://www.shujaat.net/2012/07/design-guidelines-purely-functional.html]

Currying
Currying is a transformation technique in functional programming space. It is used to convert a function with multiple parameters to a set of functions with single parameter each. It can be used to create refined algorithms where partial list of arguments can be provided to create more sophisticated operations. The nomenclature is in the honor of Haskell Curry, the creator of Haskell functional programming language. It must be remembered that the end-result of a curried version of a function should exactly be the same as its non-curried counterpart.

C# 4.0 provided optional parameters. These are the parameters with default values. If the caller doesn't want to pass arguments to those parameters then the default value is used instead. They are always at the end of the parameter list. This is opposite to the partial application case as the fixed arguments here. They would be the first parameters to a function. You might want to use these two concepts together as well. We can introduce the support for currying a function using extension methods. We can introduce these extension methods for the required overloads of Func and Action delegates. The following is an example of currying where we are converting a method with two parameters to a set of methods with single parameters.

The above code uses the Closure and Captured variables. Being a captured variable, operand1 is available inside the inner lambda expression. The whole feature is available due to closure where the life time of the context of lambda expression declaration is upgraded to the life time of expression itself allowing the use of captured variables. Actually we have discussed about closures in the context of async code execution using Dispatcher in WPF applications.

Partial application is about fixing the arguments to a function. An application has historically been referred as a set of functions. So the remaining method with some fixed arguments is referred as partial application. So we first obtain the curried version of a function and then provide fixed arguments. Wikipedia has a quick reference about the concept.

Now let's see how we can use currying and partial applications to create more meaningful operations. In the example below we are using the methods provided by the above Calculator class to derive calculatePlusPlus, calculateAdditiveInverse and calculateMultiplicativeInverse functions. Here we are fixing first arguments to the curried versions of the delegates based appropriate to the operation e.g. for calculatePlusPlus we can fix the first argument to 1.

The output of the above code would be as follows:

We are certainly not restricted to use only Func delegates. We can also use Action delegates if needed. Lets' create an extension method which curries an Action delegate with two parameters to a set of Action delegates. Don't get confused with the combination of Func and Action delegates being used here. You can just be concerned about the last Action, which refers to the actual operation. The rest of the design is just to curry it.

Here we are introducing a lambda statement to print a message specified number of times. We are assigning the statement to an Action delegate. After currying the delegate, we are creating a partial application printThreeTimes by fixing the first argument to 3. This would cause the message to be printed three times.

I can understand you might be thinking if this is possible to UnCurry a curried function or not. Actually we can do that too. Below is the extension method accepting a curried Func delegate and returning an Un-Curried version of the delegate.

Here we are UnCurrying a curried version of divide operation from Calculator. We are then calling it using the specified arguments as follows:

Functional Composition to construct data / processing chains
Functional composition is the method where multiple functions are used in a serial fashion in such a way that the output of one method serves as the input of other method.This can be used to build processing chains. I am deliberately not calling it a pipeline because a pipeline should have producer / consumer based intermediate queues. We have discussed how we can implement processing pipelines during the discussion of TPL Dataflow. You can find the discussion here: [http://www.shujaat.net/2013/05/processing-pipelines-with-tpl-dataflow.html].

The processing chain created through functional composition can also be specified using extension methods. The following defines an extension method on a Func delegate with two parameters. After getting the result from the first, it just passes the results to the second Func delegate. Obviously the next element in the chain should have only one parameter. After the second element completes, it just returns the result to the caller.

In the following code we are providing support of calculating percentage using the available Divide and Multiply methods from Calculator. As we know percentage of a part in a total is computed as [ percentage = (part / total) * 100 ]. So we need to obtain a curried version of Multiply & fix the first argument to 100 and use it after dividing part by the total.

It is also possible that the processing chain is not returning any data back to the caller. In this case, all the elements in the chain do pass the required data to the next element in the chain. The last element just uses the data for its purpose and nothing returns back to the caller. We can implement this using chain of Func delegates with an Action delegate at the end.

Let's see how we can use it to print the percentage of a part. Here we are using the function composition created for calculating the percentage. We are adding an element to construct a message. At the end we are using a processing element to print the message to the console. For the last element, we are using the curried version of printMessage with first argument fixed as 1.

The output of the above code would be as follows:

Saturday, June 9, 2012

Design Guidelines: Pure Functions & Private Static

If y is a function of x then x is called domain (independent variable) and y is called range (dependent variable).

y = f (x)

As we know from our calculus courses that vertical line test could be used to determine if there exists a function. It ensures that for a single domain value (x), there exists a unique range value (y)i.e. classic function is Single Valued. The function should always evaluate the same value no matter how many times we use the function i.e the function is Stateless. It is perfectly possible for two domain variables (x) to have the same range value (y) e.g. Identity function.

In modern programming languages, the concept of function has been adopted in a modified way. This pushes us to discuss why we have used "modified way" here. In order to make life easier for developers some of the classic function restrictions were relaxed by classic programming languages when they introduced concepts like Global Variables and multiple parameters. Object Oriented programming languages attempted to limit a few things but they started calling it method. The method has accessibility to all the instance and type members of the same type. It also has visibility of most members of the parent type in an inheritence relationship.

There is certainly no doubt that the life of a regular programmer developing simple application is a lot easier now. But advancement in technology which provided us the flexibility to have multiple flows of execution even within the same process. They multiple flows of execution were named as Threads and the concept was termed as Multi-threading. Now the method can be CALLED by multiple threads at the same time. It is possible that two or more threads are executing the same method simultaneously. Since we are using the instance members directly in the method's code, any change in the member would be reflected for the other which might cause unexpected result for the other threads. The problem happens just because that our methods are not free from side effects. They have a defined purpose to evaluate some result and as a side effect they can use / modify instance members. This is a given feature of Object Oriented Programming technology. Modern programming languages provide various synchronization techniques (such as locks) to avoid ending up in an unexpected state.

Side effects are lies. Your function promises to do one thing, but it also does hidden things. Sometimes it will make unexpected changes to the variables of its own class. Sometimes it will make them to the parameters passed into the function or to system global. In either case they are devious and damaging mistrusts that often result in strange temporal couplings and order dependencies. [Robert C. Martin - Clean Code: Chapter 4 - Functions]

The synchronization techniques really help us avoiding the side effect related issues of multi-threading. But the issue is that they help us fight with the symptoms and not the actual cause. The cause is unnecessarily relaxing the limitations of functions for the ease of development. There are several things to consider but we can start with avoiding use of any instance level member as much as possible in our methods. Doing that would definitely help us in avoiding the side effects. We will be making less use of synchronization techniques. Not only the performance of our code is better but the code is more naturally organic and readable. It is also quicker to program.

Oliver Sturm has a number of suggestions regarding defining pure functions. We are discussing one of the suggestions in this post. i.e.

Define thou's private methods as static.

Basically using the static modifier with a method is a reminder to the compiler to check whether any instance member is accessed in the method. If it is so, then it issues a compiler error stating the problem. This can serve as a warning to the developer making any change to the code later on. All the future refactoring attempts would also be considering the static nature of the method in mind.

The following is a type called CoffeeCard. This card can be used to pay for the coffee so that you don't need to carry cash to the coffee shop as long as you are maintaining enough balance. The type allows us to manage card balance. It allows us to add balance and charge for the coffee for a certain balance. We can check balance anytime by calling GetBalance() on the type. AddBalance() and ChargeCard() utilizes private methods to update balance and push the changes o the database.

As suggested above, let's update the private methods to static.

As soon as we change the method to static, the JIT compiler starts complaining about the use of _balance field, which is an instance member. As we know that any instance member from the same type cannot be used directly in a static method of the same type.

Basically the solution of this problem is easier than we think. We just need to change the signatures of private methods to accept the balance as a parameter. Since we need to update the _balance field after the operation, we can return the calculation result which can be used by the caller of these methods to update the instance member [_balance field]. Since these are our private methods, we don't even need to worry about outside world while refactoring them as it's none of their business how we implement functionality within our type. As long as we are not changing the behaviors of instance, we should be good which, in this case, we are not.

Let's see how we can use these methods in the public methods of CoffeeCard type.

Now our private methods are thread safe, they are deterministic with no side effects, they are context free. The biggest of all achievements, they are thread safe.

Does the method really belong to this type?
Resharper's suggestion is mainly because of the idea, "Since this method is not using any instance member, it is not an instance member itself and we should rather be specifying it as a type's member". In C#, we assign type membership using static. If it were VB, we would have specified the methods as Shared.

After you have updated the method to be static, just ask this question to yourself: Does this method really belong to this type? If the answer is Yes then you are done here. Smile and move ahead. But if the answer is No then we should continue on the road of design improvement. It seems that you were keeping this functionality here. It happens when Single Responsibility Principle is avoided. If this is some code which goes above and beyond the type's responsibility then we can move this to some other type. We might introduce new types and move this functionality there. Doing this would also improve our unit testing as we will be adding this functionality in public methods of the other type and hence we can unit test that.

If we want to add a new type (class), then we need to decide where we need to add this. We keep the visibility of the type to as the minimum. This depends on the the other types which might need the functionality of this new type. The choice of access modifier of a type is an important decision. Keeping all the types as public is exposing yourself to the infinite number of ways the types can be used. If we don't see that any client would be using our type from outside the assembly, we should never be making it public. If the new type is not expected to be used outside this scope of the class it is refactored from then it would make more sense to be keeping this class nested in the original class as follows:

Here we have kept the BalanceCalculator's access domain as private. This is allowed for a nested class in C#. Since we don't think that the class can be used by any code outside the parent class so it seemed like the best decision. The refactoring also made us realize that adding the balance and charging cards is not just an arithmetic operation. We might also need to consider the effect of any on-going promotion and discount.

Resharper Support
Resharper also provides support to suggest about the members which can be declared static. For our case, the suggestion can appear as follows:

This is definitely a configuration option and we can change the severity of resharper option to a value other than suggestion as well.

FxCop Rule CA1822
FxCop Rule CA1822 also suggests marking those methods as static which have no reference to any instance member.

http://msdn.microsoft.com/en-us/library/ms245046.aspx

Wednesday, July 23, 2008

Functional Programming DataTypes F# .net

Functional Programming uses immutable data types. Those who are aware of the difference between String and StringBuilder in .net understand that the first is an immutable datatype but the other is mutable datatype. For those of you who are not familiar with the concept, immutable objects are not allowed to be updated. The updates that you make in the String are basically not the same object but each updated creates a new object and the earlier is presented to be collected and disposed by the runtime.

For compliance with .net, F# supports .net datatypes. But it considers them as mutable datatypes. It also introduces new datatypes, which are immutable ones. So with this we know that there are two categories of datatypes in F#, one is mutable and the other is immutable datatypes.

The immutable datatypes introduced by the language include tuples, records, discriminated unions and lists. A tuple is a collection which can hold more than one values. The number of values should be known at design time. Record is a specialized tuple in which each value may be named e.g. Age = 12. The list is a regular linked list. The discriminated union is like c style unions but it is typesafe. Like C, it defines a type whose instance may hold a value of any of the specified datatypes.

Though F# is a strongly typed language which used type inference. It deduces these datatypes during compilation.

Muhammad Shujaat Siddiqi

Thursday, June 20, 2013

Partial Applications using Currying & Function Compositions

Saturday, June 9, 2012

Design Guidelines: Pure Functions & Private Static

Wednesday, July 23, 2008

Functional Programming DataTypes F# .net

Buy me a cup of coffee

Followers

Pageviews last month

Translate

Blog Archive

Who am I ??

Friends