Effective C# Item 42: Understand How to Make Use of the Expression API
.NET has had APIs that enable you to reflect on types or to create code at runtime. The ability to examine code or create code at runtime is very powerful. There are many different problems that are best solved by inspecting code or dynamically generating code. The problem with these APIs is that they are very low level and quite difficult to work with. As developers, we crave an easier way to dynamically solve problems.
Now that C# has added LINQ and dynamic support, you have a better way than the classic Reflection APIs: expressions and expression trees. Expressions look like code. And, in many uses, expressions do compile down to delegates. However, you can ask for expressions in an Expression format. When you do that, you have an object that represents the code you want to execute. You can examine that expression, much like you can examine a class using the Reflection APIs. In the other direction, you can build an expression to create code at runtime. Once you create the expression tree you can compile and execute the expression. The possibilities are endless. After all, you are creating code at runtime. I'll describe two common tasks where expressions can make your life much easier.
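Before turning to those two tasks, it may help to see the two forms side by side. The following is a minimal sketch, not code from the original text: the class name ExpressionBasics and its members are made up for illustration. It stores the same lambda once as a delegate and once as an expression tree, inspects the tree, and then compiles it.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper type, for illustration only.
public static class ExpressionBasics
{
    // The same lambda text, stored two ways.
    public static readonly Func<int, int> AsDelegate = x => x + 1;
    public static readonly Expression<Func<int, int>> AsExpression = x => x + 1;

    // Examine the tree: report the node type of the body and the parameter name.
    public static string Describe()
    {
        var body = (BinaryExpression)AsExpression.Body;
        return string.Format("{0} on parameter '{1}'",
            body.NodeType, AsExpression.Parameters[0].Name);
    }

    // Go the other direction: compile the tree into a delegate and run it.
    public static int CompileAndRun(int input)
    {
        Func<int, int> compiled = AsExpression.Compile();
        return compiled(input);
    }
}
```

The delegate can only be invoked. The expression can be examined as data (Describe() reads the node type and parameter name) and turned into a delegate on demand with Compile().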
The first solves a common problem in communication frameworks. The typical workflow for using WCF, remoting, or Web services is to use some code generation tool to generate a client-side proxy for a particular service. It works, but it is a somewhat heavyweight solution. You'll generate hundreds of lines of code. You'll need to update the proxy whenever the server gets a new method, or changes parameter lists. Instead, suppose you could write something like this:
    var client = new ClientProxy<IService>();
    var result = client.CallInterface<string>(srver => srver.DoWork(172));
Here, the ClientProxy<T> knows how to put each argument and method call on the wire. However, it doesn't know anything about the service you're actually accessing. Rather than relying on some out-of-band code generator, it will use expression trees and generics to figure out what method you called, and what parameters you used.
The CallInterface() method takes one parameter, which is an Expression<Func<T, TResult>>. The input parameter (of type T) represents an object that implements IService. TResult, of course, is whatever the particular method returns. The parameter is an expression, and you don't even need an instance of an object that implements IService to write this code. The core algorithm is in the CallInterface() method.
    public TResult CallInterface<TResult>(Expression<Func<T, TResult>> op)
    {
        var exp = op.Body as MethodCallExpression;
        var methodName = exp.Method.Name;
        var methodInfo = exp.Method;
        var allParameters = from element in exp.Arguments
                            select processArgument(element);
        Console.WriteLine("Calling {0}", methodName);
        foreach (var parm in allParameters)
            Console.WriteLine("\tParameter type = {0}, Value = {1}",
                parm.Item1, parm.Item2);
        return default(TResult);
    }

    private Tuple<Type, object> processArgument(Expression element)
    {
        object argument = default(object);
        LambdaExpression l = Expression.Lambda(
            Expression.Convert(element, element.Type));
        Type parmType = l.ReturnType;
        argument = l.Compile().DynamicInvoke();
        return Tuple.Create(parmType, argument);
    }
Starting from the beginning of CallInterface, the first thing this code does is look at the body of the expression tree. That's the part on the right side of the lambda operator. Look back at the example where I used CallInterface(). That example called it with srver.DoWork(172). It is a MethodCallExpression, and that MethodCallExpression contains all the information you need to understand all the parameters and the method name invoked. The method name is pretty simple: It's stored in the Name property of the Method property. In this example, that would be 'DoWork'. The LINQ query processes any and all parameters to this method. The interesting work is in processArgument.
processArgument evaluates each parameter expression. In the example above, there is only one argument, and it happens to be a constant, the value 172. However, code that assumed every argument was a constant would not be very robust, because any of the parameters could be method calls, property or indexer accessors, or even field accessors. Any of the method calls could also contain parameters of any of those types. Instead of trying to parse everything, this method does that hard work by leveraging the LambdaExpression type and evaluating each parameter expression. Every parameter expression, even the ConstantExpression, could be expressed as the return value from a lambda expression. processArgument() converts the parameter to a LambdaExpression. In the case of the constant expression, it would convert to a lambda that is the equivalent of () => 172. This method converts each parameter to a lambda expression because a lambda expression can be compiled into a delegate and that delegate can be invoked. For the argument in this example, it creates a delegate that returns the constant value 172. More complicated expressions would create more complicated lambda expressions.
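To make the evaluation strategy concrete, here is a small sketch that applies the same wrap-compile-invoke idea to arguments that are not constants. The names IDemoService, ArgumentEvaluator, and Demo are hypothetical and exist only for this illustration. The sketch also leaves out the Expression.Convert call from the listing, on the assumption that converting an expression to its own type changes nothing.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical service interface, for illustration only.
public interface IDemoService
{
    string DoWork(int value, string label);
}

public static class ArgumentEvaluator
{
    // Wrap the argument in a parameterless lambda, compile it, and invoke it.
    public static object Evaluate(Expression argument)
    {
        LambdaExpression lambda = Expression.Lambda(argument);
        return lambda.Compile().DynamicInvoke();
    }

    // Evaluate every argument of the method call found in the lambda body.
    public static object[] EvaluateArguments<TResult>(
        Expression<Func<IDemoService, TResult>> op)
    {
        var call = op.Body as MethodCallExpression;
        if (call == null)
            throw new ArgumentException("Expected a method call expression.", "op");
        return call.Arguments.Select(a => Evaluate(a)).ToArray();
    }

    // Usage: one argument reads a captured local, the other calls a method.
    public static object[] Demo()
    {
        int baseValue = 170;
        return EvaluateArguments(s => s.DoWork(baseValue + 2, "job".ToUpper()));
    }
}
```

Neither argument in Demo() is a ConstantExpression, yet both evaluate without any case-by-case parsing. One limitation of this sketch: an argument that refers to the lambda's own parameter (s in this case) cannot be compiled on its own this way, because that parameter is not defined inside the parameterless lambda.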
Once the lambda expression has been created, you can retrieve the type of the parameter from the lambda. Notice that this method does not perform any processing on the parameters. The code to evaluate the parameters in the lambda expression would be executed when the lambda expression is invoked. The beauty of this is that it could even contain other calls to CallInterface(). Constructs like this just work:
    client.CallInterface(srver => srver.DoWork(
        client.CallInterface(srv => srv.GetANumber())));
This technique shows you how you can use expression trees to determine at runtime what code the user wishes to execute. It's hard to show in a book, but because ClientProxy<T> is a generic class that uses the service interface as a type parameter, the CallInterface method is strongly typed. The method call in the lambda expression must be a member method defined on the server.
The first example showed you how to parse expressions to convert code (or at least expressions that define code) into data elements you can use to implement runtime algorithms. The second example shows the opposite direction: Sometimes you want to generate code at runtime. One common problem in large systems is to create an object of some destination type from some related source type. For example, your large enterprise may contain systems from different vendors each of which has a different type defined for a contact (among other types). Sure, you could type methods by hand, but that's tedious. It would be much better to create some kind of type that "figures out" the obvious implementation. You'd like to just write this code:
    var converter = new Converter<SourceContact, DestinationContact>();
    DestinationContact dest2 = converter.ConvertFrom(source);
You'd expect the converter to copy every property from the source to the destination where the properties have the same name and the source object has a public get accessor and the destination type has a public set accessor. This kind of runtime code generation can be best handled by creating an expression, and then compiling and executing it. You want to generate code that does something like this:
    // Not legal C#, explanation only
    TDest ConvertFromImaginary(TSource source)
    {
        TDest destination = new TDest();
        foreach (var prop in sharedProperties)
            destination.prop = source.prop;
        return destination;
    }
You need to create an expression that creates code that executes the pseudo code written above. Here's the full method to create that expression and compile it to a function. Immediately following the listing, I'll explain all the parts of this method in detail. You'll see that while it's a bit thorny at first, it's nothing you can't handle.
    private void createConverterIfNeeded()
    {
        if (converter == null)
        {
            var source = Expression.Parameter(typeof(TSource), "source");
            var dest = Expression.Variable(typeof(TDest), "dest");
            var assignments = from srcProp in
                                  typeof(TSource).GetProperties(
                                      BindingFlags.Public | BindingFlags.Instance)
                              where srcProp.CanRead
                              let destProp = typeof(TDest).GetProperty(
                                  srcProp.Name,
                                  BindingFlags.Public | BindingFlags.Instance)
                              where (destProp != null) && (destProp.CanWrite)
                              select Expression.Assign(
                                  Expression.Property(dest, destProp),
                                  Expression.Property(source, srcProp));

            // put together the body:
            var body = new List<Expression>();
            body.Add(Expression.Assign(dest, Expression.New(typeof(TDest))));
            body.AddRange(assignments);
            body.Add(dest);

            var expr = Expression.Lambda<Func<TSource, TDest>>(
                Expression.Block(
                    new[] { dest },  // expression parameters
                    body.ToArray()   // body
                ),
                source               // lambda expression
            );
            var func = expr.Compile();
            converter = func;
        }
    }
This method creates code that mimics the pseudo code shown before. First, you declare the parameter:
    var source = Expression.Parameter(typeof(TSource), "source");
Then, you have to declare a local variable to hold the destination:
    var dest = Expression.Variable(typeof(TDest), "dest");
The bulk of the method is the code that assigns properties from the source object to the destination object. I wrote this code as a LINQ query. The source sequence of the LINQ query is the set of all public instance properties in the source object where there is a get accessor:
    from srcProp in typeof(TSource).GetProperties(
        BindingFlags.Public | BindingFlags.Instance)
    where srcProp.CanRead
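The let declares a local variable that holds the property of the same name in the destination type. It may be null if the destination type does not have a public instance property with that name. The second where clause filters out those cases, along with destination properties that cannot be written: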
    let destProp = typeof(TDest).GetProperty(
        srcProp.Name,
        BindingFlags.Public | BindingFlags.Instance)
    where (destProp != null) && (destProp.CanWrite)
The projection of the query is a sequence of assignment statements, each of which sets a property of the destination object to the value of the property with the same name in the source object:
    select Expression.Assign(
        Expression.Property(dest, destProp),
        Expression.Property(source, srcProp));
The rest of the method builds the body of the lambda expression. The Block() method of the Expression class needs all the statements in an array of Expression. The next step is to create a List<Expression> where you can add all the statements. The list can be easily converted to an array.
    var body = new List<Expression>();
    body.Add(Expression.Assign(dest, Expression.New(typeof(TDest))));
    body.AddRange(assignments);
    body.Add(dest);
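The final body.Add(dest) matters because the value of a block expression is the value of its last expression. A tiny sketch, using a hypothetical BlockDemo class that is not part of the converter, shows that behavior in isolation:

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper type, for illustration only.
public static class BlockDemo
{
    // Builds and runs the equivalent of: { int total; total = 40; total += 2; total }
    public static int Run()
    {
        var total = Expression.Variable(typeof(int), "total");
        var block = Expression.Block(
            new[] { total },                                       // variables scoped to the block
            Expression.Assign(total, Expression.Constant(40)),
            Expression.AddAssign(total, Expression.Constant(2)),
            total);                                                // last expression is the block's value
        return Expression.Lambda<Func<int>>(block).Compile()();
    }
}
```

Dropping the trailing total would make the block evaluate to the result of the AddAssign instead. In the converter, omitting the trailing dest would leave the last property assignment as the block's value, which generally does not have type TDest.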
Finally, it's time to build a lambda that returns the destination object and contains all the statements built so far:
    var expr = Expression.Lambda<Func<TSource, TDest>>(
        Expression.Block(
            new[] { dest },  // expression parameters
            body.ToArray()   // body
        ),
        source               // lambda expression
    );
That's all the code you need. Time to compile it and turn it into a delegate that you can call:
    var func = expr.Compile();
    converter = func;
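Put together, the pieces might live in a class like the following. This is a sketch under stated assumptions rather than the original class: the two contact types and their properties are invented, the new() constraint is added so that Expression.New(typeof(TDest)) always has a constructor to call, and the property-type check is an extra guard that the listing above does not have.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical contact types, for illustration only.
public class SourceContact
{
    public string Name { get; set; }
    public string Email { get; set; }
    public int InternalId { get; set; }   // no counterpart on the destination
}

public class DestinationContact
{
    public string Name { get; set; }
    public string Email { get; set; }
}

public class Converter<TSource, TDest> where TDest : new()   // constraint is an assumption
{
    private Func<TSource, TDest> converter;

    public TDest ConvertFrom(TSource source)
    {
        createConverterIfNeeded();
        return converter(source);
    }

    private void createConverterIfNeeded()
    {
        if (converter != null)
            return;

        var source = Expression.Parameter(typeof(TSource), "source");
        var dest = Expression.Variable(typeof(TDest), "dest");

        var assignments =
            from srcProp in typeof(TSource).GetProperties(
                BindingFlags.Public | BindingFlags.Instance)
            where srcProp.CanRead
            let destProp = typeof(TDest).GetProperty(
                srcProp.Name, BindingFlags.Public | BindingFlags.Instance)
            where destProp != null && destProp.CanWrite
            // Added guard (assumption): only copy when the property types match exactly.
            where destProp.PropertyType == srcProp.PropertyType
            select Expression.Assign(
                Expression.Property(dest, destProp),
                Expression.Property(source, srcProp));

        var body = new List<Expression>();
        body.Add(Expression.Assign(dest, Expression.New(typeof(TDest))));
        body.AddRange(assignments);
        body.Add(dest);   // the block evaluates to the destination object

        var expr = Expression.Lambda<Func<TSource, TDest>>(
            Expression.Block(new[] { dest }, body.ToArray()),
            source);

        converter = expr.Compile();
    }
}
```

With these types, new Converter<SourceContact, DestinationContact>().ConvertFrom(source) copies Name and Email and ignores InternalId. The delegate is compiled once per converter instance and reused on later calls.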
That is complicated, and it's not the easiest to write. You'll often find compiler-like errors at runtime until you get the expressions built correctly. It's also clearly not the best way to approach simple problems. But even so, the Expression APIs are much simpler than their predecessors in the Reflection APIs. That's when you should use the Expression APIs: When you think you want to use reflection, try to solve the problem using the Expression APIs instead.
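As an illustration of those compiler-like errors at runtime, the sketch below tries to build an assignment whose two sides have incompatible types. The class and method names are hypothetical. The factory method rejects the tree while it is being built, which is the runtime counterpart of a compile-time type error.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper type, for illustration only.
public static class ExpressionErrors
{
    // Tries to build "int target = <string value>" as an expression tree.
    // Returns the exception message from the factory method, or null if nothing was thrown.
    public static string TryMismatchedAssign()
    {
        var target = Expression.Variable(typeof(int), "target");
        var value = Expression.Constant("not a number");
        try
        {
            Expression.Assign(target, value);
            return null;
        }
        catch (ArgumentException ex)
        {
            return ex.Message;
        }
    }
}
```

The C# compiler would flag the equivalent statement before the program ever ran. With expression trees the same mistake surfaces only when the tree-building code executes, so it pays to exercise that code in tests.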
The Expression APIs can be used in two very different ways: You can create methods that take expressions as parameters, which enables you to parse those expressions and create code based on the concepts behind the expressions that were called. Also, the Expression APIs enable you to create code at runtime. You can create classes that write code, and then execute the code they've written. It's a very powerful way to solve some of the more difficult general purpose problems you'll encounter.