Archive for the ‘Reflection’ Category

Ever wonder how to use a sample object to build a LINQ Query Expression that could find that object and others like it? The other day this Stack Overflow question caught my eye because of my experiments with MongoDB and got me thinking about this topic. Let’s say we have an object and we want to use it as a template or pattern to find others like it in a list (e.g. find other Twitter users like me). Let’s further say that we aren’t talking about a class-specific implementation but a generic one. How would we do it?

There are a couple of issues:

  • determine which object properties we want to use for search
  • create the query from those properties
  • apply the query to a set of objects

I’d probably implement the first item with custom key attributes on the object properties because they are easy to change. One could also use a mapping class to target the desired keys. This would be useful if we had different sets of properties we wanted to use at different times.
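As a rough sketch of the attribute idea, something like the following could collect the key properties. SearchKeyAttribute, SearchKeys and TwitterUser are hypothetical names made up for illustration; none of them are part of the code in this post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical marker attribute (an assumption, not from the original code).
[AttributeUsage(AttributeTargets.Property)]
public sealed class SearchKeyAttribute : Attribute
{
}

public static class SearchKeys
{
	// Returns the public instance properties of T that carry [SearchKey].
	public static List<PropertyInfo> For<T>()
	{
		return typeof(T)
			.GetProperties(BindingFlags.Public | BindingFlags.Instance)
			.Where(p => Attribute.IsDefined(p, typeof(SearchKeyAttribute)))
			.ToList();
	}
}

// Example type used only for this sketch.
public class TwitterUser
{
	[SearchKey]
	public string Location { get; set; }

	public string ScreenName { get; set; }
}
```

SearchKeys.For<TwitterUser>() would then yield only the Location property, and changing the search keys means moving an attribute rather than touching the query code.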

The third item becomes trivial if the second provides a LINQ query so that is what we’ll do.

Let’s start with a restatement of the behavior:

When asked to build a query with a single key we should get a non-null query result and
should be able to find matching objects in a list with it.

To start with we need an object that will provide our keys for this exercise:

public class Employee
{
	public string Name { get; set; }
	public int Age { get; set; }
}

Now to restate the desired behavior as a test (using the FluentAssert BDD framework):

	[Test]
	public void Given_a_single_key()
	{
		Test.Given(_queryBuilder)
			.When(asked_to_build_a_query)
			.With(a_single_key)
			.Should(not_get_a_null_query)
			.Should(be_able_to_find_the_employee_with_the_query)
			.Verify();
	}

And start building out the requisite pieces:

	private void asked_to_build_a_query()
	{
		_queryExpression = _queryBuilder.BuildQuery(_employee, _keys);
	}

	private void a_single_key()
	{
		_keys.Add(typeof(Employee).GetProperty("Age"));
	}

	private void not_get_a_null_query()
	{
		_queryExpression.ShouldNotBeNull();
	}

	private void be_able_to_find_the_employee_with_the_query()
	{
		var query = _queryExpression.Compile();
		var result = _employees.FirstOrDefault(query);
		result.ShouldNotBeNull();
		ReferenceEquals(result, _employee).ShouldBeTrue();
	}

	[SetUp]
	public void BeforeEachTest()
	{
		_keys = new List<PropertyInfo>();
		_employee = new Employee
			{
				Name = "John",
				Age = 23
			};

		_employees = new List<Employee>
			{
				new Employee
					{
						Name = "Sarah",
						Age = 34
					},
				new Employee
					{
						Name = "John",
						Age = 55
					},
				_employee
			};
		_queryBuilder = new QueryBuilder();
	}

Then stub the query builder so that the test compiles:

public class QueryBuilder
{
	public Expression<Func<T, bool>> BuildQuery<T>(T item, IEnumerable<PropertyInfo> keys)
	{
		return null;
	}
}

Now that we have a failing test we can start building the implementation. It can be patterned from the answer to the Stack Overflow question as:

	public Expression<Func<T, bool>> BuildQuery<T>(T item, IEnumerable<PropertyInfo> keys)
	{
		var key = keys.First();
		var parameter = Expression.Parameter(typeof(T), "p");
		var len = Expression.PropertyOrField(parameter, key.Name);
		var body = Expression.Equal(len, Expression.Constant(key.GetValue(item, null)));

		var lambda = Expression.Lambda<Func<T, bool>>(body, parameter);

		return lambda;
	}
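One case the version above does not cover is a nullable key such as int?: the value read through reflection comes back as a plain boxed int, so the constant and the property end up with different types. A possible variation, offered as a sketch only, passes the property type to Expression.Constant. TypedConstantSketch and Contractor are hypothetical names used just for this example.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public static class TypedConstantSketch
{
	// Builds p => p.<key> == <value of key on item>, typing the constant
	// as the property type so both sides of the comparison agree.
	public static Expression<Func<T, bool>> BuildSingleKeyQuery<T>(T item, PropertyInfo key)
	{
		var parameter = Expression.Parameter(typeof(T), "p");
		var property = Expression.Property(parameter, key);
		var value = Expression.Constant(key.GetValue(item, null), key.PropertyType);
		var body = Expression.Equal(property, value);
		return Expression.Lambda<Func<T, bool>>(body, parameter);
	}
}

// Hypothetical type with a nullable key, used only for this sketch.
public class Contractor
{
	public int? Badge { get; set; }
}
```

For non-nullable properties such as Age this should behave the same as the original code.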

That wasn’t so hard. Now how do we handle multiple keys? First the failing test to represent the desired behavior:

	[Test]
	public void Given_multiple_keys()
	{
		Test.Given(_queryBuilder)
			.When(asked_to_build_a_query)
			.With(multiple_keys)
			.Should(not_get_a_null_query)
			.Should(be_able_to_find_the_employee_with_the_query)
			.Verify();
	}

and necessary expansion of our test class:

	private void multiple_keys()
	{
		var type = typeof(Employee);
		_keys.Add(type.GetProperty("Name"));
		_keys.Add(type.GetProperty("Age"));
	}

The test fails because BuildQuery only looks at the first key (Name), there are multiple Employees named “John” in the list being searched, and the expected Employee is not the first one matched. How do we change the code to make the query work? For now let’s say it is an OR match on all the keys. This could be something we make configurable later. So first refactor the BuildQuery method to combine all the keys in an OR expression:

	public Expression<Func<T, bool>> BuildQuery<T>(T item, IEnumerable<PropertyInfo> keys)
	{
		var list = keys.Select(key => CreateLambda(item, key)).ToList();
		var lambda = CombineExpressionsWithOr(list);
		return lambda;
	}

	private static Expression<Func<T, bool>> CreateLambda<T>(T item, PropertyInfo key)
	{
		var parameter = Expression.Parameter(typeof(T), "p");
		var len = Expression.PropertyOrField(parameter, key.Name);
		var body = Expression.Equal(len, Expression.Constant(key.GetValue(item, null)));

		var lambda = Expression.Lambda<Func<T, bool>>(body, parameter);

		return lambda;
	}

	private static Expression<Func<T, bool>> CombineExpressionsWithOr<T>(IEnumerable<Expression<Func<T, bool>>> expressions)
	{
		return null;
	}

To OR a bunch of items together with LINQ we have to build a tree of pairs.

An easy way to do that is with a queue. Put them all in a queue, then as long as there is more than one item in the queue pull two, combine them and put the result back. This produces a shallow, roughly balanced tree.

	private static Expression<Func<T, bool>> CombineExpressionsWithOr<T>(IEnumerable<Expression<Func<T, bool>>> expressions)
	{
		var queue = new Queue<Expression<Func<T, bool>>>();
		foreach (var item in expressions)
		{
			queue.Enqueue(item);
		}

		while (queue.Count > 1)
		{
			var item1 = queue.Dequeue();
			var item2 = queue.Dequeue();

			var newItem = Combine(item1, item2);
			queue.Enqueue(newItem);
		}

		return queue.Dequeue();
	}

For the implementation of Combine we can refer to this Stack Overflow question to get the following:

	private static Expression<Func<T, bool>> Combine<T>(Expression<Func<T, bool>> expr1, Expression<Func<T, bool>> expr2)
	{
		var body = Expression.OrElse(expr1.Body, expr2.Body);
		var lambda = Expression.Lambda<Func<T, bool>>(body, expr1.Parameters[0]);
		return lambda;
	}

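When we run the test, however, we get an exception: the body taken from expr2 still refers to its own parameter, which the combined lambda does not declare.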

The critical clue for me was another Stack Overflow answer indicating that we have to use the same Parameter object throughout the LINQ query. This requires a minor refactoring to create it once and pass it around:

	public Expression<Func<T, bool>> BuildQuery<T>(T item, IEnumerable<PropertyInfo> keys)
	{
		var param = Expression.Parameter(typeof(T), "p");
		var list = keys.Select(key => CreateLambda(item, key, param)).ToList();
		var lambda = CombineExpressionsWithOr(list);
		return lambda;
	}

	private static Expression<Func<T, bool>> CreateLambda<T>(T item, PropertyInfo key, ParameterExpression parameter)
	{
		var len = Expression.PropertyOrField(parameter, key.Name);
		var body = Expression.Equal(len, Expression.Constant(key.GetValue(item, null)));

		var lambda = Expression.Lambda<Func<T, bool>>(body, parameter);

		return lambda;
	}
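If the lambdas come from somewhere we cannot hand a shared parameter to, another option is to rewrite the second body onto the first lambda’s parameter before combining. This is only a sketch, assuming .NET 4 or later where ExpressionVisitor is public; OrCombiner and ParameterReplacer are names made up for illustration.

```csharp
using System;
using System.Linq.Expressions;

public static class OrCombiner
{
	// Rewrites every use of one parameter into another.
	private sealed class ParameterReplacer : ExpressionVisitor
	{
		private readonly ParameterExpression _from;
		private readonly ParameterExpression _to;

		public ParameterReplacer(ParameterExpression from, ParameterExpression to)
		{
			_from = from;
			_to = to;
		}

		protected override Expression VisitParameter(ParameterExpression node)
		{
			return node == _from ? _to : base.VisitParameter(node);
		}
	}

	// ORs two independently built predicates by rebinding the second
	// body onto the first lambda's parameter.
	public static Expression<Func<T, bool>> Or<T>(
		Expression<Func<T, bool>> left,
		Expression<Func<T, bool>> right)
	{
		var parameter = left.Parameters[0];
		var rightBody = new ParameterReplacer(right.Parameters[0], parameter).Visit(right.Body);
		var body = Expression.OrElse(left.Body, rightBody);
		return Expression.Lambda<Func<T, bool>>(body, parameter);
	}
}
```

For the QueryBuilder itself, creating the parameter once is simpler; the visitor only earns its keep when the expressions are built elsewhere.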

We also need to make a change to the verification method in the test. Because the query ORs the keys together it may match several employees (and does with the current test data), so we have to check for the expected one among them.

	private void be_able_to_find_the_employee_with_the_query()
	{
		var query = _queryExpression.Compile();
		var result = _employees
			.Where(query)
			.FirstOrDefault(x=>ReferenceEquals(x, _employee));
		result.ShouldNotBeNull();
	}

That’s it. Nothing in the QueryBuilder knows anything about Employees so we can use it generically. We could use reflection to interrogate an Employee for its search key properties then pass both to an instance of the QueryBuilder to get a LINQ query. That query could then be dynamically compiled and used to search a list of Employees that have one or more of the same property values.

One could easily extend the QueryBuilder to support AND queries.
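A minimal sketch of what an AND variant might look like, assuming at least one key is supplied. AndQueryBuilder and BuildAndQuery are hypothetical names, not part of the QueryBuilder above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

public static class AndQueryBuilder
{
	// Builds p => p.Key1 == v1 && p.Key2 == v2 && ... from the template item.
	// Assumes keys contains at least one property; Aggregate throws otherwise.
	public static Expression<Func<T, bool>> BuildAndQuery<T>(T item, IEnumerable<PropertyInfo> keys)
	{
		// One parameter object shared by every comparison.
		var parameter = Expression.Parameter(typeof(T), "p");
		Expression body = keys
			.Select(key => (Expression)Expression.Equal(
				Expression.Property(parameter, key),
				Expression.Constant(key.GetValue(item, null))))
			.Aggregate((left, right) => Expression.AndAlso(left, right));
		return Expression.Lambda<Func<T, bool>>(body, parameter);
	}
}
```

With the test data above and both Name and Age as keys, such a query would match only the 23-year-old John.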

Enjoy!


We recently had occasion to learn a bit more about type initialization in C# due to a test that started failing.

We started off with a number of NamedConstants similar to the following:

public class UserType : NamedConstant<UserType>
{
	public static readonly UserType Admin = new UserType("administrator");
	private UserType(string key)
	{
		// Add takes the key first, then the instance
		Add(key, this);
	}

	public static UserType GetFor(string key)
	{
		return Get(key);
	}
}

The thing to note here is that the static members are created by the type initializer (static constructor) the first time the class is used.

NamedConstant is defined like:

public class NamedConstant<T> where T : NamedConstant<T>
{
	private static readonly Dictionary<string, T> values = new Dictionary<string, T>();

	protected void Add(string key, T item)
	{
		values.Add(key.ToLower(), item);
	}

	protected static T Get(string key)
	{
		if (key == null)
		{
			return null;
		}
		T t;
		values.TryGetValue(key.ToLower(), out t);
		return t;
	}
}

With usage being an attempt to get an instance of T by calling GetFor() as follows:

	var userType = UserType.GetFor(source);

In order to facilitate some magic with FluentNHibernate we needed to move the GetFor() method from the various classes inheriting from NamedConstant into NamedConstant itself.
A funny thing happened… after this refactoring we suddenly had a bunch of failing tests because the call to GetFor() was unexpectedly returning null. When we examined the usages we saw that we now also had ReSharper hints on our calls to GetFor with the hover text “Access to a static member of a type via a derived type” and the suggestion to call the base class directly. Clearly calling the base static method is not magically going to get back an instance of UserType.

Some experimentation led us to the conclusion that the type initializer for UserType was not being called and as a result the values dictionary in the base class was empty. We tried a couple of experiments including calling the type initializer for T from GetFor():

public static T GetFor(string key)
{
	if (values.Count == 0)
	{
		typeof(T).TypeInitializer.Invoke(null, null);
	}
	return Get(key);
}

This introduced us to another odd situation… calling the TypeInitializer in this way causes it to execute twice… and that didn’t play well with our Add() method because the second time around the key would already be in the dictionary and we’d get a duplicate key exception. We definitely did not want to change the way Add works so we experimented more.

It turns out we do know something useful about classes that inherit from NamedConstant – they all have static fields (like UserType.Admin above). This means we could trigger the type initializer by getting the value of one of those fields through reflection:

var fieldInfos = typeof(T).GetFields(BindingFlags.Static | BindingFlags.Public);
fieldInfos[0].GetValue(null);

Reading a static field of T triggers its type initializer. That did the trick and now our GetFor() works like a charm. Here’s the final version:

public static T GetFor(string key)
{
	if (values.Count == 0)
	{
		try
		{
			var fieldInfos = typeof(T).GetFields(BindingFlags.Static | BindingFlags.Public);
			fieldInfos[0].GetValue(null);
		}
		catch
		{
		}
	}
	return Get(key).OrDefault();
}
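Another route that may be worth trying is RuntimeHelpers.RunClassConstructor, which asks the runtime to make sure a type’s initializer has run and does nothing if it already has. The sketch below is an untested-in-our-codebase alternative, not what we shipped; NamedConstantSketch and UserTypeSketch are stand-in names so it can be compiled on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

public class NamedConstantSketch<T> where T : NamedConstantSketch<T>
{
	private static readonly Dictionary<string, T> values = new Dictionary<string, T>();

	protected void Add(string key, T item)
	{
		values.Add(key.ToLower(), item);
	}

	public static T GetFor(string key)
	{
		// Ensures T's type initializer has run; no effect if it already has,
		// so Add is not called a second time for the same key.
		RuntimeHelpers.RunClassConstructor(typeof(T).TypeHandle);
		if (key == null)
		{
			return null;
		}
		T t;
		values.TryGetValue(key.ToLower(), out t);
		return t;
	}
}

public class UserTypeSketch : NamedConstantSketch<UserTypeSketch>
{
	public static readonly UserTypeSketch Admin = new UserTypeSketch("administrator");

	private UserTypeSketch(string key)
	{
		Add(key, this);
	}
}
```

This avoids both the values.Count check and the assumption that the derived class has at least one public static field.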

For more information about type initialization check out Jon Skeet’s post.


A problem I’ve encountered a number of times when writing tests for C# code is that I need the name of a property or method so that I can invoke it through reflection in the tester. It is easy to get a property name through reflection:

public class Sample
{
	public static int Foo { get; set; }
}

[TestFixture]
public class When_asked_to_get_the_name_of_a_Property
{
	[Test]
	public void Should_be_able_to_get_the_name_using_reflection()
	{
		string name = typeof(Sample).GetProperty("Foo").Name;
		Assert.AreEqual("Foo", name);
	}

The problem is that if you rename the property you won’t discover that the code is now broken (due to the property name in the string being passed to GetProperty) until you run it. This makes for brittle tests. Magic strings also show up wherever the framework takes the name of a property or parameter as a string, for example when binding a DataGrid or DropDownList or when constructing an ArgumentException:

private void BindData()
{
	ddlState.DataSource = stateList;
	ddlState.DataTextField = "Name";
	ddlState.DataValueField = "Code";
	ddlState.DataBind();
}

As I haven’t seen a solution for this particular problem anywhere else I’m posting mine here. I use a feature of C# 3.0 and LINQ (introduced with .NET 3.5) to get the property name dynamically. First create a lambda that reads the property and store it as an Expression. Then drill into that expression to get the property name:

	[Test]
	public void Should_be_able_to_get_the_name_using_an_Expression()
	{
		Expression<Func<int>> expression = () => Sample.Foo;
		MemberExpression body = (MemberExpression)expression.Body;
		string name = body.Member.Name;
		Assert.AreEqual("Foo", name);
	}

Next refactor out a reusable method:

	[Test]
	public void Should_be_able_to_get_the_name_using_an_Expression()
	{
		string name = ReflectionUtility.GetPropertyName(() => Sample.Foo);
		Assert.AreEqual("Foo", name);
	}
}

public static class ReflectionUtility
{
	public static string GetPropertyName<T>(Expression<Func<T>> expression)
	{
		MemberExpression body = (MemberExpression) expression.Body;
		return body.Member.Name;
	}
}

This solution suffers somewhat from being obscure in intent due to the lambda, a drawback I try to alleviate with a good method name. So far so good but the code gets a bit uglier when you start working with instance properties instead of static ones because you have to have an instance before you can wrap a lambda around the instance property:

public class Sample2
{
	public int Foo { get; set; }
}

	[Test]
	public void Should_be_able_to_get_the_name_using_an_Expression()
	{
		const Sample2 sample = null;
		string name = ReflectionUtility.GetPropertyName(() => sample.Foo);
		Assert.AreEqual("Foo", name);
	}

If there is no instance handy, however, this clutters the code so how about just using the default:

	[Test]
	public void Should_be_able_to_get_the_name_using_an_Expression()
	{
		string name = ReflectionUtility.GetPropertyName(() => default(Sample2).Foo);
		Assert.AreEqual("Foo", name);
	}

Unfortunately ReSharper now warns that default(Sample2) could be null. I could add comments to make ReSharper ignore the potential null reference exception here but then we’re back to ugly code again. If another method is added to the reflection utility, however:

	public static string GetPropertyName<T, TReturn>(Expression<Func<T, TReturn>> expression)
	{
		MemberExpression body = (MemberExpression)expression.Body;
		return body.Member.Name;
	}

ReSharper’s null reference exception warning can be eliminated:

	[Test]
	public void Should_be_able_to_get_the_name_using_an_Expression()
	{
		string name = ReflectionUtility.GetPropertyName((Sample2 s) => s.Foo);
		Assert.AreEqual("Foo", name);
	}

This solution still isn’t particularly obvious in intent but at least we can do it all on one line without any magic strings or objects that only exist to provide access to the instance property.
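One wrinkle to watch for: if the lambda is typed as returning object (handy when the property types vary), a value-type property such as Foo gets wrapped in a conversion node and the direct cast to MemberExpression fails. A sketch of a more forgiving overload follows; MemberNameSketch is a hypothetical name and this is not the code from the ReflectionUtility above.

```csharp
using System;
using System.Linq.Expressions;

public static class MemberNameSketch
{
	// Returns the member name for lambdas such as s => s.Foo, including
	// the case where a value-type property is boxed to object.
	public static string GetPropertyName<T>(Expression<Func<T, object>> expression)
	{
		Expression body = expression.Body;
		var unary = body as UnaryExpression;
		if (unary != null && unary.NodeType == ExpressionType.Convert)
		{
			// Boxing shows up as a Convert node around the member access; unwrap it.
			body = unary.Operand;
		}

		var member = body as MemberExpression;
		if (member == null)
		{
			throw new ArgumentException("Expected a property or field access.", "expression");
		}
		return member.Member.Name;
	}
}
```

Usage would look like MemberNameSketch.GetPropertyName<Sample2>(s => s.Foo), at the cost of naming the type explicitly.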

Other lines of research… How about an extension method:

public static class Extensions
{
	public static string GetPropertyName<T,TReturn>(this Expression<Func<T,TReturn>> expression)
	{
		MemberExpression body = (MemberExpression)expression.Body;
		return body.Member.Name;
	}
}

	[Test]
	public void Should_be_able_to_get_the_name_using_an_Extension()
	{
		string name = ((Expression<Func<Sample2,int>>)(s => s.Foo)).GetPropertyName();
		Assert.AreEqual("Foo", name);
	}

It works but … is really ugly. I prefer the (Sample2 s) => s.Foo implementation and it looks OK when you only have it here and there in your code but a bunch in a single place really starts to obfuscate the code. So how about encapsulating the ugly part:

public class Sample2
{
	public static class BoundPropertyNames
	{
		// Resharper can collapse the following to a single visual line if so configured
		public static string Foo
		{
			get
			{
				return ReflectionUtility.GetPropertyName((Sample2 s) => s.Foo);
			}
		}
	}

	public int Foo { get; set; }
	// more properties ...
}

	[Test]
	public void Should_be_able_to_get_the_name_using_BoundPropertyNames()
	{
		string name = Sample2.BoundPropertyNames.Foo;
		Assert.AreEqual("Foo", name);
	}

Nifty! The name is still computed at run time, so the compiler cannot turn it into a const, but we can cache it in a static readonly string if we really want to… check this out:

public class Sample2
{
	public static class BoundPropertyNames
	{
		public static readonly string Foo = ((MemberExpression)((Expression<Func<Sample2, int>>)(s => s.Foo)).Body).Member.Name;
	}

	public int Foo { get; set; }
}

The code for doing the same for a method is quite a bit simpler because we can get the method name from a Func directly (ignoring the limitation of getting an instance already discussed above because we usually have one):

public class Sample3
{
	public int Bar()
	{
		return 1;
	}
}

[TestFixture]
public class When_asked_to_get_the_name_of_a_Method
{
	[Test]
	public void Should_be_able_to_get_the_name_using_a_Func()
	{
		// a delegate to an instance method needs a non-null target
		var sample = new Sample3();
		Func<int> method = sample.Bar;
		string name = method.Method.Name;
		Assert.AreEqual("Bar", name);
	}
}
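When no instance is handy, the expression trick used for properties seems to carry over to methods as well. A minimal sketch, with MethodNameSketch and GetMethodName as made-up names:

```csharp
using System;
using System.Linq.Expressions;

public static class MethodNameSketch
{
	// Reads the method name from a call expression such as s => s.Bar().
	// The lambda is never compiled or invoked, so no instance is needed.
	public static string GetMethodName<T, TReturn>(Expression<Func<T, TReturn>> expression)
	{
		var call = expression.Body as MethodCallExpression;
		if (call == null)
		{
			throw new ArgumentException("Expected a method call.", "expression");
		}
		return call.Method.Name;
	}
}
```

For the class above that would read MethodNameSketch.GetMethodName((Sample3 s) => s.Bar()).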

Please let me know if you find a better way.

Get the latest version of this method from the MvbaCore project on GitHub.
