Ever wonder how to use a sample object to build a LINQ query expression that could find that object and others like it? The other day this Stack Overflow question caught my eye because of my experiments with MongoDB, and it got me thinking about the topic. Let’s say we have an object and we want to use it as a template or pattern to find others like it in a list (e.g. find other Twitter users like me). Let’s further say that we aren’t talking about a class-specific implementation but a generic one. How would we do it?

There are three parts to the problem:

  • determine which object properties we want to use for search
  • create the query from those properties
  • apply the query to a set of objects

I’d probably implement the first item with custom key attributes on the object properties because they are easy to change. One could also use a mapping class to target the desired keys. This would be useful if we had different sets of properties we wanted to use at different times.
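As a rough sketch of the attribute idea, the keys could be collected with reflection along these lines. The names SearchKeyAttribute, TaggedEmployee and SearchKeys are made up for illustration and are not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical marker attribute: tags the properties that take part in a search.
[AttributeUsage(AttributeTargets.Property)]
public sealed class SearchKeyAttribute : Attribute
{
}

// Hypothetical sample class, used only to show the tagging.
public class TaggedEmployee
{
	[SearchKey]
	public string Name { get; set; }

	[SearchKey]
	public int Age { get; set; }

	// Not tagged, so it is ignored when the keys are collected.
	public string Notes { get; set; }
}

public static class SearchKeys
{
	// Returns the public instance properties of T that carry [SearchKey].
	public static List<PropertyInfo> For<T>()
	{
		return typeof(T)
			.GetProperties(BindingFlags.Public | BindingFlags.Instance)
			.Where(p => p.IsDefined(typeof(SearchKeyAttribute), false))
			.ToList();
	}
}
```

Calling SearchKeys.For<TaggedEmployee>() would yield the Name and Age properties, which could then be handed to whatever builds the query. Changing which properties count as keys only means moving the attribute.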

The third item becomes trivial if the second provides a LINQ query, so that is what we’ll do.

Let’s start with a restatement of the behavior:

When asked to build a query with a single key we should get a non-null query result and
should be able to find matching objects in a list with it.

To start with we need an object that will provide our keys for this exercise:

public class Employee
{
	public string Name { get; set; }
	public int Age { get; set; }
}

Now to restate the desired behavior as a test (using the FluentAssert BDD framework):

	[Test]
	public void Given_a_single_key()
	{
		Test.Given(_queryBuilder)
			.When(asked_to_build_a_query)
			.With(a_single_key)
			.Should(not_get_a_null_query)
			.Should(be_able_to_find_the_employee_with_the_query)
			.Verify();
	}

And start building out the requisite pieces:

	private void asked_to_build_a_query()
	{
		_queryExpression = _queryBuilder.BuildQuery(_employee, _keys);
	}

	private void a_single_key()
	{
		_keys.Add(typeof(Employee).GetProperty("Age"));
	}

	private void not_get_a_null_query()
	{
		_queryExpression.ShouldNotBeNull();
	}

	private void be_able_to_find_the_employee_with_the_query()
	{
		var query = _queryExpression.Compile();
		var result = _employees.FirstOrDefault(query);
		result.ShouldNotBeNull();
		ReferenceEquals(result, _employee).ShouldBeTrue();
	}

	[SetUp]
	public void BeforeEachTest()
	{
		_keys = new List<PropertyInfo>();
		_employee = new Employee
			{
				Name = "John",
				Age = 23
			};

		_employees = new List<Employee>
			{
				new Employee
					{
						Name = "Sarah",
						Age = 34
					},
				new Employee
					{
						Name = "John",
						Age = 55
					},
				_employee
			};
		_queryBuilder = new QueryBuilder();
	}

Then stub the query builder so that everything compiles:

public class QueryBuilder
{
	public Expression<Func<T, bool>> BuildQuery<T>(T item, IEnumerable<PropertyInfo> keys)
	{
		return null;
	}
}

Now that we have a failing test we can start building the implementation. It can be patterned on the answer to the Stack Overflow question:

	public Expression<Func<T, bool>> BuildQuery<T>(T item, IEnumerable<PropertyInfo> keys)
	{
		var key = keys.First();
		var parameter = Expression.Parameter(typeof(T), "p");
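		var property = Expression.Property(parameter, key);
		// Type the constant as the property type so that a null value still compares cleanly.
		var value = Expression.Constant(key.GetValue(item, null), key.PropertyType);
		var body = Expression.Equal(property, value);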

		var lambda = Expression.Lambda<Func<T, bool>>(body, parameter);

		return lambda;
	}

That wasn’t so hard. Now how do we handle multiple keys? First the failing test to represent the desired behavior:

	[Test]
	public void Given_multiple_keys()
	{
		Test.Given(_queryBuilder)
			.When(asked_to_build_a_query)
			.With(multiple_keys)
			.Should(not_get_a_null_query)
			.Should(be_able_to_find_the_employee_with_the_query)
			.Verify();
	}

And the necessary addition to our test class:

	private void multiple_keys()
	{
		var type = typeof(Employee);
		_keys.Add(type.GetProperty("Name"));
		_keys.Add(type.GetProperty("Age"));
	}

The test fails because BuildQuery only looks at the first key, Name, and there are multiple Employees named “John” in the list being searched, so the expected Employee is not the first one matched. How do we change the code to make the query work? For now let’s say it is an OR match on all the keys. This could be something we make configurable later. So first refactor the BuildQuery method to combine all the keys in an OR expression:

	public Expression<Func<T, bool>> BuildQuery<T>(T item, IEnumerable<PropertyInfo> keys)
	{
		var list = keys.Select(key => CreateLambda(item, key)).ToList();
		var lambda = CombineExpressionsWithOr(list);
		return lambda;
	}

	private static Expression<Func<T, bool>> CreateLambda<T>(T item, PropertyInfo key)
	{
		var parameter = Expression.Parameter(typeof(T), "p");
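		var property = Expression.Property(parameter, key);
		// Type the constant as the property type so that a null value still compares cleanly.
		var value = Expression.Constant(key.GetValue(item, null), key.PropertyType);
		var body = Expression.Equal(property, value);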

		var lambda = Expression.Lambda<Func<T, bool>>(body, parameter);

		return lambda;
	}

	private static Expression<Func<T, bool>> CombineExpressionsWithOr<T>(IEnumerable<Expression<Func<T, bool>>> expressions)
	{
		return null;
	}

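Expression.OrElse is a binary operation, so to OR a bunch of items together we have to build a tree of pairs.

An easy way to do that is with a queue. Put them all in a queue, then as long as there is more than one item in the queue pull two, combine them and put the result back. This makes a short, balanced tree whose depth grows roughly with the logarithm of the number of keys rather than with the number of keys itself.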

	private static Expression<Func<T, bool>> CombineExpressionsWithOr<T>(IEnumerable<Expression<Func<T, bool>>> expressions)
	{
		// Seed the queue with one lambda per key.
		var queue = new Queue<Expression<Func<T, bool>>>(expressions);

		// Keep ORing the two oldest entries together until only the root remains.
		while (queue.Count > 1)
		{
			var item1 = queue.Dequeue();
			var item2 = queue.Dequeue();

			var newItem = Combine(item1, item2);
			queue.Enqueue(newItem);
		}

		// Throws InvalidOperationException if no keys were supplied.
		return queue.Dequeue();
	}

For the implementation of Combine we can refer to this Stack Overflow question to get the following:

	private static Expression<Func<T, bool>> Combine<T>(Expression<Func<T, bool>> expr1, Expression<Func<T, bool>> expr2)
	{
		var body = Expression.OrElse(expr1.Body, expr2.Body);
		var lambda = Expression.Lambda<Func<T, bool>>(body, expr1.Parameters[0]);
		return lambda;
	}

When we run the test, however, compiling the combined expression throws an exception.
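The failure can be reproduced in isolation. The sketch below is not the post's code: it uses a hypothetical Item class and builds two comparisons, each with its own parameter, then combines them the same way Combine does. As far as I can tell, Compile() then throws an InvalidOperationException complaining that the variable 'p' is referenced from a scope where it is not defined.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical sample class, used only for this demonstration.
public class Item
{
	public string Name { get; set; }
	public int Age { get; set; }
}

public static class ParameterMismatchDemo
{
	// Returns true when compiling the naively combined lambda throws.
	public static bool CompileFails()
	{
		// Both parameters are named "p", but they are two distinct instances.
		var p1 = Expression.Parameter(typeof(Item), "p");
		var p2 = Expression.Parameter(typeof(Item), "p");

		var byName = Expression.Equal(Expression.Property(p1, "Name"), Expression.Constant("John"));
		var byAge = Expression.Equal(Expression.Property(p2, "Age"), Expression.Constant(23));

		// The lambda only declares p1, yet byAge still refers to p2.
		var body = Expression.OrElse(byName, byAge);
		var combined = Expression.Lambda<Func<Item, bool>>(body, p1);

		try
		{
			combined.Compile();
			return false;
		}
		catch (InvalidOperationException)
		{
			return true;
		}
	}
}
```

Note that building the lambda succeeds. The mismatch only surfaces when the tree is compiled, which is why the test fails in the verification step rather than in the build step.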

The critical clue for me was another Stack Overflow answer indicating that we have to use the same ParameterExpression instance throughout the expression tree. Two parameters with the same name are still different parameters. This requires a minor refactoring to create it once and pass it around:

	public Expression<Func<T, bool>> BuildQuery<T>(T item, IEnumerable<PropertyInfo> keys)
	{
		var param = Expression.Parameter(typeof(T), "p");
		var list = keys.Select(key => CreateLambda(item, key, param)).ToList();
		var lambda = CombineExpressionsWithOr(list);
		return lambda;
	}

	private static Expression<Func<T, bool>> CreateLambda<T>(T item, PropertyInfo key, ParameterExpression parameter)
	{
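		var property = Expression.Property(parameter, key);
		// Type the constant as the property type so that a null value still compares cleanly.
		var value = Expression.Constant(key.GetValue(item, null), key.PropertyType);
		var body = Expression.Equal(property, value);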

		var lambda = Expression.Lambda<Func<T, bool>>(body, parameter);

		return lambda;
	}

We also need to make a change to the verification method in the test. Because the keys are ORed together, the query may match more than one Employee (and does with the current test data, since both Johns match on Name), so we have to check that the expected one is among the results.

	private void be_able_to_find_the_employee_with_the_query()
	{
		var query = _queryExpression.Compile();
		var result = _employees
			.Where(query)
			.FirstOrDefault(x => ReferenceEquals(x, _employee));
		result.ShouldNotBeNull();
	}

That’s it. Nothing in the QueryBuilder knows anything about Employees so we can use it generically. We could use reflection to interrogate an Employee for its search key properties then pass both to an instance of the QueryBuilder to get a LINQ query. That query could then be dynamically compiled and used to search a list of Employees that have one or more of the same property values.

One could easily extend the QueryBuilder to support AND queries.
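Here is one way that extension might look, as a sketch rather than a finished design. The MatchMode enum and FlexibleQueryBuilder class are names invented for this example, and for brevity it chains the comparisons left to right with Aggregate instead of using the queue-based tree from above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Same shape as the Employee class used in the tests.
public class Employee
{
	public string Name { get; set; }
	public int Age { get; set; }
}

// Hypothetical switch between OR (Any) and AND (All) matching.
public enum MatchMode
{
	Any,
	All
}

public static class FlexibleQueryBuilder
{
	public static Expression<Func<T, bool>> BuildQuery<T>(
		T item, IEnumerable<PropertyInfo> keys, MatchMode mode)
	{
		// One shared parameter for every comparison.
		var parameter = Expression.Parameter(typeof(T), "p");

		// Builds p.Key == <value of Key on the template item> for each key.
		var comparisons = keys
			.Select(key => (Expression)Expression.Equal(
				Expression.Property(parameter, key),
				Expression.Constant(key.GetValue(item, null), key.PropertyType)))
			.ToList();

		if (comparisons.Count == 0)
		{
			throw new ArgumentException("At least one key is required.", "keys");
		}

		// Left-to-right chain joined with && or || depending on the mode.
		var body = comparisons.Aggregate((left, right) =>
			mode == MatchMode.All
				? Expression.AndAlso(left, right)
				: Expression.OrElse(left, right));

		return Expression.Lambda<Func<T, bool>>(body, parameter);
	}
}
```

With the test data from this post and both Name and Age as keys, MatchMode.All should match only the template employee, while MatchMode.Any should also match the other John.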

Enjoy!
