Merging Expression Trees to Reuse in Linq Queries

Reusing parts of LINQ queries is easily done by chaining methods returning IQueryable. Unfortunately that doesn’t cover some senarios like reusing member initialization lists. To handle such a situation I wrote a Merge extension method for select expressions.

For this post we’ll load the data required for a list of cars and a detail view for one car. When loading the details, more fields are required than when just loading the list. With the merging helper method we can reuse the query for the basic list and just extend it with the added properties.

The basic info for the list view is held in a CarBasicInfo DTO class. The data for the detail view is held in a CarExtendedInfo class which is derived from the CarBasicInfo class. A helper method contains the Select() call to map Car entities to the basic DTO. Now the Merge extension method can be used to merge an expression with the initialization of the additional fields for the extended DTO with the existing helper method for the basic fields.

private Expression<Func<Car, CarBasicInfo>> basicSelect =
    c => new CarBasicInfo
    {
        CarId = c.CarId,
        RegistrationNumber = c.RegistrationNumber
    };
 
var car = ctx.Cars.Select(basicSelect.Merge(c => new CarExtendedInfo
{
    Color = c.Color,
    BrandName = c.Brand.Name
})).Single(c => c.RegistrationNumber == "ABC123");

The RegistrationNumber field is not mentioned in the init list – it is populated by the basic select in the SelectBasicInfo helper method. To accomplish this some (non trivial) rewriting of expression trees is in the Merge method.

Null Sematics in LINQ’s Sum

C# LINQ code is not always executed as C# code. With LINQ-to-SQL or Entity Framework it is translated into SQL instead. This can bring unpleasant surprises when things that can’t happen do happen.

One day when you’re deep in the zone coding a marketing specialist shows up with a haunted look because the CEO wants some numbers right now. You know, the CEO wants it NOW as in IMMEDIATELY, within 60 seconds. Well, you’re already kicked out of the zone so let’s fix those numbers. The CEO wants to know the total number of seats in the company’s blue cars (I don’t understand why that’s relevant – but the CEO is the boss so let’s just do it (besides it makes a good example of what I want to show. All right?))

The explanation already took 15 seconds – so now you’re down to 45 seconds to produce the numbers for the CEO. Being a C#/LINQ hacker the fastest tool is of course LINQPad.

5 seconds later (you’re fast at the keyboard!) you’ve got the query done:

Cars.Where(c => c.Color == "Blue").Sum(c => c.Seats)

5 more seconds to compile and run and you’ve got the magic number – 14. The marketing specialist runs off, happy to make it within the required 60 seconds. You get back to coding. The moment when you’ve got into the zone the marketing specialist shows up again (interesting how they always show up exactly when the solution to that hard problem is within reach).

Now the CEO wants the numbers for Green cars as well to tomorrow’s meeting. As it looks like a minor thing, it’s best to do it right away. You switch back to LINQPad and change the color in the Where(...) call and hits F5. This is when everything blows up in your face with an InvalidOperationException.

The null value cannot be assigned to a member with type System.Int32 which is a non-nullable value type.

At least the deadline is more than 60 seconds this time. You’ve got all night to figure out what happened. To save you from a sleepless night, I’ll let you know both what happened and what to do about it.

Always Check Generated SQL

OR-Mappers are great for making data access easier, but they are not an excuse for learning SQL. As an application developer, I’m responsible for all the SQL queries issued by the application. If I use an ORM, it’s my responsibility to check the generated SQL.

I’ve already written another post named Always Check Generated SQL back in November last year, but this is not a duplicate. Back then I examined LINQ to SQL. In this post I test the same behaviour in Entity Framework (the code won’t be exactly the same, because my EF and LINQ to SQL test databases are not identical).

The behaviour that’s up to test is how LINQ’s group by clause is translated into SQL. The problem for the SQL generator is that LINQ is more powerful than SQL. When doing a group by in LINQ, nested collections are produced which SQL can’t handle.

from car in Cars
group car by car.Color into colors
select colors

The query produces nested collections (graphics generated in LINQPad). When faced with that query, LINQ to SQL creates a horrible SQL translation. First the colors are fetched, then the set of cars is fetched in a separate query for each car.
Entity Framework clearly has a better query generator. It creates one single query.

IQueryable Read Model Extension Methods

The normalized data model of the database is often not suitable for reading and displaying data. A separate read model used to represent all the data needed to display a page improves performance.

Defining the read model is only half the work though, to make it really usable the read model should accept queries in the same way as the write model (the DB Entities) does. In .NET/C# that means that the read model should implement IQueryable<T> to enable it to be queried with LINQ. By building the read model on top of the write model it becomes a breeze.

from c in ctx.Cars.SelectCarReadModel()
where c.Car.CarId == id
select c

A call to an extension method is all that’s needed – the query above produces a CarReadModel result, containing the additional data needed for displaying that’s not directly part of the Car entity.

Separating the Read Model

A typical architecture of a .NET web application is to use EF Code First for data access and MVC to render the web pages. The data model in the database is usually (and should be!) normalized. A normalized data model is also great for updates, but when displaying data it is not enough. E.g. in a table of cars I don’t want to display a numeric, database internal id of the car’s brand. I want to display the name of the brand. Creating a separate read model simplifies that.

Separating the read and write models are a key concept of the recently popular CQRS (Command Query Responsibility Separation) architecture. I won’t go as far as the CQRS model does, but rather show a simple way to dress the write model’s car entity with the values required for displaying.

My key objective is to get a model where I can get everything needed for rendering a view to the user in one fetch from the database, with a minimum of extra coding and mapping code.

Software Development is a Job – Coding is a Passion

I'm Anders Abel, a systems architect and developer working for Kentor in Stockholm, Sweden.

profile for Anders Abel at Stack Overflow, Q&A for professional and enthusiast programmers

The complete code for all posts is available on GitHub.

Popular Posts

Archives

Series

Powered by WordPress with the Passion for Coding theme.