MVC and Entity Framework scaffolding are powerful, but I prefer to know what actually happens before accepting generated code. The code generated to update an entity after an edit looked interesting, so I decided to take a closer look at it.
When an entity is edited, the data is posted as a simple collection of form fields from the web browser. Remember that when working with MVC, we’re closer to the metal than with Web Forms and fully exposed to the stateless nature of the web. The gigantic hidden ViewState form field is gone, leaving us in a clean but stateless world. When the updated data is posted to the server, the server has no more information about it than what is available in the request. It doesn’t know where the data came from and has to work out itself where, how and what part of it to update.
Tracking changes for objects that move across tiers, disconnected from their original data source, is always tricky. I’m still looking for the silver bullet for this problem, and one step in the search is to investigate what others have done. In this post, I’ll take a closer look at how the scaffolded MVC code handles updates to EF entities.
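For a hypothetical Car entity, the scaffolded POST edit action looks roughly like this (the entity and context names are illustrative, not taken from any specific project):

```csharp
// Sketch of an MVC/EF scaffolded edit action for a hypothetical Car entity.
// db is the DbContext field that the scaffolder generates on the controller.
[HttpPost]
public ActionResult Edit(Car car)
{
    if (ModelState.IsValid)
    {
        // Attach the posted entity and mark it as modified. Since the
        // server has no original values to compare against, every
        // property is considered changed and included in the UPDATE.
        db.Entry(car).State = EntityState.Modified;
        db.SaveChanges();
        return RedirectToAction("Index");
    }
    return View(car);
}
```

The interesting part is the single line setting the entity state: it is how the scaffolded code copes with the disconnected, stateless scenario described above.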
An idempotent function gives the same result even if it is applied several times. That is exactly how a database update script should behave: it shouldn’t matter whether it is run once or multiple times, the result should be the same.
A database update script should be made to first check the state of the database and then apply the changes needed. If the script is done this way, several operations can be combined into one script that works on several databases despite the databases being at different (possibly unknown) state to start with.
For the database schema itself I usually use Visual Studio 2010 database projects, which handle updates automatically (in VS2012 the functionality has been changed significantly). Even with the schema updates handled automatically, there are always things that need manual handling. One common case is lookup tables that need initialization.
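An idempotent initialization of a lookup table can be sketched like this (the CarColors table and its rows are made up for the example):

```sql
-- Idempotent lookup table initialization: only insert the rows that
-- are missing, so the script can safely be run any number of times
-- against databases in different states.
-- (Table name and values are hypothetical.)
IF NOT EXISTS (SELECT * FROM CarColors WHERE ColorId = 1)
    INSERT INTO CarColors (ColorId, Name) VALUES (1, 'Red');
IF NOT EXISTS (SELECT * FROM CarColors WHERE ColorId = 2)
    INSERT INTO CarColors (ColorId, Name) VALUES (2, 'Blue');
```

Because each statement first checks the state of the database, several such operations can be concatenated into one script that brings any database up to date, regardless of its starting point.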
OR-mappers are great for making data access easier, but they are not an excuse for not learning SQL. As an application developer, I’m responsible for all the SQL queries issued by the application. If I use an ORM, it’s my responsibility to check the generated SQL.
I’ve already written another post named Always Check Generated SQL back in November last year, but this is not a duplicate. Back then I examined LINQ to SQL. In this post I test the same behaviour in Entity Framework (the code won’t be exactly the same, because my EF and LINQ to SQL test databases are not identical).
The behaviour that’s up for testing is how LINQ’s group by clause is translated into SQL. The problem for the SQL generator is that LINQ is more powerful than SQL: a group by in LINQ produces nested collections, which SQL can’t handle.
from car in Cars
group car by car.Color into colors
select colors
The query produces nested collections (graphics generated in LINQPad). When faced with that query, LINQ to SQL creates a horrible SQL translation: first the colors are fetched, then the set of cars is fetched in a separate query for each color.
Entity Framework clearly has a better query generator. It creates one single query.
The normalized data model of the database is often not suitable for reading and displaying data. A separate read model used to represent all the data needed to display a page improves performance.
Defining the read model is only half the work though; to make it really usable, the read model should accept queries in the same way as the write model (the DB entities) does. In .NET/C# that means the read model should implement IQueryable<T>, so that it can be queried with LINQ. By building the read model on top of the write model, that becomes a breeze.
from c in ctx.Cars.SelectCarReadModel()
where c.Car.CarId == id
select c
A call to an extension method is all that’s needed – the query above produces a CarReadModel result, containing the additional data needed for display that’s not directly part of the Car entity.
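The SelectCarReadModel extension method itself could be written along these lines; CarReadModel’s members (here, an owner’s name alongside the car) are assumptions made for the example:

```csharp
// Sketch of the extension method - CarReadModel and its members are
// assumptions for illustration, not taken from the original post.
public static class CarReadModelExtensions
{
    public static IQueryable<CarReadModel> SelectCarReadModel(
        this IQueryable<Car> cars)
    {
        // Projecting inside the IQueryable keeps the query composable:
        // where/orderby clauses added afterwards are still translated
        // to SQL by Entity Framework, so only the rows and columns
        // actually needed are fetched from the database.
        return cars.Select(c => new CarReadModel
        {
            Car = c,
            OwnerName = c.Owner.Name
        });
    }
}
```

The key design point is that the method returns IQueryable<CarReadModel> rather than a materialized list, which is what lets the read model be queried in the same way as the write model.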
One of the features of Entity Framework Code First is to automatically create the database on first access. It is a convenient approach in many cases. Unfortunately, the applications I work with do not fall into one of those cases. The applications I develop typically run on a dedicated web server, with the database hosted in a shared MSSQL environment somewhere nearby.
The shared MSSQL environment is often set up using advanced clustering and virtualization technologies to maximise performance and reliability. In such an environment my code seldom has sufficient access rights to create a database, and even if it did, automatically created databases are not an option. Databases should be created by a DBA who carefully chooses the right name for the database, the right place for data and transaction log files, and applies the right growth settings for the files.
To work with the shared MSSQL environment, the application shouldn’t do any automatic creation of, or updates to, the database schema. On startup I only want the application to check two things:
- that the database exists
- that the database is compatible with the model in the code.
Only after those checks have passed do I want the application to run. A custom database initialization strategy can do those checks and throw exceptions that prevent the application from running if there is no existing, compatible database.
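Such a verify-only strategy can be sketched as an IDatabaseInitializer implementation; the context name CarsContext is made up for the example:

```csharp
// Sketch of a verify-only initializer, assuming an EF Code First
// DbContext named CarsContext (the name is hypothetical).
public class VerifyDatabaseInitializer : IDatabaseInitializer<CarsContext>
{
    public void InitializeDatabase(CarsContext context)
    {
        // First check: the database must exist. No attempt is made
        // to create it - that is the DBA's job.
        if (!context.Database.Exists())
        {
            throw new InvalidOperationException(
                "The database does not exist. It has to be created by a DBA.");
        }

        // Second check: the schema must be compatible with the model.
        // CompatibleWithModel compares the model hash stored in the
        // database with the model compiled into the code.
        if (!context.Database.CompatibleWithModel(throwIfNoMetadata: true))
        {
            throw new InvalidOperationException(
                "The database schema is not compatible with the code model.");
        }
    }
}

// Registered at application startup, e.g. in Application_Start:
// Database.SetInitializer(new VerifyDatabaseInitializer());
```

Since the initializer runs on first access to the context, a failed check stops the application before any queries hit an incompatible database.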