MVC and Entity Framework scaffolding is powerful, but I prefer to know what actually happens before accepting generated code. The code generated to update an entity when an edit has been done looked interesting, so I decided to have a closer look at it.
When an edit is done to an entity, the data is posted as a simple collection of form elements from the web browser. Remember that when working with MVC, we’re closer to the metal than with Web Forms and we are fully exposed to the stateless nature of the web. The gigantic hidden ViewState form field is gone, leaving us in a clean, but stateless world. When the updated data is posted to the server, the server has no more information about it than what is available in the request. The server doesn’t know where the data came from and has to work out for itself where, how and which parts of it to update.
Tracking Changes
Tracking changes for objects that move across tiers, disconnected from their original data source, is always tricky. I’m still looking for the silver bullet for this problem, and one step in the search is to investigate what others have done. In this post, I’ll take a closer look at how the scaffolded MVC code handles updates to EF entities.
When an edit is done to an entity in an MVC project, the updated data is sent to the server as a plain collection of form elements. When it hits the server, some magic occurs before control is handed over to the controller action. The magician is the model binder, which creates an object and populates its properties with the corresponding data from the form collection. It is really not that magical. I assume that the model binder is little more than a loop over the posted fields, using reflection to assign each value to the property of the same name in the target object.
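To make that assumption concrete, here is a minimal sketch of such a loop. It is not the real MVC model binder, which does considerably more (validation, nested types, value providers). `NaiveModelBinder` and this `Car` class are hypothetical names made up for illustration; the property names mirror the columns of the Cars table used in this post.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Reflection;

// Hypothetical entity used for illustration only.
public class Car
{
    public int CarId { get; set; }
    public int BrandId { get; set; }
    public string RegistrationNumber { get; set; }
    public int TopSpeed { get; set; }
    public string Color { get; set; }
    public int Seats { get; set; }
}

// A naive stand-in for a model binder (an assumption, not MVC's implementation).
public static class NaiveModelBinder
{
    // Creates a brand new T and copies each posted value to the writable
    // public property with the same name (case-insensitive).
    // Posted fields without a matching property are ignored.
    public static T Bind<T>(IDictionary<string, string> form) where T : class, new()
    {
        var target = new T();
        foreach (KeyValuePair<string, string> field in form)
        {
            PropertyInfo property = typeof(T).GetProperty(
                field.Key,
                BindingFlags.Public | BindingFlags.Instance | BindingFlags.IgnoreCase);

            if (property == null || !property.CanWrite)
            {
                continue;
            }

            // Convert the posted string to the property's type (int, string, ...).
            object value = Convert.ChangeType(
                field.Value, property.PropertyType, CultureInfo.InvariantCulture);
            property.SetValue(target, value, null);
        }
        return target;
    }
}
```

Calling `NaiveModelBinder.Bind<Car>(form)` with a dictionary holding `CarId`, `Color` and `TopSpeed` returns a new `Car` with those three properties set and every other property left at its default. Nothing in the result points back to the object that was used to render the view.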
When control reaches the controller method, the data from the form is neatly loaded into an object. The problem is that the object is brand new, having no connection whatsoever to the original object used to render the view when the data was sent to the client. Neither is there any kind of connection to any DbContext that knows how to write the updated data to the database.
It’s time to have a look at the scaffolded code to see how it saves the data to the database.
public ActionResult Edit(Car car)
{
    if (ModelState.IsValid)
    {
        db.Entry(car).State = EntityState.Modified;
        db.SaveChanges();
        return RedirectToAction("Index");
    }
    return View(car);
}
There is one thing that stands out as peculiar to me here: the existing data is never read from the database. How can Entity Framework possibly know which fields were changed, if any at all? Proper change tracking is a key concept of any ORM; without it there is no way to know which fields have to be updated. In this case no state is transferred in the request (remember Web Forms ViewState?) and the server keeps no state either. So how does EF know what to update?
The answer is that EF doesn’t care. EF just updates everything:
EXEC sp_executesql N'update [dbo].[Cars]
set [BrandId] = @0, [RegistrationNumber] = @1, [TopSpeed] = @2, [Color] = @3, [Seats] = @4
where ([CarId] = @5)
',
N'@0 int,@1 nvarchar(6),@2 int,@3 nvarchar(20),@4 int,@5 int',
@0=7,@1=N'ABC123',@2=210,@3=N'Red',@4=5,@5=4
At first I was shocked that there is no check against the previous state, but the more I think about it, the more this looks like a good way to go. More selective updates, writing only the fields that actually changed, would probably be better. But to know which fields to update, the current data first has to be retrieved from the database. Exchanging one maybe-too-large update for two calls (read + selective update) is a step in the wrong direction unless the data is huge. The latency penalty of another round trip to the database is typically much larger than the overhead of writing a little too much.
When the update is executed in the database, I assume that any write that sets a column to the value it already holds is discarded anyway.
Entering the Real World
For a really simple application the generated code is a viable solution. For real-world applications, I doubt it. I very often find that there is a need for fine-grained access control or business rule validation when doing updates. A certain update might not be allowed unless the entity is in a specific current state (a flag is set). Only the owner of the object may update it. Those kinds of decisions require the current entity values to be read first. To me, that is a more natural way of working with an ORM: read, update, write back.
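A read, update, write back version of the Edit action could look roughly like the sketch below. It assumes the EF `DbContext` API and ASP.NET MVC 3 or later. The `IsLocked` flag, the `OwnerName` property, the `CarContext` class and the `CarEditRules` helper are hypothetical, invented here to illustrate the two kinds of rules mentioned above.

```csharp
using System.Data.Entity;
using System.Web.Mvc;

public class Car
{
    public int CarId { get; set; }
    public int BrandId { get; set; }
    public string RegistrationNumber { get; set; }
    public int TopSpeed { get; set; }
    public string Color { get; set; }
    public int Seats { get; set; }
    public bool IsLocked { get; set; }      // hypothetical state flag
    public string OwnerName { get; set; }   // hypothetical owner
}

public class CarContext : DbContext        // hypothetical context
{
    public DbSet<Car> Cars { get; set; }
}

public static class CarEditRules           // hypothetical business rules
{
    // Decided from the stored entity, never from the posted values.
    public static bool MayEdit(Car existing, string userName)
    {
        return !existing.IsLocked && existing.OwnerName == userName;
    }
}

public class CarsController : Controller
{
    private readonly CarContext db = new CarContext();

    [HttpPost]
    public ActionResult Edit(Car car)
    {
        if (!ModelState.IsValid)
        {
            return View(car);
        }

        // Read: load the current values; the returned entity is tracked.
        Car existing = db.Cars.Find(car.CarId);
        if (existing == null)
        {
            return HttpNotFound();
        }

        if (!CarEditRules.MayEdit(existing, User.Identity.Name))
        {
            return new HttpStatusCodeResult(403);
        }

        // Update: copy only the fields the form is allowed to change.
        existing.BrandId = car.BrandId;
        existing.RegistrationNumber = car.RegistrationNumber;
        existing.TopSpeed = car.TopSpeed;
        existing.Color = car.Color;
        existing.Seats = car.Seats;

        // Write back: EF compares against the values it read and
        // updates only the columns that actually changed.
        db.SaveChanges();
        return RedirectToAction("Index");
    }
}
```

The price is the extra round trip for `Find`. In return, the rules are checked against what is actually stored, and a posted form cannot overwrite `IsLocked` or `OwnerName`, because those properties are never copied from the posted object.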
For simple updates I’ll definitely adopt the method used by the scaffolded code, but as with any tool it has to be used under the right premises. If I need better control over security or data validation, or need to update only the columns that actually changed, I’ll stick to read-update-write.
Also, if I am not mistaken, another problem I see in the generated line:
db.Entry(car).State = EntityState.Modified
is that it will only update the simple properties of the Car object and will not update any complex (navigation) properties or collection properties that the Car class may have!
It’s not a big issue, but in real-world apps the Car class may, for example, contain a property “List Owners”, and the UI/client may have added to or modified this “Owners” list, so on the server your suggested “read-update-write” pattern is a MUST.