SQL Server views can get really weird when an underlying table is updated. I’ll show some examples and also the one must do for views to avoid problems.
The broken views problem is something I discovered when working on a production system. When I first observed it I was shocked. When I dug deeper into it I was even more shocked at the pure evilness this behaviour could cause. But let’s take it from the start with a simple table of colours and a view on top of it to start with.
CREATE TABLE [dbo].[Colours] ( [Id] INT IDENTITY (1, 1) NOT NULL, [Colour] NVARCHAR (10) NOT NULL, CONSTRAINT [PK_Colours] PRIMARY KEY CLUSTERED ([Id] ASC) ); GO CREATE VIEW [dbo].[BadView] AS SELECT * FROM Colours |
Populate the table with some data and test the view.
SELECT * FROM BadView |
Id Colour ----------- ---------- 1 Red 2 Green 3 Blue 4 White 5 Black
Now, let’s break havoc to the view. I’ll recreate the table with another column RGB
which is inserted right after the Id
column. The Colour
column is renamed to ColourName
. Before running the next test, I’ve inserted the appropriate RGB values in the new column.
SELECT * FROM BadView |
Id Colour ----------- ------ 1 FF0000 2 00FF00 3 0000FF 4 FFFFFF 5 000000
This is what I consider weird behaviour. The output of the view still contains two columns with the names Id
and Colour
. That would be good – if it contained the right data. It doesn’t. The data is taken from the second column of the underlying table. If you’re not scared enough by this I have some more, even worse things to show…
When misaligned column names and data is discovered it must of course be handled. All problems can be solved in different ways. One way I found in a production system was a select clause like this.
SELECT Id, Colour AS RGB FROM BadView |
It works. It produces output with the right column name for the data. My feelings and personal thoughts of that approach are not suitable for printing out.
Id RGB ----------- ------ 1 FF0000 2 00FF00 3 0000FF 4 FFFFFF 5 000000
It was very obvious at this point that the system was brittle and that I would have to be extremely careful during the necessary clean up. Scripting out the existing definition of a view and rerunning it should be safe. Scripting a database object is a way to capture the current definition of it, to be able to restore it. It’s an no-op, right. RIGHT? So that’s what I did (on a backup copy of the DB).
ALTER VIEW [dbo].[BadView] AS SELECT * FROM Colours |
Then I once again tested the view.
SELECT * FROM BadView |
Id RGB ColourName ----------- ------ ---------- 1 FF0000 Red 2 00FF00 Green 3 0000FF Blue 4 FFFFFF White 5 000000 Black
I now was not only scared. I was horrified. The database is not just showing some really weird behaviour. It keeps information about the view, that is not included in the scripted definition of the object. Recreating it with exactly the same script won’t give the same result.
To me this is a horror story. Walking down the road of legacy code on the way to maintenance hell.
Time to look at the cure for this weirdness.
The SCHEMABINDING cure
When I created the BadView
I also created a GoodView
that applied WITH SCHEMABINDING
. When a view is schema bound all objects (tables, functions) that the view is dependent on are locked against modification.
First I tried to create the GoodView
just like BadView
.
CREATE VIEW [dbo].[GoodView] WITH SCHEMABINDING AS SELECT * FROM Colours |
That failed.
Msg 1054, Level 15, State 6, Procedure GoodView, Line 3 Syntax '*' is not allowed in schema-bound objects.
With schema binding, SQL server requires a more specific and safer select statement.
CREATE VIEW [dbo].[GoodView] WITH SCHEMABINDING AS SELECT Id, Colour FROM dbo.Colours |
When doing the rename of the Colour
column to ColourName
the rename operation fails.
Msg 5074, Level 16, State 1, Line 3 The object 'GoodView' is dependent on column 'ColourName'. Msg 4922, Level 16, State 9, Line 3 ALTER TABLE DROP COLUMN ColourName failed because one or more objects access this column.
The message is from the SQL server engine itself. The schemabound view is protected from breaking changes of the underlying table regardless of what client tool is used.
Tooling Support is Required
I found an article: SCHEMABINDING – use or not to use that discusses the consequences of schema binding. I think that the points are valid, but it fails to mention that good tooling support can handle the problems by automatically dropping and recreating any views that are schema bound to a table that is changed. The database deployment support in Visual Studio 2010 and 2012 (much improved in 2012) is an example of a tool that handles this.
What about stored procedures?
So far this post has been about views. Probably more common than views are stored procedures. They at least doesn’t use stale column definitions.
CREATE PROCEDURE [dbo].[MyProcedure] AS SELECT * FROM Colours |
When the Colours
table is updated, the stored procedure output is adjusted to include the new column and the new name of the renamed column.
Recommendations
My recommendation is to schema bind all views as the default. If it becomes to hard to change the underlying tables when there are multiple schema bound views a better tool is needed. Dropping the schema binding is not a good solution. I prefer views over stored procedures for simple queries as they can be reused in other queries. There used to be a performance benefit for stored procedures but in modern SQL server versions all queries are compiled and the query plan cached.
First of whole.. I find very bad practice to create view with select * from table.
That’s the whole point of view – that you are representing strictly defined columns and not everything.
You’re definitely right that
select *
in a view is very bad. Even if you avoid it, using schemabinding is still a good idea because it will save you from breaking changes to the referenced columns in the underlying table.So, if you do not use asterisk (*) then you do not need
SCHEMABINDING
.(and if you use
SCHEMABINDING
, you cannot use asterisk (*), anywaySo, what’s the point here in
SCHEMABINDING
?There is still a point:
SCHEMABINDING
locks the columns and tables you are using from the view from incompatible schema changes. For example, if you decide to drop a column that you think is no longer in use on a tableSCHEMABINDING
will prevent that.Not sure what the problem is with the output after the alter the badview?
Why would you not expect the output to be all columns of the underlying table?
I am not defending the user of select * in a view and think schemabinding would be useful in this situation, just curious about your reasoning
The output after altering the badview is correct. But doing a seemingly innocent alter with exactly the same definition as it was previously will change the behaviour. That is the dangerous part, that a seemingly innocent operation will indeed to changes. If the incorrect view has been in place for a while there might be application code that compensates for the incorrect column names and that code will now be broken.
Good post – the big advantage SCHEMABINDING gives you is that Views don’t get broken…
Consider a scenario where the BadView was actually declared and created in the database as
SELECT Id, Colour FROM Colours
If you then altered your Colours table (changing the name of the Colour column to ColourName) the database would be quite happy, however, the first user to execute a select against that view would be hit with an error – column does not exist. SCHEMABINDING gets around this, as Anders points out.
This problem is actually far worse than described.
I’ve just discovered the following:
Table A
field: Ax, Ay
Table B
Field: Ax, By
a view that joins By to * for table A
Works initially.
But if you add a field to A, the view gets modified.
Lets say you have Ax, Ay and add Az
The view would go from:
becomes:
(!!!) That is not a typo, A.Az as By!!!