I’m a fan of code coverage as a way to ensure that there are covering tests. One area that I tend to rely heavily on Code Coverage for is to catch any tests that are no longer working correctly due to changes in the production code. That often works out well, but today I got betrayed by the code coverage engine.
The code that I worked on contained an if statement with a multi-step
bool IsAllWrong(int importantValue, bool b)
bool a = importantValue == GetAnswer();
bool c = false;
bool d = false;
if (!a && !b && !c && !d)
Of course I had tests that made the evaluation fail both because of
b. So what happend later was that
GetAnswer() was updated, without the test for when
importantValue being updated. Of course (my bad) that test had set
true, causing the evaluation to fail on
true to be returned. So the test passed, but not due to the thing I wanted to test. In a complex application, this is bound to happen every now and then. But usually, the code coverage scores will reveal that there is an execution path not covered. But not this time! The trustworthy code coverage analysis betrayed me!
Is it relevant to have a code coverage target? In a talk at NDC Oslo 2014 Uncle Bob said that the only reasonable goal is 100%. On the other hand Mark Seemann recently said on twitter and in a follow up blog post that “I thought it was common knowledge that nothing good comes from code coverage targets”. Those are two seemingly opposing views.
Before looking at the role of code coverage, I’d like to take a few steps back and look at the objectives of testing.
When working properly with test driven development, no production code is written unless there is a failing test driving the change. As soon as the test pass, no addition of production code is allowed until there is again a failing test. If that principle is followed, there will simply be no production code that is not covered by a test.
My objective is to achieve high code quality.
My way of achieving that is TDD.
An effect of that is that I do get full code coverage.
This is a guest post by Albin Sunnanbo sharing experiences on regression testing.
On several occasions I have worked with systems that processed lots of work items with a fairly complicated algorithm. When doing a larger rewrite of such an algorithm you want to regression test your algorithm. Of course you have a bunch of unit tests and/or integration tests that maps to your requirements, but unit tests tends to test systems from one angle and you need to complement with other test methods to test it from other angles too. We have used copy of production data to run a comprehensive regression test with just a few hours of work.
Our systems had the following workflow
- Users or imports produces some kind of work item in the system, i.e. orders.
- There is a completion phase of the work where the user commits each work item and make the result final, i.e. sends the order.
- Once each item is final the system processes the work item and produces an output that is saved in a database before it is exported to another system.
We have successfully used the following approach to regression testing for those kind of algorithms.
To reach 100% testing coverage is a dream for many teams. The metric used is code coverage for tests, but is that enough? Unfortunately not. Code line coverage is not the same as functional coverage. And it is full functional coverage that really matters in the end.
Look at a simple method that formats a string.
public static string Format(int? value)
const int defaultValue = 42;
value = defaultValue;
return "The value is " + defaultValue + ".";
There is a correct test for this method, that pass and gives 100% code coverage. Still there is a severe bug in there. Can you spot it? (If you don’t, just keep reading on, it will be obvious when another test is added later.)
I was recently made aware that some unit tests for Kentor.AuthServices were failing on non-English computers. To handle that, I set up an Azure VM with Swedish installed and made a special unit test that would run all other tests with different UI cultures.
When I first understood that I had tests that were broken when run on non-English computers I of course felt that it should be fixed. The tests should not only run with other languages to enable developers from other countries. The tests should of course be possible to be run and used to find problems if someone reports errors on computers having a special language installed. There’s quite a few places in the code with string formatting and it can differ with different cultures, causing hard to find problems.
What I did was to write a special unit test that finds all other unit tests in the code and runs them with different UI culture. The unit tests are found using LINQ and Reflection (it’s an awesome combination that’s extremely powerful) and then they are run with reflection.