Wednesday, February 13, 2008

What to Expect from a Unit Test

What is and what is not a unit test is a hotly debated subject. At one end of the spectrum you have people who argue that a unit test replaces all depended-on objects with mock or fake objects so that the system under test (SUT) is tested in complete isolation. At the other end of the spectrum you have people who contend that anything written with an xUnit framework like JUnit is a unit test. And then we have everything else that falls in between the two ends of that spectrum. Rather than trying to arrive at a universally accepted definition of a unit test, I think that it may be more productive to talk about what we expect from a unit test. If we can agree upon a set of goals that we aim to achieve through the practice of unit testing, then we do not need to concern ourselves with whether or not the test that we are writing is a true unit test. Instead we can focus on using automated testing to facilitate the development of our software.

Rapid Feedback
A unit test should provide immediate feedback. Unit tests need to execute quickly since we (hopefully) run them over and over during development. While we work on a particular piece of code, we may choose to run a subset of the tests. We might do this through the IDE test runner. And then prior to committing code, we run the full test suite. The tests for the code we are currently working on need to be fast since we are running those tests frequently as we are writing the code. Since we want commits to be small and frequent, the full test suite needs to be fast as well if we expect it to be run prior to commits. Not only should the test suite run fast for a developer build, but it should also run fast during integration builds so that we receive timely feedback during integration.

Defect Localization
When a test fails, we should know exactly what part of the SUT caused the test to fail. There are a couple of things that can be done to help promote effective defect localization. Only one condition should be verified per test. To that end there is a school of thought in which a test should only contain a single assert statement; however, the most important thing is that the test is not overly aggressive in trying to check multiple conditions. There are times when testing a condition may require multiple assert statements. Whether you choose to break out each assert into a separate test or you choose to keep all of the asserts for that condition in the same test is really a matter of preference.
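As a rough illustration of verifying one condition per test, here is a minimal sketch. ShoppingCart and its methods are hypothetical names invented for this example, and the tests are written as plain static methods with a small check() helper so that the code runs without any test framework; with JUnit each method would simply be a test method using the framework's asserts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class under test, invented for this illustration.
class ShoppingCart {
    private final List<Integer> pricesInCents = new ArrayList<>();

    void add(int priceInCents) {
        if (priceInCents < 0) {
            throw new IllegalArgumentException("price must not be negative");
        }
        pricesInCents.add(priceInCents);
    }

    int itemCount() {
        return pricesInCents.size();
    }

    int totalInCents() {
        int total = 0;
        for (int price : pricesInCents) {
            total += price;
        }
        return total;
    }
}

class ShoppingCartTest {
    // Condition 1: adding an item changes the contents of the cart.
    // There are two checks, but both describe the same condition.
    static void testAddShouldIncreaseItemCountAndTotal() {
        ShoppingCart cart = new ShoppingCart();
        cart.add(250);
        check(cart.itemCount() == 1, "item count after add");
        check(cart.totalInCents() == 250, "total after add");
    }

    // Condition 2: invalid input is rejected. It lives in its own test so
    // that a failure points at validation rather than at bookkeeping.
    static void testAddShouldRejectNegativePrice() {
        ShoppingCart cart = new ShoppingCart();
        try {
            cart.add(-1);
        } catch (IllegalArgumentException expected) {
            return; // the condition holds
        }
        throw new AssertionError("expected IllegalArgumentException");
    }

    // Tiny stand-in for a framework assert method.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        testAddShouldIncreaseItemCountAndTotal();
        testAddShouldRejectNegativePrice();
    }
}
```

If the first test fails we know the bookkeeping in add() is broken; if the second fails we know the validation is broken. Splitting the two checks of the first test into separate tests would be just as valid.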

The second thing that comes into play for adequate defect localization is how well you isolate the SUT. Even if we only verify a single condition or even if we only use one assert per test, there may be times when it is not immediately obvious what part of the SUT caused the test to fail. This is often a direct result of not sufficiently isolating the SUT. There are plenty of articles, papers, and books that discuss strategies and techniques for isolating the SUT. Some tools like mock object libraries allow you to completely isolate an object by mocking all of its neighboring objects. Here is a good rule of thumb to start with for determining an appropriate level of isolation: the cause of the test failure can be determined without having to rely on a debugger or additional logging statements.
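One way to picture isolation is with a hand-written stub in place of a real collaborator. The sketch below assumes a hypothetical InvoiceCalculator that depends on a hypothetical TaxRateSource interface; neither comes from a real library, and a mock object library could play the same role as the lambda used here.

```java
// Hypothetical collaborator that the SUT depends on. A real implementation
// might hit a database or a remote service.
interface TaxRateSource {
    // Tax rate in basis points (1000 = 10%).
    int basisPointsFor(String region);
}

// Hypothetical SUT, invented for this illustration.
class InvoiceCalculator {
    private final TaxRateSource taxRates;

    InvoiceCalculator(TaxRateSource taxRates) {
        this.taxRates = taxRates;
    }

    long totalInCents(long netInCents, String region) {
        int basisPoints = taxRates.basisPointsFor(region);
        return netInCents * (10_000 + basisPoints) / 10_000;
    }
}

class InvoiceCalculatorTest {
    static void testTotalShouldIncludeRegionalTax() {
        // Stub: always answers 10%, so the only code that can make this
        // test fail is InvoiceCalculator itself.
        TaxRateSource fixedTenPercent = region -> 1000;
        InvoiceCalculator calculator = new InvoiceCalculator(fixedTenPercent);

        long total = calculator.totalInCents(1000, "any-region");

        if (total != 1100) {
            throw new AssertionError("expected 1100 but was " + total);
        }
    }

    public static void main(String[] args) {
        testTotalShouldIncludeRegionalTax();
    }
}
```

Because the tax rate is fixed by the test, a failure message of "expected 1100 but was ..." points straight at the arithmetic in totalInCents(), with no debugger needed.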

Executable Documentation
Tests can and should serve as documentation. They can provide a living, executable specification. Tests demonstrate how an object is expected to be used, what conditions must be satisfied for invoking a method on the object, and what kind of output to expect from that object. Because they can be such a powerful form of documentation, tests should be written in a clear, self-documenting style. Intuitive variable and method names should be used to make obvious what is being tested. Test method names should be intent-revealing. testUpdateTicket() for example does not reveal intent nearly as well as testUpdateTicketShouldAddComment().

Avoid putting complex set-up or verification code in test methods as it may obscure the intent of the tests. Instead, complex logic should be relegated to test utility methods and objects. This has a few benefits. First and most importantly, it prevents the test from getting littered with complex logic, thereby making it easier for the reader to see what the test is doing. Secondly, putting the complex logic in a test utility library makes it accessible and easy to reuse in other tests. Lastly, we can put our test utility objects in their own test harness to ensure that they have been implemented properly.
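To make the idea of test utility methods concrete, here is a small sketch built around the testUpdateTicketShouldAddComment() name mentioned earlier. The Ticket class, its update() method, and the TicketFixtures helper are all assumptions made up for this example, and plain Java checks stand in for a framework's asserts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class under test, invented for this illustration.
class Ticket {
    private final List<String> comments = new ArrayList<>();

    void update(String comment) {
        comments.add(comment);
    }

    List<String> comments() {
        return comments;
    }
}

// Test utility class: set-up and verification logic lives here so that
// the test methods stay short and readable.
class TicketFixtures {
    // Creation helper: builds a ticket that already has some history.
    static Ticket ticketWithComments(String... existingComments) {
        Ticket ticket = new Ticket();
        for (String comment : existingComments) {
            ticket.update(comment);
        }
        return ticket;
    }

    // Custom assertion: hides the list handling behind an intent-revealing name.
    static void assertLastComment(Ticket ticket, String expected) {
        List<String> comments = ticket.comments();
        if (comments.isEmpty()) {
            throw new AssertionError("ticket has no comments");
        }
        String actual = comments.get(comments.size() - 1);
        if (!expected.equals(actual)) {
            throw new AssertionError(
                "expected last comment <" + expected + "> but was <" + actual + ">");
        }
    }
}

class TicketTest {
    static void testUpdateTicketShouldAddComment() {
        Ticket ticket = TicketFixtures.ticketWithComments("first");

        ticket.update("second");

        TicketFixtures.assertLastComment(ticket, "second");
    }

    public static void main(String[] args) {
        testUpdateTicketShouldAddComment();
    }
}
```

The test method itself reads as three short steps, and because TicketFixtures is ordinary code it can be given tests of its own.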

Regression Safeguard
The primary goal of testing in general is to validate that our software behaves as expected under prescribed conditions. Having a set of automated tests that we can continually run against our software provides a great safety net for catching regressions that are introduced into our code. While unit testing alone is typically not sufficient for validating our code, it provides an excellent first line of defense that should be capable of catching most errors.

Refactoring
Unit tests should enable us to be aggressive with refactoring. Refactoring is the practice of changing the implementation of code while preserving its behavior. Our unit tests should give us the confidence that refactoring will not alter the intended behavior of our code, at least not unexpectedly. If the tests do not instill that confidence, then we need to consider whether or not the tests are reliable, thorough, and effective enough. Code coverage tools can help provide some measure of the effectiveness of a test suite, although a coverage tool alone should not be used to determine the quality and effectiveness of a test suite.

Repeatable and Reliable
What exactly does it mean for a test to be repeatable and reliable? Suppose we run a test and it passes. Then we run it again without making any changes, but this time the test fails. This would be an example of a test that is not repeatable. It could also be a strong indication that the test is using a persistent fixture that is outliving the test. With a persistent fixture, we need to be especially careful about cleaning up before/after the test so that the fixture is in a consistent state for each test run. Now consider a test that starts failing as a result of changes being made to code other than the SUT. This would be an example of an unreliable test. In these situations we need to ensure that we properly isolate the SUT so that external changes do not affect our tests.
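The persistent fixture problem can be sketched with shared static state. UserRegistry below is a hypothetical class invented for this example; its static set stands in for anything that outlives a single test, such as a database table or a file on disk. Without the reset in setUp(), the second run of the test would fail even though nothing changed.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class with state that outlives any single test.
class UserRegistry {
    private static final Set<String> NAMES = new HashSet<>();

    // Returns false if the name was already registered.
    static boolean register(String name) {
        return NAMES.add(name);
    }

    static void clear() {
        NAMES.clear();
    }
}

class UserRegistryTest {
    // Puts the persistent fixture into a known state before each test.
    static void setUp() {
        UserRegistry.clear();
    }

    static void testRegisterShouldAcceptNewName() {
        setUp();
        if (!UserRegistry.register("alice")) {
            throw new AssertionError("a new name should be accepted");
        }
    }

    public static void main(String[] args) {
        // Running the same test twice in a row must give the same result.
        testRegisterShouldAcceptNewName();
        testRegisterShouldAcceptNewName();
    }
}
```

In an xUnit framework the setUp() call would normally be wired in by the framework rather than invoked by hand; it is called explicitly here only to keep the sketch free of dependencies.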

Easy and Fast to Implement
Unit tests should be relatively easy to implement without adding a significant amount of time and overhead to the overall development effort. Code that is particularly difficult to get under test may indicate a larger design issue. We should take the opportunity to look for potential design problems. And maybe the easiest, most effective way to ensure that we design for testability is to write our tests first.

As with any other software, it is imperative to refactor our test code. Using test utility methods and libraries as previously discussed will significantly reduce the amount of code that we have to write for tests as well as make tests more reliable since our test utility code can have its own test harness. Testing a single condition per test method will result in smaller tests as well. This will in turn lead to a faster turnaround time when going back and forth between the main code and test code.

Conclusion
These are sound, reasonable things to expect from unit tests; however, the exact expectations may vary from team to team. For example, some teams may prefer to make extensive use of mock objects and libraries like jMock, while other teams may prefer not to use mock objects at all. The most important things are that the goals are clearly stated and agreed upon within the team and that the tests aid rather than inhibit development efforts.
