How to Start Testing a Large Legacy Application
This session morphed into a discussion at the opposite end of the spectrum: achieving 100% test coverage, what that means exactly, and whether attempting it is even wise.
What does 100% test coverage actually mean?
- Type of coverage - line, branch or path.
- Tests being run - unit, integration, functional or some combination of these.
- Code being covered - all production code, perhaps with some acceptable exclusions.
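To make the difference between the coverage types concrete, here is a minimal sketch. The Discount class and its numbers are invented for illustration and do not come from the project discussed. With two independent conditions, one test can execute every line, two tests can exercise every branch outcome, and only four tests exercise every path.

```java
// Hypothetical example class, invented purely to illustrate coverage types.
final class Discount {
    static int apply(int price, boolean isMember, boolean hasCoupon) {
        int result = price;
        if (isMember) result -= 10;   // branch 1: taken / not taken
        if (hasCoupon) result -= 5;   // branch 2: taken / not taken
        return result;
    }
}

class CoverageDemo {
    public static void main(String[] args) {
        // Line coverage: this single call executes every line of apply().
        System.out.println(Discount.apply(100, true, true));   // prints 85

        // Branch coverage: adding this call exercises the "not taken"
        // outcome of both conditions, so every branch outcome has been seen.
        System.out.println(Discount.apply(100, false, false)); // prints 100

        // Path coverage: the two mixed combinations are separate paths
        // through the method and need their own tests.
        System.out.println(Discount.apply(100, true, false));  // prints 90
        System.out.println(Discount.apply(100, false, true));  // prints 95
    }
}
```

The gap between the three grows quickly: each additional independent condition adds one line and two branch outcomes, but doubles the number of paths.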
We discussed one greenfield Java project, which achieved 100% branch coverage of all production code except for a thin wrapping layer around external third-party libraries. Coverage was measured using only fairly tight unit tests; integration and functional tests did not count towards it. The aim was to enforce test-driving of production code, since it would be nearly impossible to achieve this result any other way. It succeeded in this respect, but at the cost of a reasonably large amount of brittle test code to maintain, because the tests were tightly coupled to implementation details within the production classes.
Achieving this level of coverage was made more difficult when the code needed to call through to external library code which was not designed for testability. That describes the vast majority of third-party libraries, and the JDK in particular! Language constructs which make achieving full test coverage more difficult include:
- Referencing concrete classes instead of interfaces.
- Direct use of constructors, rather than factories.
- Final classes.
- Static methods.
More generally, any construct which makes it harder to replace real dependencies with test versions makes testing tricky. The project discussed wrapped every third-party API that relied on such constructs in a thin proxy layer, which automatically translated them as follows:
- Concrete/final classes -> interfaces.
- Constructors -> choice of factory classes or methods.
- Static methods -> instance methods.
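As a rough sketch of what one hand-coded wrapper of this kind might look like, the example below turns a static JDK call into an instance method behind an interface. The names TextFiles, JdkTextFiles and ConfigLoader are hypothetical and are not taken from the project; only the static-method case is shown.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical boundary interface: exposes only what production code needs.
interface TextFiles {
    String read(Path path) throws IOException;
}

// Thin wrapper: static method -> instance method. It contains no logic of
// its own, which is what makes excluding it from coverage tolerable.
final class JdkTextFiles implements TextFiles {
    @Override
    public String read(Path path) throws IOException {
        return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    }
}

// Production code depends on the interface, never on Files directly.
final class ConfigLoader {
    private final TextFiles files;

    ConfigLoader(TextFiles files) {
        this.files = files;
    }

    String loadName(Path path) throws IOException {
        return files.read(path).trim();
    }
}

class BoundaryDemo {
    public static void main(String[] args) throws IOException {
        // In a test, the real file system is replaced by a stub.
        TextFiles stub = path -> "  demo \n";
        System.out.println(new ConfigLoader(stub).loadName(Paths.get("ignored"))); // prints demo
    }
}
```

In production the same ConfigLoader would be constructed with a JdkTextFiles instance, so the only code that touches the static API is the one-line wrapper.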
Initially this layer was hand-coded and was itself excluded from the code being measured for test coverage. However, inconsistencies and untested logic began to creep into this layer. To solve this, the wrappers were instead generated at runtime using dynamic proxies implementing hand-coded interfaces for the desired APIs. This later evolved into the AutoBoundary module of the Proxymatic open-source project.
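The general idea of generating such wrappers at runtime can be sketched with the JDK's own java.lang.reflect.Proxy. This is not the AutoBoundary API, just an illustration of the technique under simple assumptions: MathApi and StaticBoundary are hypothetical names, and the handler forwards each interface method to a public static method with the same name and parameter types on the target class.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hand-coded interface describing the slice of a static API we want to use.
interface MathApi {
    int abs(int value);
    int max(int a, int b);
}

// Hypothetical helper, not part of any real library.
final class StaticBoundary {
    // Returns a proxy whose instance methods forward to same-named static
    // methods on 'target'. Sketch only: methods inherited from Object
    // (toString, equals, hashCode) are not handled specially.
    static <T> T wrap(Class<T> api, Class<?> target) {
        InvocationHandler handler = (proxy, method, args) -> {
            Method staticMethod =
                    target.getMethod(method.getName(), method.getParameterTypes());
            try {
                // The receiver is null because the target method is static.
                return staticMethod.invoke(null, args);
            } catch (InvocationTargetException e) {
                // Rethrow whatever the wrapped method itself threw.
                throw e.getCause();
            }
        };
        Object instance = Proxy.newProxyInstance(
                api.getClassLoader(), new Class<?>[] {api}, handler);
        return api.cast(instance);
    }
}

class ProxyDemo {
    public static void main(String[] args) {
        MathApi math = StaticBoundary.wrap(MathApi.class, Math.class);
        System.out.println(math.abs(-3));    // prints 3
        System.out.println(math.max(2, 7));  // prints 7
    }
}
```

Because the forwarding logic lives in one generic handler, there is no per-API wrapper code left in which inconsistencies or untested logic can accumulate; only the interfaces are written by hand.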
With such a layer, it is possible to replace or mock all third-party code for the purposes of testing. This makes it easy to reproduce every behaviour, including hard-to-test exception conditions, and so makes 100% coverage of the remaining production code possible.
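For instance, an error-handling branch that would need a real outage to trigger can be reached by handing the production code a stub that throws. The sketch below uses invented names (RemoteStore, GreetingService) and a plain lambda as the stub; a mocking library could be used in the same way.

```java
import java.io.IOException;

// Hypothetical boundary interface standing in for a wrapped third-party client.
interface RemoteStore {
    String fetch(String key) throws IOException;
}

final class GreetingService {
    private final RemoteStore store;

    GreetingService(RemoteStore store) {
        this.store = store;
    }

    String greeting(String userId) {
        try {
            return "Hello, " + store.fetch(userId);
        } catch (IOException e) {
            // Fallback branch: hard to reach against a real remote system.
            return "Hello, guest";
        }
    }
}

class FailureBranchDemo {
    public static void main(String[] args) {
        RemoteStore working = key -> "Ada";
        RemoteStore failing = key -> {
            throw new IOException("simulated outage");
        };

        System.out.println(new GreetingService(working).greeting("u1")); // prints Hello, Ada
        System.out.println(new GreetingService(failing).greeting("u1")); // prints Hello, guest
    }
}
```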
With all this in mind, the discussion returned to the question of testing legacy applications. Surprisingly, it turned out that the 100% coverage approach and tools could be applied in legacy situations, if you think of the existing code as third-party code. We figured you could start by test-driving all new code and wrapping all existing code in the proxy layer, and then gradually move old code over to the new approach piece by piece. Untangling concrete, static and final dependencies would be aided by interposing the wrapper layer at the appropriate points.