Introduction
This page discusses the design concerns and goals considered when evaluating test infrastructures, and compares the infrastructures examined while searching for the "best-fit" infrastructure for FreeBSD.
The Goals of Testing and Test Infrastructures
The goal of testing is, simply put, to demonstrate that code works according to a set of requirements and within a certain set of bounds.
Investing in testing is worthwhile because a) it helps ensure that there aren't any bugs (or, if there are, that they are found before your customers/users find them), and b) properly written testcases can reduce or possibly eliminate the need for manual testing, thus freeing up your developers' time to do more developing (and maybe even more testing!).
The golden rule for any script or infrastructure is simplification of a given, repeatable process.
The Goals of Testing on FreeBSD
The requirements for testing in FreeBSD are a bit different: FreeBSD's unique requirements with respect to freely distributable code, and a source base that builds and installs on multiple platforms from various host targets, demand more than some testing infrastructures can permit for technical or business reasons.
Furthermore, because the FreeBSD project has fewer resources than some other open source or commercial projects (Linux, Microsoft), documentation and stability are a must for any consumers of the chosen infrastructure.
Testing in FreeBSD
As it stands, testing in FreeBSD today is a manual process. Some of the tests can be invoked via prove (the perl Test::Harness wrapper), and others can be invoked with make, sh, etc. Most of the tests live under tools/regression, but there are also some testcases provided by third-party packages that live elsewhere in the tree and have their own set of unique requirements (the tools needed, how they're executed, etc.).
The depth and breadth of the test suites vary depending on the component, but there is a hodgepodge of userland and kernel testcases living in the FreeBSD source tree.
Problem Statement
Test content and structure in FreeBSD currently presents the following problems:
- The tests are very tool-specific in places (again, the vast majority are run via perl with Test::Harness, others can only be run with a full source tree, etc.).
- The output and behavior are mostly ad hoc, with specific exceptions (Test::Harness, for instance).
- Test suites aren't linked into the build in a sane way -- thus FreeBSD (or custom derivatives of FreeBSD from downstream vendors like Cisco, EMC, Juniper, NetApp, etc.) may regress functionality in their custom OSes, which results in wasted engineering hours diagnosing issues, and those regressions might ultimately leak back into FreeBSD proper when changes are pushed back.
- The depth and breadth of testcases are not exhaustive; there are a number of areas where FreeBSD could do better by having more in-depth testcases to avoid functional or performance regressions whenever code is committed to the project.
Test Infrastructure Comparison
When looking at testing infrastructures employed in FreeBSD (and in projects that consume FreeBSD), there are many flavors to choose from. Each infrastructure has its pluses and minuses, and I'll highlight the major points of the main players I have dealt with when testing FreeBSD.
ATF
Pros:
- Is fairly well thought out and designed.
- Is being actively developed.
- Has native hooks for C, C++, and sh (a minimal C sketch is shown at the end of this section).
- Has helpful documentation.
- Supports test isolation via forking, temporary test directory creation, etc.
- Supports limited fixtures (teardown) on a per testcase basis.
- Is BSD licensed.
- Author is a NetBSD committer (Julio Merino).
- There are some worthwhile pieces (build, etc.) that could be grabbed from NetBSD and easily adapted to FreeBSD.
Neutral:
- Some of the components are being replaced with a different project named Kyua (developed by the same author) for the purpose of modularity and ease of use.
Cons:
- It's new compared to other solutions.
- It has a handful of other minor caveats (I've listed them in the Known Issues section).
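To give a feel for the native C hooks, here is a minimal sketch of an ATF testcase written against the atf-c(3) API; the testcase name and the value being checked are invented purely for illustration:

    #include <atf-c.h>

    /* Declare a testcase named "addition" (hypothetical example). */
    ATF_TC(addition);

    /* Fill in testcase metadata, such as a human-readable description. */
    ATF_TC_HEAD(addition, tc)
    {
        atf_tc_set_md_var(tc, "descr", "Checks that addition behaves sanely");
    }

    /* The body runs isolated in its own subprocess and temporary directory. */
    ATF_TC_BODY(addition, tc)
    {
        ATF_CHECK_EQ(2 + 2, 4);
    }

    /* Register the testcase with the test program. */
    ATF_TP_ADD_TCS(tp)
    {
        ATF_TP_ADD_TC(tp, addition);
        return atf_no_error();
    }

The resulting test program is then driven by the ATF/Kyua runner tools, which take care of the isolation and result reporting mentioned above.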
CTest
Pros:
Neutral:
Cons:
Libcheck
Pros:
- Has native C hooks; can be easily adapted to C++ (a minimal C sketch is shown at the end of this section).
- Has a number of strong features.
- Is well designed.
- Is very tried and tested.
- Supports test isolation via forking, temporary test directory creation, etc.
- Supports test fixtures (setup/teardown) on a global test suite and per testcase basis.
- Has a number of other nifty features.
Cons:
- Only supports C/C++; embedding testcases from other languages (Lua, Perl, Python, Ruby, etc.) requires writing shims to glue Libcheck and the desired language together.
- As many of the reviewers note on the SourceForge page, the documentation for some of the more advanced features is lacking.
- Is LGPL licensed (not a showstopper, but definitely a detractor).
- Is not actively developed: the last release was 2 years ago, and the last SourceForge project commit was ~2 months ago as of this writing.
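For comparison, here is a minimal sketch of a Libcheck testcase using the check.h C API, including a checked (per-testcase) fixture; the suite, testcase, and fixture names are invented purely for illustration:

    #include <check.h>
    #include <stdlib.h>

    /* Hypothetical checked fixture: runs in the forked child around each test. */
    static void setup(void) { }
    static void teardown(void) { }

    START_TEST(test_addition)
    {
        ck_assert_int_eq(2 + 2, 4);
    }
    END_TEST

    int main(void)
    {
        Suite *s = suite_create("math");
        TCase *tc = tcase_create("core");

        tcase_add_checked_fixture(tc, setup, teardown);
        tcase_add_test(tc, test_addition);
        suite_add_tcase(s, tc);

        SRunner *sr = srunner_create(s);
        srunner_run_all(sr, CK_NORMAL);
        int failed = srunner_ntests_failed(sr);
        srunner_free(sr);
        return (failed == 0) ? EXIT_SUCCESS : EXIT_FAILURE;
    }

Note that the runner boilerplate in main() is something each test program has to provide for itself.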
Test::Harness
Pros:
- Script drivable language.
- Simple: if you know how to script and you follow the format, you can easily write Test::Harness testcases (see the sketch at the end of this section).
- Test suites have some isolation (forking).
Cons:
- Requires perl in order to run -- which is notoriously difficult to maintain and thus was ejected from the FreeBSD base system back in 5.0.
- Documentation is split into several modules and isn't straightforward for people who aren't used to CPAN's organization (example: I had to go through several doc pages before I found a description of how testcases are formulated).
- Testcases must mimic Test::Harness's expected format.
- Test suites are not isolated and aren't cleaned up if terminated.
- No test suite or testcase fixture support.
- Test::Harness is being replaced with TAP::Harness (which does not appear to be fully backwards compatible).
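To illustrate the expected format: Test::Harness consumes the TAP text protocol, i.e. a plan line ("1..N") followed by numbered "ok"/"not ok" lines, so a testcase only has to print that output. Here is a minimal sketch of a TAP-emitting testcase, written in C to match the other examples; the test descriptions are invented for illustration:

    #include <stdio.h>

    int main(void)
    {
        /* The plan: how many tests will be reported. */
        printf("1..2\n");

        /* Each test reports "ok" or "not ok", its number, and a description. */
        printf("%s 1 - addition works\n", (2 + 2 == 4) ? "ok" : "not ok");
        printf("%s 2 - subtraction works\n", (4 - 2 == 2) ? "ok" : "not ok");

        return 0;
    }

The harness parses this output and tallies the pass/fail results.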
Some other test infrastructures [for C] are noted here: http://check.sourceforge.net/doc/check_html/check_2.html#SEC2 . Some non-C test infrastructures that the author has dealt with -- completely off the table as far as technology requirements or licensing are concerned, but worth mentioning for perspective -- are as follows:
- Java
  - JUnit - http://www.junit.org
- Lua
  - Various infrastructures are noted here: http://lua-users.org/wiki/UnitTesting
- Python
  - pyunit - http://pyunit.sourceforge.net/
  - unittest - http://docs.python.org/library/unittest.html
Conclusion
After some consideration, ATF was chosen as the de facto testing infrastructure for FreeBSD because:
- It is a solid base.
- It can be extended upon.
- Pieces can be cross-pollinated between the FreeBSD and NetBSD projects, thus creating a more solid, better-tested BSD family.