Testing applications is very important, but it must be exercised creatively; perhaps we can lean on the well-worn expression that testing is more of an art than a science. Even packages like hypothesis still require some creative initialisation. What exactly should I test? How do I test that, and only that? Perhaps there are tests that should eventually pass but don't yet (and you won't be able to get around to them for a while), or tests in which the actual expected outcome is unclear. These latter two cases come up in my work quite a bit.
I rely heavily on pytest to guarantee consistency and guard against breaking changes in various rule-based NLP applications. The development process often follows these steps:
- Collect sample text of target outcome from training data.
- Create tests with the sample texts.
- Develop algorithm until all tests pass.
- While reviewing additional sample texts, continue adding new cases to the test suite, including cases that the current algorithm does not yet handle.
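
The tests below call a function named `algorithm`, which is never shown here; it stands for whatever rule-based extraction function the project exposes. As a purely hypothetical stand-in (not the real implementation), you can picture something as trivial as:

```python
import re


def algorithm(text: str) -> int:
    """Hypothetical rule-based extractor: 1 if the target pattern is found, else 0.

    A stand-in only; the real function would encode far more elaborate NLP rules.
    """
    return 1 if re.match(r'Example\b', text) else 0
```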
A typical test might look something like this:
```python
import pytest


@pytest.mark.parametrize(('text', 'exp'), [
    ('Example text.', 1),
    ('Negative example.', 0),
    # ('Ambiguous example.', 1),  # unclear what this should be
    # ('A really difficult case that I cannot handle.', 0),
])
def test_something(text, exp):
    assert algorithm(text) == exp
```
In the above example, I have commented out two lines for two different reasons: one for its ambiguity (i.e., the project has not yet defined how the algorithm should extract this case) and the other for its complexity. I don't want to get rid of these cases, nor lose them on some page of documentation that lives elsewhere. At the same time, I don't want to constantly see them as errors and spend time determining which failures are the ones I actually intend to handle later. Commenting them out, however, is also a very poor solution.
I have not spent much time absorbing pytest's documentation (I finally understand what fixtures are, but still can't figure out many good applications within the NLP space), but I skimmed Brian Okken's new pytest book and ran into the function (among others) pytest.param. This function allows me to individually annotate any of the parameter groups being passed to the test_something function.
For example, I can mark an individual example as xfail (expected to fail) or skip (to be skipped), and the case will still be reported in the pytest output as 'expected fail' or 'skipped'. Further, both marks accept a 'reason' argument that explains why the example was skipped or is expected to fail.
Modifying the example above:
```python
import pytest


@pytest.mark.parametrize(('text', 'exp'), [
    ('Example text.', 1),
    ('Negative example.', 0),
    pytest.param('Ambiguous example.', 1,
                 marks=pytest.mark.skip(reason='unclear what this should be')),
    pytest.param('A really difficult case that I cannot handle.', 0,
                 marks=pytest.mark.xfail(reason='difficult: see issue 162')),
])
def test_something(text, exp):
    assert algorithm(text) == exp
```
These two cases will still be reported — useful reminders of things that still need to be resolved — but will not cause my tests to fail.
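
One refinement worth noting (a pytest feature, not something shown in the examples above): xfail also accepts a strict argument. With strict=True, a marked case that unexpectedly starts passing is reported as a failure, which nudges me to promote it back to a regular test once the algorithm finally handles it. A sketch, using an illustrative test name and the same hypothetical algorithm as before:

```python
import pytest


@pytest.mark.parametrize(('text', 'exp'), [
    pytest.param(
        'A really difficult case that I cannot handle.', 0,
        # strict=True turns an unexpected pass (XPASS) into a failure,
        # so a resolved case cannot silently stay marked as xfail.
        marks=pytest.mark.xfail(reason='difficult: see issue 162', strict=True),
    ),
])
def test_difficult_cases(text, exp):
    assert algorithm(text) == exp
```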