I’ve never tried hypothesis. I recall watching a talk several years ago and thought how great an idea it is…but like with too many great ideas I witness, my lack of follow-up is disturbing.
The goal of a library like hypothesis is to do ‘property-based testing’: random data which meets certain pre-defined specifications is provided as input, and the output can then be tested by ensuring that certain properties are retained.
For example, suppose you create a function add(*numbers)
. We could write a test with a few asserts like assert 1 + 3 + 10 == add(1, 3, 10), but we will never get a complete list of possibilities. We might instead randomize the possible inputs, creating randomly-sized lists of various numbers and comparing the results:
import random

rand_list = [random.randint(-200, 200) for i in range(random.choice(range(10000)))]
assert sum(rand_list) == add(*rand_list)
We would probably want to add floats as well, and remember cases which failed (so we re-run the test with those edge cases), but this is what hypothesis was designed to do.
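If add were a real function, the hypothesis version of that idea would look roughly like this (a sketch only; I'm limiting the inputs to integers and using sum as the oracle):

from hypothesis import given
import hypothesis.strategies as st

# add() is the hypothetical function from above; sum() acts as the reference.
@given(st.lists(st.integers()))
def test_add_matches_sum(numbers):
    assert add(*numbers) == sum(numbers)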
The excuse I’ve always given myself is that hypothesis really only works (or works ‘best’) when ’round-tripping’. For example, assert x == unzip(zip(x))
. Here, we can randomly generate any x
and can confirm that the process was successful by checking against the original input. This probably isn’t true, but it certainly feels like a good excuse.
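There's no built-in unzip, of course, but any encode/decode pair gives the same shape of test; a quick illustration using json, where the property is that dumping and re-loading returns the original data:

import json

from hypothesis import given
import hypothesis.strategies as st

# Round-trip property: serializing and then deserializing should be lossless.
@given(st.dictionaries(st.text(), st.integers()))
def test_json_roundtrip(data):
    assert json.loads(json.dumps(data)) == data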
Well, in my world of mostly text, I ran across a case in which I can’t help but try hypothesis — the use case of round-tripping is too clear.
The package I’m working on — well, the ‘package’ is just a set of classes which I’ve found reason to require repeatedly across different projects — has a class which organizes a set of terms/words into a trie which can then be used to (in theory) compile more efficient patterns. I’ll eventually test performance too, but it at least provides a suitably efficient way to build regular expression patterns from a modifiable list.
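I won’t paste the actual class here, but to make the tests below easier to follow, here is a heavily simplified sketch of the idea (not the real implementation, and it already includes the fixes I describe later in this post):

import re

class PatternTrie:
    """Sketch only -- the real class does more. Words are stored in a
    nested-dict trie and .pattern emits an alternation-style regex."""

    def __init__(self, *words):
        self._root = {}
        for word in words:
            self.add(word)

    def add(self, word):
        if not word:
            return  # empty strings are not added to the data
        node = self._root
        for char in word:
            node = node.setdefault(char, {})
        node[None] = {}  # None marks the end of a complete word

    @property
    def pattern(self):
        if not self._root:
            raise ValueError('no words to build a pattern from')
        return self._to_pattern(self._root)

    def _to_pattern(self, node):
        # Each branch becomes one alternative in a non-capturing group.
        alternatives = [
            '' if char is None else re.escape(char) + self._to_pattern(child)
            for char, child in node.items()
        ]
        if len(alternatives) == 1:
            return alternatives[0]
        return '(?:' + '|'.join(alternatives) + ')'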
Here’s my initial test (using pytest):
def test_pattern_matches_original():
    data = ['there', 'hi', 'python', 'pythons', 'hiya']
    trie = PatternTrie(*data)
    pat = re.compile(trie.pattern)
    for word in data:
        assert pat.match(word)
hypothesis relies on a given decorator to define what’s supposed to be used as input, and in my case, I want a list of strings.
from hypothesis import given
import hypothesis.strategies as st

@given(st.lists(st.text()))
def test_pattern_matches_original_hypothesis(words):
    pattern = PatternTrie(*words).pattern
    pat = re.compile(pattern)
    for word in words:
        assert pat.match(word)
My first run fails, but it fails because I failed to handle the case of an empty list. Passing no patterns to the trie should result in an error, and so hypothesis has already caught an edge case I overlooked. I have made it throw a ValueError when no data has been provided, and added a test for that case.
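That test is not very interesting, but for completeness it looks something like this (the exact name doesn’t matter):

import pytest

def test_assert_no_patterns():
    # No data at all: asking for a pattern should fail loudly.
    trie = PatternTrie()
    with pytest.raises(ValueError):
        trie.pattern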
Having handled the len(list) == 0 case in a separate test, I have updated the decorator to @given(st.lists(st.text(), min_size=1)) so that the generated lists have a minimum length of 1. Once again, however, hypothesis notices a case I have failed to handle: a list whose single element is an empty string '', which is functionally equivalent to an empty list. Empty strings should not be added to the data, so I’ve fixed the code and explicitly added a separate test to confirm that case:
def test_assert_nonword_pattern():
    trie = PatternTrie('')
    with pytest.raises(ValueError):
        trie.pattern
Adding this test, I can now update the decorator once again, this time to @given(st.lists(st.text(min_size=1), min_size=1)) so that the generated strings are never empty. Running the tests, everything passes.
I’m curious what tests it ran. How can I find that out? Searching for python hypothesis
gives a lot of data-sciencey results, which is not what I want. I did find the option --hypothesis-show-statistics (passed to pytest along with the usual test arguments), which gives me the following output:
test/test_patterntrie.py::test_pattern_matches_original_hypothesis:

  - 100 passing examples, 0 failing examples, 5 invalid examples
  - Typical runtimes: 0-16 ms
  - Fraction of time spent in data generation: ~ 97%
  - Stopped because settings.max_examples=100
In my tests directory, there is a directory called .hypothesis which apparently contains the example database storing previous failures so that they can be re-run in the future: https://hypothesis.readthedocs.io/en/latest/database.html.
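If that default location ever becomes a problem, the database can apparently be pointed somewhere else (or disabled) through settings; a minimal sketch, stacking a settings decorator on the same test:

import re

from hypothesis import given, settings
from hypothesis.database import DirectoryBasedExampleDatabase
import hypothesis.strategies as st

# Keep previously-failing examples in a custom directory instead of .hypothesis;
# database=None would disable the failure database entirely.
@settings(database=DirectoryBasedExampleDatabase('.hypothesis-failures'))
@given(st.lists(st.text(min_size=1), min_size=1))
def test_pattern_matches_original_hypothesis(words):
    pattern = PatternTrie(*words).pattern
    pat = re.compile(pattern)
    for word in words:
        assert pat.match(word)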
I’ll have to see if there are other ways in which I can integrate hypothesis into my projects.