Testing Printed Output using `pytest`

While testing typically focuses on evaluating the output of a function (i.e., what it returns) or the state of an object at a particular point in time, there are occasions when it is important to test the printed or logged output. This might include warnings about module, parameter, or function deprecation; messages about problematic input data or suppressed exceptions; or console output designed to navigate the user through a text user interface (TUI). My preferred testing framework has been pytest, so let’s explore how to test the output text.

Capturing `print` Output

First, let’s consider how to get print output from a simple function which prints that a process has started and prints the total time once the function completes:

import time

def main_print():
    print('Process started.')
    start_time = time.time()
    time.sleep(1)
    end_time = time.time()
    total_time = end_time - start_time
    print(f'Done! Process took {total_time}.')

And let’s start our test:

def test_capture_print():
    main_print()
    # how to get printed output?

Now, how do we get the printed output? pytest operates with fixtures (functions with different scopes that provide setup or teardown code), which can be supplied as arguments to a test function. The one we’re going to need for capturing text output is capsys. capsys will capture output printed to stdout (the default output stream for printing text) or stderr (a separate stream for error messages, warnings, etc.), and expose these through a method readouterr(), which returns a CaptureResult with out and err attributes for accessing the stdout and stderr output.
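
As a minimal sketch of the two streams, the hypothetical function `report_progress` below (not part of the original example) writes one line to stdout and one to stderr, and the test reads each back from the matching attribute. The result of `readouterr()` can also be unpacked as an `(out, err)` pair, which is what the test does here.

```python
import sys


def report_progress():
    """Hypothetical helper: one line to stdout, one to stderr."""
    print('Step 1 complete.')                    # goes to stdout
    print('Disk almost full!', file=sys.stderr)  # goes to stderr


def test_report_progress(capsys):
    # `capsys` is injected by pytest; no import is needed to use the fixture
    report_progress()
    out, err = capsys.readouterr()  # unpack the stdout and stderr text
    assert out == 'Step 1 complete.\n'
    assert err == 'Disk almost full!\n'
```

Note the trailing `\n` in both expected strings: `print` appends a newline, and the captured text includes it.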

The CaptureResult for our main_print function (i.e., what capsys.readouterr() returns after calling it) is:

CaptureResult(out='Process started.\nDone! Process took 1.0003981590270996.\n', err='')

Note that all of the lines of output have been joined (i.e., the data from the first and second print statements). Let’s fix our test:

def test_capture_print(capsys):
    main_print()
    assert 'Done!' in capsys.readouterr().out
    # passes: test_capture_print PASSED  [100%]

Here, we test for the presence of 'Done!' within the stdout stream. However, we must be careful: readouterr() clears all captured text from capsys, so after calling capsys.readouterr(), the next call will return empty strings unless something new has been printed in the meantime.

import pytest

@pytest.mark.xfail()  # mark this function as expected to fail
def test_capture_print_fail(capsys):
    main_print()
    assert 'Process started' in capsys.readouterr().out  # flushes captured out/err
    assert 'Done!' in capsys.readouterr().out  # `capsys.readouterr().out` returns ''
    # fails

The correct way to do this is to keep a reference to the CaptureResult (i.e., assign it to a variable):

def test_capture_print(capsys):
    main_print()
    cr = capsys.readouterr()
    assert 'Process started' in cr.out
    assert 'Done!' in cr.out
    assert capsys.readouterr().out == ''  # flushed with call to `readouterr()`
    # passes: test_capture_print PASSED [100%]

In the above block, we first get all of the printed output and store it as cr. We can then make as many checks on that object as we want.

We could also choose to further separate out all of the printed messages by splitting on newlines (\n) to ensure that we’ve gotten the correct ordering:

def test_capture_print(capsys):
    main_print()
    cr = capsys.readouterr()  # collect out/err
    stdout_lines = cr.out.split('\n')
    assert stdout_lines[0] == 'Process started.'  # first print statement
    assert stdout_lines[1].startswith('Done! Process took ')  # second print statement
    assert cr.err == ''  # no `stderr` output
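
One detail worth noting when splitting: every `print` call ends its output with a newline, so `split('\n')` leaves an empty string as the final element, whereas `str.splitlines()` does not. The sketch below uses a literal string standing in for `cr.out` (the elapsed time is made up for illustration) to show the difference.

```python
# Stand-in for `cr.out`; the elapsed time here is illustrative only
captured_out = 'Process started.\nDone! Process took 1.0.\n'

split_lines = captured_out.split('\n')   # keeps a trailing empty string
clean_lines = captured_out.splitlines()  # one element per printed line

print(split_lines)  # ['Process started.', 'Done! Process took 1.0.', '']
print(clean_lines)  # ['Process started.', 'Done! Process took 1.0.']
```

With `splitlines()`, a check such as `len(clean_lines) == 2` counts only the lines that were actually printed.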

Capturing Error Output

In addition to stdout, capsys provides stderr via the CaptureResult.err attribute. We tested this to ensure it was empty in the last test from the previous section, but what exactly would this be used for? To be honest, I’m not sure, as there seem to be better approaches for both possibilities:

Errors

First, let’s consider this function, where we want to ensure that x or y (or both) is passed to the function with a truthy value (i.e., bool(value) is not False) before continuing. We can then add a test to make sure this works:

def main_raise(x=None, y=None):
    if not x and not y:
        raise ValueError(f'Must specify value for either x or y, but {x=} and {y=}.')
    # do something with x or y

Here’s our test:

def test_raises():
    main_raise()
    # fails: ValueError: Must specify value for either x or y, but x=None and y=None.

Well, this exception gets printed to stderr, so we might consider checking it with capsys:

def test_raises(capsys):
    with pytest.raises(ValueError) as exc_info:
        main_raise()
    assert 'but x=None and y=None.' in capsys.readouterr().err
    # fails: 
    #  Expected :''
    #  Actual   :'but x=None and y=None.'

Why does this not work? It’s true that an uncaught exception will print its traceback to stderr, but it will also cause the pytest test to fail (because, naturally, an exception was raised). Unfortunately, if we catch this exception (via pytest.raises or a try-except block), then it won’t get printed to stderr. Instead, we need to use this pattern, which accesses the exception info:

def test_raises(capsys):
    with pytest.raises(ValueError) as exc_info:
        main_raise()
    assert exc_info.type == ValueError
    assert 'but x=None and y=None.' in str(exc_info.value)
    assert capsys.readouterr().err == ''  # nothing output to `stderr`
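
To illustrate why stderr stays empty, here is a small sketch outside of pytest that uses only the standard library. It redirects stderr into a buffer and catches the exception; the helper name `call_and_capture_stderr` is made up for this example. A traceback is only written to stderr when an exception goes unhandled, so a caught exception leaves the buffer empty and its message is available on the exception object instead.

```python
import contextlib
import io


def main_raise(x=None, y=None):
    if not x and not y:
        raise ValueError(f'Must specify value for either x or y, but {x=} and {y=}.')


def call_and_capture_stderr():
    """Hypothetical helper: call main_raise(), return (exception, stderr text)."""
    buffer = io.StringIO()
    caught = None
    with contextlib.redirect_stderr(buffer):
        try:
            main_raise()
        except ValueError as exc:  # handled here, so no traceback is printed
            caught = exc
    return caught, buffer.getvalue()


caught, stderr_text = call_and_capture_stderr()
print(repr(stderr_text))  # ''
print(caught)  # Must specify value for either x or y, but x=None and y=None.
```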

Warnings

What about warnings, such as deprecation warnings? Here’s a function which issues a deprecation warning:

import warnings

def main_warning():
    warnings.warn('Function will be removed in 2.0.0', DeprecationWarning)

I would have thought this could be captured in stderr, with this:

def test_warning(capsys):
    main_warning()
    assert 'Function will be removed in 2.0.0' in capsys.readouterr().err
    # fails: 
    # Expected :''
    # Actual   :'Function will be removed in 2.0.0'

The proper way to do this is similar to pytest.raises, except with pytest.warns:

def test_warning():
    with pytest.warns(DeprecationWarning) as warn:  # fails if no warning made
        main_warning()
    assert len(warn) == 1
    assert warn[0].message.args[0] == 'Function will be removed in 2.0.0'
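
For intuition about what “recording” a warning means, the following standard-library sketch collects warnings into a list rather than letting them be displayed. This is not pytest’s implementation, only an illustration of the general idea; `collect_warnings` is a hypothetical helper name.

```python
import warnings


def main_warning():
    warnings.warn('Function will be removed in 2.0.0', DeprecationWarning)


def collect_warnings(func):
    """Hypothetical helper: call func() and return the warnings it issued."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')  # make sure no warning is filtered out
        func()
    return caught


caught = collect_warnings(main_warning)
print(len(caught))             # 1
print(caught[0].category)      # <class 'DeprecationWarning'>
print(str(caught[0].message))  # Function will be removed in 2.0.0
```

Because the warning is recorded as an object, its category and message can be checked directly, much like the `warn[0].message` access in the test above.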

Testing a TUI

Let’s build a quick text user interface and test it out using capsys. We’ll make a simple app to manage a TODO list.

First, let’s create the app and give it methods to add an item, remove an item, or list all of the current items.

class TodoListApp:
    def __init__(self):
        self.todos = []

    def add(self, item):
        self.todos.append(item)
        print(f"Added: {item}")

    def list(self):
        if not self.todos:
            print("No todos.")
        else:
            for idx, item in enumerate(self.todos, start=1):
                print(f"{idx}. {item}")

    def remove(self, idx):
        if 1 <= idx <= len(self.todos):
            removed = self.todos.pop(idx - 1)
            print(f"Removed: {removed}")
        else:
            print("Invalid index.")

We’ll skip creating the actual user interface and instead manipulate the class directly to ensure it produces the proper output. Here are some basic tests we can implement, ensuring that each method call produces the proper output. (We should also add some tests to ensure the internal state is correct, but we’ll omit that in this example.)

import pytest
from app import TodoListApp


@pytest.fixture(scope='function')
def app():
    # do any setup/teardown
    return TodoListApp()

def get_stdout_lines(capsys):
    """Get stdout as list of lines"""
    return capsys.readouterr().out.split('\n')

def test_add_and_list(app, capsys):
    app.add('Buy milk')
    app.add('Read book')
    stdout_lines = get_stdout_lines(capsys)
    assert stdout_lines[0] == 'Added: Buy milk'
    assert stdout_lines[1] == 'Added: Read book'

    app.list()
    stdout_lines = get_stdout_lines(capsys)
    assert stdout_lines[0] == '1. Buy milk'
    assert stdout_lines[1] == '2. Read book'


def test_list_empty(app, capsys):
    app.list()
    cr = capsys.readouterr()
    assert cr.out == 'No todos.\n'  # `print` appends a newline


def test_remove_valid(app, capsys):
    app.add('Buy eggs')
    app.remove(1)
    stdout_lines = get_stdout_lines(capsys)
    assert stdout_lines[1] == 'Removed: Buy eggs'


def test_remove_invalid(app, capsys):
    app.add('A')
    app.remove(2)
    stdout_lines = get_stdout_lines(capsys)
    assert 'Invalid index.' in stdout_lines

# passes:
#  test_add_and_list PASSED                                    [ 25%]
#  test_list_empty PASSED                                      [ 50%]
#  test_remove_valid PASSED                                    [ 75%]
#  test_remove_invalid PASSED                                  [100%]
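
The internal-state checks mentioned above might look like the following sketch. To keep the block self-contained, it repeats a trimmed copy of `TodoListApp` (only `add` and `remove`); in a real test module the class would be imported from `app` and supplied by the `app` fixture instead. The test names are illustrative.

```python
class TodoListApp:
    """Trimmed copy of the class above, repeated so this block runs on its own."""

    def __init__(self):
        self.todos = []

    def add(self, item):
        self.todos.append(item)
        print(f'Added: {item}')

    def remove(self, idx):
        if 1 <= idx <= len(self.todos):
            removed = self.todos.pop(idx - 1)
            print(f'Removed: {removed}')
        else:
            print('Invalid index.')


def test_state_after_add():
    app = TodoListApp()
    app.add('Buy milk')
    app.add('Read book')
    assert app.todos == ['Buy milk', 'Read book']  # insertion order preserved


def test_state_after_invalid_remove():
    app = TodoListApp()
    app.add('A')
    app.remove(2)  # out of range: prints a message, list is untouched
    assert app.todos == ['A']
```

These tests need no `capsys` at all, since they inspect `app.todos` rather than the printed text.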
