Temporary files

Note

This tutorial makes use of concepts from the Python snippets and Exceptions tutorials, although neither is central to the main points.

To test functions that interact with the file system, it is often necessary to construct a handful of files or directories for each test. Such tests typically use the tmp_path fixture provided by pytest to create a unique temporary directory for each test. However, that directory still needs to be populated with the files relevant to the test. Parametrize From File, in conjunction with the tmp_files plugin, lets you specify those files in the parameter file, right alongside the other parameters for each test case.

To give an example, let’s return to the Vector class from the Getting started tutorial and add a function to parse a list of Vector objects from a text file. The format is as follows: each line should have exactly two space-separated numbers, which will be interpreted as the x- and y-coordinates of a vector:

vector.py
class Vector:

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, v):
        return self.x == v.x and self.y == v.y

def load(path):
    vectors = []

    with open(path) as f:
        for i, line in enumerate(f):
            fields = line.split()
            if len(fields) != 2:
                raise IndexError(f"line {i+1}: expected 2 coordinates, got {len(fields)}")

            x, y = map(float, fields)
            v = Vector(x, y)
            vectors.append(v)

    return vectors

Below are five test cases, three for files that should load successfully and two for files that shouldn’t. The tmp_files parameters specify the files to create for each test case, as mappings of file names to file contents. Refer to the tmp_files documentation for more information on how to specify this parameter.

test_vector.nt
test_load:
  -
    id: empty-file
    tmp_files:
      vectors.txt:
    expected:
      []
  -
    id: one-vector
    tmp_files:
      vectors.txt:
        > 1 2
    expected:
      - Vector(1, 2)
  -
    id: two-vectors
    tmp_files:
      vectors.txt:
        > 1 2
        > 3 4
    expected:
      - Vector(1, 2)
      - Vector(3, 4)
  -
    id: err-too-many-coords
    tmp_files:
      vectors.txt:
        > 1 2 3
    error:
      type: IndexError
      message: line 1: expected 2 coordinates, got 3
  -
    id: err-not-float
    tmp_files:
      vectors.txt:
        > hello world
    error: ValueError
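
To make the idea of "a mapping of file names to file contents" concrete, here is a minimal sketch of what materializing such a mapping in a directory could look like. The write_files helper is hypothetical and written only for illustration; it is not the tmp_files implementation, which supports more than plain text contents (see its documentation).

```python
import tempfile
from pathlib import Path


def write_files(root, files):
    """Create each file in `files` (a name -> contents mapping) under `root`.

    Hypothetical helper for illustration only; not part of tmp_files.
    """
    root = Path(root)
    for name, contents in files.items():
        path = root / name
        # Allow names like "data/vectors.txt" by creating parent directories.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(contents)
    return root


# Roughly mirrors the "two-vectors" test case above.
with tempfile.TemporaryDirectory() as tmp:
    root = write_files(tmp, {"vectors.txt": "1 2\n3 4\n"})
    lines = (root / "vectors.txt").read_text().splitlines()
```

In a real test, the tmp_files fixture does this work for you and hands the test function the directory containing the files.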

Below is the test function:

test_vector.py
import vector
import parametrize_from_file as pff

with_vec = pff.Namespace('from vector import *')

@pff.parametrize(
        schema=[
            pff.cast(expected=with_vec.eval),
            pff.error_or('expected'),
        ],
        indirect=['tmp_files'],
)
def test_load(tmp_files, expected, error):
    with error:
        assert vector.load(tmp_files / 'vectors.txt') == expected

There are a few things to note about this test function:

  • The indirect=['tmp_files'] argument is critical. This is how the tmp_files fixture knows to create the files specified by the tmp_files parameters.

  • All of the tests use the same hard-coded file name. If we had wanted the file name to be part of the test (e.g. to parse differently based on the file extension, to test the “file not found” error, etc.), we would’ve added another parameter specifying which file to load.

  • The advantage of using Parametrize From File in conjunction with tmp_files, as opposed to just using tmp_files by itself, is readability. The Python syntax for specifying parameters becomes hard to read when lots of multi-line strings are used, and files tend to have multiple lines. File formats like YAML, TOML, and NestedText do not have this problem.

  • This test also checks that the appropriate exceptions are raised for malformed files. This isn’t directly relevant to the task of using temporary files in tests, but it’s an important part of testing a parser.
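
To illustrate the readability point, here is a sketch of how three of the test cases above might look if the parameters were written directly in Python, for example to pass to pytest.mark.parametrize. The CASES name and the tuple layout are assumptions made for this example, and the expected values are plain coordinate tuples rather than Vector objects so that the snippet stands alone.

```python
from textwrap import dedent

# Hypothetical plain-Python version of three of the test cases above.
CASES = [
    # (id, files to create, expected coordinates)
    ("empty-file", {"vectors.txt": ""}, []),
    ("one-vector", {"vectors.txt": "1 2\n"}, [(1, 2)]),
    (
        "two-vectors",
        {
            # Multi-line contents need either embedded "\n" escapes or a
            # dedented triple-quoted string to stay aligned with the code.
            "vectors.txt": dedent("""\
                1 2
                3 4
            """),
        },
        [(1, 2), (3, 4)],
    ),
]

ids = [case[0] for case in CASES]
two_vectors = CASES[2][1]["vectors.txt"]
```

Even with only two lines per file, the quoting and indentation start to compete with the data for attention; the NestedText version keeps the file contents visually separate from the surrounding structure.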