Run in CI with other tools
In order to ensure that the code in the notebook is correct, and will run as expected on a student’s computer, I run the notebook files in a CI system. This isn’t the place to teach you CI, but some hints about how I set things up might be helpful.
The main extra tool I use is nbval, a pytest plugin for Jupyter Notebooks. It re-runs every notebook in CI and checks that the fresh output matches the output saved in the notebook file. Once you have installed nbval, you can run it through pytest with:
$ pytest --nbval *.ipynb
This not only checks that everything matches, it also actually runs the code in the notebooks, so cells which produce output files etc. used by later cells will be run.
Check the nbval docs for all the details, but two useful tips: tagging a notebook cell with nbval-ignore-output will run the cell but ignore any mismatched output, and tagging it with nbval-skip will skip running that cell entirely.
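Tags are stored in each cell's metadata inside the .ipynb JSON, so they can be added by a script as well as through the Jupyter interface. The following is a minimal sketch using only the standard library; the helper name add_tag and the two example cells are made up for illustration.

```python
import json


def add_tag(notebook: dict, cell_index: int, tag: str) -> None:
    """Add `tag` to one cell's metadata unless it is already present."""
    cell = notebook["cells"][cell_index]
    tags = cell.setdefault("metadata", {}).setdefault("tags", [])
    if tag not in tags:
        tags.append(tag)


# A tiny hand-written notebook (hypothetical content) in the nbformat 4 layout.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "code", "source": "import time\ntime.time()",
         "metadata": {}, "outputs": [], "execution_count": None},
        {"cell_type": "code", "source": "input('Name: ')",
         "metadata": {}, "outputs": [], "execution_count": None},
    ],
}

add_tag(notebook, 0, "nbval-ignore-output")  # run the cell, do not compare output
add_tag(notebook, 1, "nbval-skip")           # do not run the cell at all

# This text could be written back out to an .ipynb file.
serialised = json.dumps(notebook, indent=1)
```

For a real notebook you would load the file with json.load first and write the result back afterwards.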
There is a middle ground between exact matching and ignoring the output altogether: "sanitisers", which allow partially mismatched cell outputs to match.
One common place this is helpful is in Python, where the output of a cell will often be something like <matplotlib.lines.Line2D at 0x7f5ac83a53d0>, where the last part is a memory address which will be different every time you run.
To have nbval ignore this, you can set a sanitiser which converts this to <matplotlib.lines.Line2D at MEMORY_ADDRESS> inside the test so that it always matches:
[Memory addresses]
regex: (<[a-zA-Z_][0-9a-zA-Z_.]* at )(0x[0-9a-fA-F]+)(>)
replace: \1MEMORY_ADDRESS\3
You can then run nbval with:
$ pytest --nbval --sanitize-with sanitize.cfg *.ipynb
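To see what the memory-address pattern does, it can be tried directly with Python's re module. This is only a sketch of the substitution itself, not of how nbval applies it internally, and the second address is made up.

```python
import re

# The same pattern and replacement as in the [Memory addresses] section.
pattern = re.compile(r"(<[a-zA-Z_][0-9a-zA-Z_.]* at )(0x[0-9a-fA-F]+)(>)")
replacement = r"\1MEMORY_ADDRESS\3"

saved = "<matplotlib.lines.Line2D at 0x7f5ac83a53d0>"  # stored in the notebook
fresh = "<matplotlib.lines.Line2D at 0x7f9e1c2b4a90>"  # made-up address from a new run

sanitised_saved = pattern.sub(replacement, saved)
sanitised_fresh = pattern.sub(replacement, fresh)

# The raw strings differ, but the sanitised ones are identical.
print(saved == fresh)
print(sanitised_saved == sanitised_fresh)
```

Groups 1 and 3 keep the text around the address, so only the address itself is replaced.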
Other useful sanitisers are Windows/UNIX line ending normalisation and standardising the output of the %%writefile magic:
[newlines]
regex: \r\n
replace: \n
[Memory addresses]
regex: (<[a-zA-Z_][0-9a-zA-Z_.]* at )(0x[0-9a-fA-F]+)(>)
replace: \1MEMORY_ADDRESS\3
[writefile magic]
regex: ^Overwriting
replace: Writing
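As a rough way to check a set of rules before pushing to CI, the same file can be read with Python's configparser and each rule applied in turn. This is an approximation under the assumption that rules are plain re.sub calls applied in file order; it is not nbval's own code, and load_rules and sanitise are hypothetical helper names.

```python
import configparser
import re

# The same three sections as above, embedded so the sketch is self-contained.
SANITIZE_CFG = r"""
[newlines]
regex: \r\n
replace: \n

[Memory addresses]
regex: (<[a-zA-Z_][0-9a-zA-Z_.]* at )(0x[0-9a-fA-F]+)(>)
replace: \1MEMORY_ADDRESS\3

[writefile magic]
regex: ^Overwriting
replace: Writing
"""


def load_rules(text):
    """Return (regex, replace) pairs in the order they appear in the file."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return [(parser[name]["regex"], parser[name]["replace"])
            for name in parser.sections()]


def sanitise(output, rules):
    """Apply every rule in order to one piece of cell output."""
    for regex, replace in rules:
        output = re.sub(regex, replace, output)
    return output


rules = load_rules(SANITIZE_CFG)
first_run = sanitise("Writing hello.py\n", rules)
second_run = sanitise("Overwriting hello.py\r\n", rules)
```

Here first_run and second_run end up as the same string, which is the point of the last rule: %%writefile prints "Writing" the first time and "Overwriting" on later runs.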
GitLab
An example configuration for GitLab's CI system is the following, which:
Installs the requirements from requirements.txt (e.g. numpy, matplotlib etc.) and requirements-dev.txt (e.g. nbval, nbpretty etc.)
Runs the code cells and checks the tests
Outputs the results of the tests to the GitLab UI
Runs nbpretty and copies the outputs to an appropriate place
Publishes the output to GitLab Pages
image: python:3.8

before_script:
  - pip install -r requirements-dev.txt -r requirements.txt

test:
  stage: test
  script:
    - pytest --nbval --sanitize-with sanitize.cfg --junit-xml=rspec.xml *.ipynb
  artifacts:
    reports:
      junit: rspec.xml

pages:
  stage: deploy
  script:
    - nbpretty .
    - mkdir -p public
    - cp *.{html,css,png,svg} public/ || true
  artifacts:
    paths:
      - public
  only:
    - master
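The cp line in the pages job relies on shell brace expansion, and the || true keeps the job from failing when one of the patterns matches nothing. If you would rather not depend on shell behaviour, the mkdir and cp steps could be approximated in Python. This is a sketch only; copy_site_files is a hypothetical helper, and the demo files are created in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def copy_site_files(source: Path, destination: Path,
                    suffixes=(".html", ".css", ".png", ".svg")) -> list:
    """Copy files with the given suffixes from source into destination."""
    destination.mkdir(parents=True, exist_ok=True)  # like `mkdir -p public`
    copied = []
    for path in sorted(source.iterdir()):
        # Directories and other file types are ignored; no match is not an error.
        if path.is_file() and path.suffix in suffixes:
            shutil.copy2(path, destination / path.name)
            copied.append(path.name)
    return copied


# Demo in a throwaway directory: one page to publish, one notebook to leave behind.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "index.html").write_text("<html></html>")
    (root / "notes.ipynb").write_text("{}")
    copied = copy_site_files(root, root / "public")
    published = sorted(p.name for p in (root / "public").iterdir())
```

In the CI job this would be called as copy_site_files(Path("."), Path("public")) after nbpretty has run.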