Run in CI with other tools

To ensure that the code in the notebooks is correct and will run as expected on a student’s computer, I run the notebook files in a CI system. This isn’t the place to teach you CI, but some hints about how I set things up might be helpful.

The main extra tool I use is nbval, a pytest plugin for Jupyter notebooks. When run in the CI, it executes every notebook and checks that each cell’s output matches the output saved in the notebook file. Once you have installed nbval, you can run it through pytest with:

$ pytest --nbval *.ipynb

This not only checks that the outputs match but also actually runs the code in the notebooks, so cells which produce output files etc. used by later cells will be run.

Check the nbval docs for all the details, but two useful tips are that tagging a notebook cell with nbval-ignore-output will run the cell but ignore any mismatched output, and tagging it with nbval-skip will skip running that cell entirely.
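Cell tags are stored in the notebook file itself, as a "tags" list in each cell's metadata. Normally you would set them through the Jupyter interface, but as a rough sketch of what ends up in the file, the following adds the two tags to an in-memory notebook. The helper name add_cell_tag and the example cells are made up for illustration.

```python
import json


def add_cell_tag(notebook: dict, cell_index: int, tag: str) -> None:
    """Add `tag` to one cell of a notebook that has been loaded as a dict.

    Hypothetical helper: tags live in cell["metadata"]["tags"].
    """
    metadata = notebook["cells"][cell_index].setdefault("metadata", {})
    tags = metadata.setdefault("tags", [])
    if tag not in tags:  # avoid duplicate tags when called twice
        tags.append(tag)


# A tiny in-memory stand-in for a real .ipynb file (which is JSON on disk).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "code", "source": "import time; time.time()",
         "metadata": {}, "outputs": [], "execution_count": None},
        {"cell_type": "code", "source": "input('Your name? ')",
         "metadata": {}, "outputs": [], "execution_count": None},
    ],
}

add_cell_tag(notebook, 0, "nbval-ignore-output")  # run it, don't compare output
add_cell_tag(notebook, 1, "nbval-skip")           # don't run it at all

print(json.dumps([cell["metadata"] for cell in notebook["cells"]]))
```

For a real notebook you would load the file with json.load, apply the tags and write it back with json.dump.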

There is some middle ground, though: “sanitisers” allow partially mismatched cell outputs to match. One common place this is helpful is in Python, where the output of a cell will often be something like <matplotlib.lines.Line2D at 0x7f5ac83a53d0>, where the last part is a memory address which will be different on every run. To have nbval ignore this, you can set a sanitiser which converts this to <matplotlib.lines.Line2D at MEMORY_ADDRESS> inside the test so it always matches:

sanitize.cfg
[Memory addresses]
regex: (<[a-zA-Z_][0-9a-zA-Z_.]* at )(0x[0-9a-fA-F]+)(>)
replace: \1MEMORY_ADDRESS\3

You can then run nbval with:

$ pytest --nbval --sanitize-with sanitize.cfg *.ipynb
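To see what the memory-address rule does, you can try the same regular expression with Python's re module. This is only a sketch of the substitution itself, not of how nbval applies it internally. The second memory address is an invented value standing in for a fresh run.

```python
import re

# Pattern and replacement copied from the [Memory addresses] section.
PATTERN = r"(<[a-zA-Z_][0-9a-zA-Z_.]* at )(0x[0-9a-fA-F]+)(>)"
REPLACEMENT = r"\1MEMORY_ADDRESS\3"


def sanitise(text: str) -> str:
    """Replace any memory address inside an object repr with a fixed token."""
    return re.sub(PATTERN, REPLACEMENT, text)


saved = "[<matplotlib.lines.Line2D at 0x7f5ac83a53d0>]"  # stored in the notebook
fresh = "[<matplotlib.lines.Line2D at 0x7f11d42b9a90>]"  # invented "new run" value

print(saved == fresh)                      # False: the raw outputs differ
print(sanitise(saved) == sanitise(fresh))  # True: they match once sanitised
print(sanitise(fresh))  # [<matplotlib.lines.Line2D at MEMORY_ADDRESS>]
```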

Other useful sanitisers are Windows/UNIX line ending normalisation and standardising the output of the %%writefile magic, which prints “Overwriting” rather than “Writing” when the file already exists:

sanitize.cfg
[newlines]
regex: \r\n
replace: \n

[Memory addresses]
regex: (<[a-zA-Z_][0-9a-zA-Z_.]* at )(0x[0-9a-fA-F]+)(>)
replace: \1MEMORY_ADDRESS\3

[writefile magic]
regex: ^Overwriting
replace: Writing
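If you want to check a set of rules before pushing to CI, a rough stand-in for the sanitising step can be written in a few lines of Python. This sketch assumes the file can be read with the standard configparser module and that the rules are applied in order with re.sub. nbval's own parsing may differ, and the names load_rules and apply_rules are made up for illustration.

```python
import configparser
import re

# The same rules as in sanitize.cfg, inlined so the example is self-contained.
SANITIZE_CFG = r"""
[newlines]
regex: \r\n
replace: \n

[Memory addresses]
regex: (<[a-zA-Z_][0-9a-zA-Z_.]* at )(0x[0-9a-fA-F]+)(>)
replace: \1MEMORY_ADDRESS\3

[writefile magic]
regex: ^Overwriting
replace: Writing
"""


def load_rules(cfg_text):
    """Return (regex, replace) pairs in the order the sections appear."""
    parser = configparser.RawConfigParser()  # raw: no % interpolation
    parser.read_string(cfg_text)
    return [(parser[name]["regex"], parser[name]["replace"])
            for name in parser.sections()]


def apply_rules(text, rules):
    """Apply every rule in turn to one piece of cell output."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


RULES = load_rules(SANITIZE_CFG)
print(apply_rules("Overwriting data.txt", RULES))  # Writing data.txt
```

The idea is that both the saved and the freshly produced output go through the same rules before being compared, so only the differences you have not sanitised away cause a failure.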

GitLab

An example configuration for GitLab’s CI system is the following, which:

  • Installs the requirements from requirements.txt (e.g. numpy, matplotlib etc.) and requirements-dev.txt (e.g. nbval, nbpretty etc.)

  • Runs the code cells and checks the tests

  • Outputs the results of the tests to the GitLab UI

  • Runs nbpretty and copies the outputs to an appropriate place

  • Publishes the output to GitLab Pages

.gitlab-ci.yml
image: python:3.8

before_script:
  - pip install -r requirements-dev.txt -r requirements.txt

test:
  stage: test
  script:
    - pytest --nbval --sanitize-with sanitize.cfg --junit-xml=rspec.xml *.ipynb
  artifacts:
    reports:
      junit: rspec.xml

pages:
  stage: deploy
  script:
    - nbpretty .
    - mkdir -p public
    - cp *.{html,css,png,svg} public/ || true
  artifacts:
    paths:
      - public
  only:
    - master
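The cp line in the pages job relies on shell brace expansion, and the trailing || true stops the job failing when one of the patterns matches nothing. If you want to try the same collection step locally without a bash shell, it could be sketched in Python as below. The function name collect_site is hypothetical, and the sketch assumes the generated files sit at the top level of the source directory.

```python
from pathlib import Path
import shutil

# The same extensions as in the cp command of the pages job.
EXTENSIONS = ("html", "css", "png", "svg")


def collect_site(source_dir, public_dir):
    """Copy top-level site files into public_dir, skipping absent file types.

    Returns the names of the copied files. A file type with no matches is
    simply skipped, which is the role || true plays in the shell version.
    """
    source, public = Path(source_dir), Path(public_dir)
    public.mkdir(parents=True, exist_ok=True)  # like mkdir -p public
    copied = []
    for ext in EXTENSIONS:
        for path in sorted(source.glob(f"*.{ext}")):
            shutil.copy(path, public / path.name)
            copied.append(path.name)
    return copied


if __name__ == "__main__":
    print(collect_site(".", "public"))
```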