GitHub Actions¶
GitHub Actions are automation scripts that can be configured to run in response to events triggered in online GitHub repositories.
A wide range of triggering events are available, including manual triggers, pushes and pull requests, releases, issue related events, and schedule based (cron) triggers.
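As an illustrative sketch, the on: block of a workflow might combine several of these trigger types (the event types and cron schedule shown here are arbitrary examples, not taken from any repository in this section):
on:
  # Manual trigger from the repository Actions page
  workflow_dispatch:
  # Run on pushes and pull requests
  push:
  pull_request:
  # Run when a release is created
  release:
    types: [created]
  # Scheduled (cron) trigger, e.g. daily at 06:00 UTC
  schedule:
    - cron: '0 6 * * *'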
Whilst actions can be constructed from scratch, a wide range of predefined, free, third-party actions are available on the GitHub Marketplace. These can be combined with your own action scripts or commands to define an automated workflow.
For example, an off-the-shelf action might be used to identify whether files of a particular type have been modified in a particular push or pull request, before deciding whether to run the rest of the action. This can be useful if you want to limit an action to only run if .ipynb
notebook documents have been recently modified.
In this section, we will review how GitHub Actions can be used as part of a notebook quality process, as well as part of an automated notebook release process. A variety of examples will demonstrate:
how to manually trigger an action;
how to define an action to run inside a pre-defined, Docker containerised environment;
how to trigger actions against push events and pull requests;
how to run commands only over changed files.
In terms of automated behaviours, this section will cover:
automated spell-checking;
automated notebook testing using nbval;
automated notebook and markdown paired document synchronisation using jupytext.
Manually Triggering Spell-Checking Actions and Generating Downloadable Action Artefacts¶
Let’s start with a simple example of a manually triggered action.
Such an action might be used to trigger a spell checker, for example, that is run over all files, or a specified collection of files, in a repository.
In the following example, the action can be started from the Actions page of a GitHub repository by a user with appropriate permissions on the repository.
Show GitHub Action: manual spell check
Manual action to spellcheck all files using pyspelling.
name: spelling-full-test
# First we define the Action trigger
on:
  # In this case, we define a manual trigger (workflow_dispatch)
  workflow_dispatch
# Define one or more jobs to run via the action
jobs:
  # Set a job name
  spelling-full-demo:
    # Define the base OS that will run the action
    runs-on: ubuntu-latest
    # The job will include one or more steps
    steps:
      # We need to get some files to work on...
      # Start by checking out the master branch of the repository
      - uses: actions/checkout@master
      # The spelling checker requires some particular packages
      - name: Install spelling packages
        # Run a set of shell commands to install required packages
        run: |
          sudo apt-get update && sudo apt-get install -y aspell aspell-en
          # The following fork of pyspelling supports notebook md and code cell filters
          python3 -m pip install --upgrade https://github.com/ouseful-PR/pyspelling/archive/th-ipynb.zip
      # Spell check the files
      - name: spellcheck all files
        # Note that we can force the use of a particular shell if required
        run: |
          touch typos.txt
          touch .wordlist.txt
          # The spellchecker typically displays to stdout
          # which is visible in the action run history view
          # The .ipyspell.yml file defines the target file paths that the
          # spellchecker is applied to
          # We can also force spellchecker output into a file
          # The `|| true` prevents reported typos from failing this step
          pyspelling -c .ipyspell.yml | tee -a typos.txt || true
          # Display the file contents
          cat typos.txt
        shell: bash
        # We could let the action fail on errors
        # If we want that halting behaviour, comment out the following line
        continue-on-error: true
      # Saving the spellchecker output to a file then displaying it,
      # rather than just displaying it, may seem redundant
      # But we can upload the generated file to the Action report page
      # and make it available for download...
      - name: Upload all typos
        uses: actions/upload-artifact@v2
        with:
          name: typos
          path: |
            typos.txt
We can follow the progress of the GitHub Action as the various jobs and steps are run:
The Action report status page maintains a log of outputs generated as the action runs. We can use this to review the output report:
Since this particular action also generated and uploaded an Action artefact (a zip file containing the spelling report), we can access that asset and download a copy of the report to work with offline.
For additional spell-checking actions, see for example: rojopolis/spellcheck-github-actions (a pyspelling based spell-checking action).
Checking Notebook Execution Against a Particular Containerised Environment¶
We can use the nbval package to check the cell outputs of one or more notebooks against an already-run reference copy of each tested notebook. This allows us to check that environment updates have not jeopardised the intended notebook execution, for example.
Typically, the notebooks might be designed to work within a particular environment. We could use requirements style files to specify a list of packages that need to be installed, or we can set up the action to run against a particular containerised environment.
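For example, a minimal sketch of the requirements-file route might install packages into the default runner environment rather than using a container (the requirements.txt file at the repository root is an assumption, not part of the examples in this section):
    steps:
      - uses: actions/checkout@v2
      # Install the packages the notebooks need into the runner environment
      - name: Install requirements
        run: |
          python3 -m pip install --upgrade pip
          python3 -m pip install -r requirements.txt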
The following GitHub Action definition script shows how we can test the execution of a set of notebooks against a pre-built Docker container, such as a container created to support a particular student module.
Show GitHub Action: test notebooks with nbval
Manual action to test notebooks in a containerised environment using nbval.
name: nbval-example
on:
  # Use a manual trigger
  workflow_dispatch
jobs:
  nbval-demo:
    runs-on: ubuntu-latest
    # Specify we want to run the tests inside a particular containerised environment
    container:
      image: ouvocl/vce-tm351-monolith
    # Define the action steps
    steps:
      # Checkout the desired branch
      - uses: actions/checkout@master
      # Install any additional requirements into the environment
      - name: Install nbval (TH edition)
        run: |
          python3 -m pip install --upgrade https://github.com/ouseful-PR/nbval/archive/table-test.zip
      # Ensure that required services are running
      - name: Restart postgres
        run: |
          sudo service postgresql restart
      # Explicitly start an application required for testing
      - name: Start mongo
        run: |
          # The environment variable is already defined in the container
          sudo mongod --fork --logpath /dev/stdout --dbpath ${MONGO_DB_PATH}
      # Run an installation test notebook
      - name: TM351 installation test
        run: |
          py.test --nbval TM351*.ipynb
        # The following line means that the action will continue
        # rather than fail if this step fails
        continue-on-error: true
The action report shows that each step has completed, as well as how long each step took to complete. We can see in this case that the installation test notebook cells all passed correctly.
If we have a test that fails, but have qualified a step with continue-on-error: true, we can review the failed tests without breaking the flow of the action:
We can further see that the step generated an error code, but that it was trapped by the continue-on-error: true setting and the action continued:
If we remove the continue-on-error setting, or set continue-on-error: false (the default), the action would have failed:
If we check the report, we see the failure was blocking and prevented execution of the next step. However, certain other (housekeeping) steps that are defined to always run did execute.
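Such always-run housekeeping steps can be marked with an if: always() condition at the step level; a minimal sketch (the step content is illustrative, not taken from the example workflow):
    steps:
      # ... earlier steps that may fail ...
      # This housekeeping step runs even if an earlier step in the job failed
      - name: Housekeeping
        if: always()
        run: |
          echo "Tidying up..."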
Triggering Actions Based on the Presence of Particular Modified File Types¶
The dorny/paths-filter GitHub Action “enables conditional execution of workflow steps and jobs, based on the files modified by pull request, on a feature branch, or by the recently pushed commits”.
Show GitHub Action: push or pull request involving specified files
Push or pull request action with additional filter to check file types.
name: nbval-partial-test
# Trigger the action from a push or pull request
on:
  push:
  pull_request:
    types: [opened, edited]
jobs:
  # Check that files of a particular type have changed
  changed-filetype-filter:
    runs-on: ubuntu-latest
    # Define action steps
    steps:
      # Checkout
      - uses: actions/checkout@v2
      # Check that particular file types have been changed
      - uses: dorny/paths-filter@v2
        id: changes
        with:
          filters: |
            src:
              - 'src/**'
      # Only run if some file in the 'src' folder was changed
      - if: steps.changes.outputs.src == 'true'
        run: |
          # These steps will only run if files on the src/ path changed
          echo "At least one file on the src/ path was changed"
We can also use the action in association with other jobs, creating the guard in one job and then referencing it in another.
Show GitHub Action: specified file filter across jobs
Push action with additional file filter checked across multiple jobs.
name: notebook-checking
on:
  push
jobs:
  changes:
    runs-on: ubuntu-latest
    # Set job outputs to values from filter step
    outputs:
      notebooks: ${{ steps.filter.outputs.notebooks }}
    steps:
      # In a PR, we do not need to set the fetch-depth
      - uses: actions/checkout@v2
      - uses: dorny/paths-filter@v2
        id: filter
        with:
          filters: |
            notebooks:
              - added|modified: '**.ipynb'
  # This job will run if there were changed notebooks
  job-for-notebooks:
    needs: changes
    if: ${{ needs.changes.outputs.notebooks == 'true' }}
    runs-on: ubuntu-latest
    steps:
      #...
Running Commands Against Changed Files¶
As well as allowing us to create a guard that checks whether files of a particular type have been modified, the dorny/paths-filter action can also pass the names of changed files, allowing us to act on them directly:
Show GitHub Action: run step over modified files
Push action that captures modified filenames, with optional filter, and then runs actions over modified files.
name: process-changes-md
on:
  push
jobs:
  changes-spellcheck:
    runs-on: ubuntu-latest
    steps:
      # Checkout
      - uses: actions/checkout@v2
      # Test for markdown files
      - uses: dorny/paths-filter@v2
        id: filter
        with:
          # Enable listing of files matching each filter.
          # Paths to files will be available in the `${FILTER_NAME}_files` output variable.
          # Paths will be escaped and space-delimited.
          # Output is usable as a command-line argument list in a Linux shell
          list-files: shell
          # In this example, changed markdown files will be spellchecked
          # If we specify we are only interested in added or modified files, deleted files are ignored
          filters: |
            notebooks:
              - added|modified: '**.md'
      - name: Install packages if changed files
        if: ${{ steps.filter.outputs.notebooks == 'true' }}
        run: |
          sudo apt-get update && sudo apt-get install -y aspell aspell-en
          pip install codespell
      - name: Process changed files
        if: ${{ steps.filter.outputs.notebooks == 'true' }}
        run: |
          # If a command accepts a list of files,
          # we can pass them directly
          # This will cause the action to error if there is a typo
          codespell ${{ steps.filter.outputs.notebooks_files }}
          # Alternatively, we might iterate over the files one at a time
          # If we had comma separated filenames, we could specify the separator
          #IFS=","
          # Generate an array of the modified file names
          read -a added_modified_files <<< "${{ steps.filter.outputs.notebooks_files }}"
          # Then iterate over each filename
          for added_modified_file in "${added_modified_files[@]}"; do
            # The || continue statement will ensure that
            # even with errors, the action continues
            codespell "$added_modified_file" || continue
          done
        shell: bash
Synchronising Markdown and Notebook Files Using jupytext¶
An editor making amendments directly to a markdown text document that is intended to be paired with an .ipynb notebook file may well make such changes in an arbitrary text editing environment. Since such an editor is likely to be operating outside the context of a Jupyter server process running the jupytext server extension, it is quite likely that the text and .ipynb versions will be out of synch. (The use of git pre-commit hooks could help keep files in synch at the commit level for editors working under local git version control.)
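For example, a minimal .pre-commit-config.yaml sketch using the pre-commit hook provided by jupytext might look like the following (the rev shown is illustrative and would need pinning to a real release tag):
repos:
  - repo: https://github.com/mwouts/jupytext
    # Illustrative revision; pin to an actual jupytext release tag
    rev: v1.14.7
    hooks:
      - id: jupytext
        # Synchronise paired documents on every commit
        args: [--sync]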
We can define an action that will attempt to synchronise changed markdown files if they form part of a push or pull request. Specifically, if a markdown file is added or modified, and if it contains jupytext pairing metadata, we can ensure that any other paired documents are synchronised. (If a changed markdown document is not recognised by jupytext as a paired document, it will not be synchronised.)
Using the stefanzweifel/git-auto-commit-action, we can then automatically commit any paired notebook files that are updated as a result of the synchronisation.
Show GitHub Action: jupytext synch and auto-commit
Push action to synchronise changed Markdown files using jupytext and commit updated notebooks.
name: jupytext-changes
on:
  push
jobs:
  sync-jupytext:
    runs-on: ubuntu-latest
    steps:
      # Checkout
      - uses: actions/checkout@v2
      # Test for markdown files
      - uses: dorny/paths-filter@v2
        id: filter
        with:
          # Enable listing of files matching each filter.
          # Paths to files will be available in the `${FILTER_NAME}_files` output variable.
          # Paths will be escaped and space-delimited.
          # Output is usable as a command-line argument list in a Linux shell
          list-files: shell
          # In this example, changed markdown files will be synchronised using jupytext
          # If we specify we are only interested in added or modified files, deleted files are ignored
          filters: |
            notebooks:
              - added|modified: '**.md'
          # Should we also identify deleted md files
          # and then try to identify (and delete) .ipynb docs otherwise paired to them?
          # For example, remove the .ipynb file on the same path ($FILEPATH is a file with a .md suffix)
          # rm ${FILEPATH%.md}.ipynb
      - name: Install packages if changed files
        if: ${{ steps.filter.outputs.notebooks == 'true' }}
        run: |
          pip install jupytext
      - name: Synch changed files
        if: ${{ steps.filter.outputs.notebooks == 'true' }}
        run: |
          # If a command accepts a list of files,
          # we can pass them directly
          # This will only synch files if the md doc includes jupytext metadata
          # and has one or more paired docs defined
          # The timestamp on the synched ipynb file will be set to the
          # same time as the changed markdown file
          jupytext --use-source-timestamp --sync ${{ steps.filter.outputs.notebooks_files }}
      # Auto commit any updated notebook files
      - uses: stefanzweifel/git-auto-commit-action@v4
        with:
          # This would be more useful if the git hash were referenced?
          commit_message: Jupytext synch - modified, paired .md files
Note that as it currently stands, the action will not update notebook output cells, which means that if the content of any code cells in the markdown document is modified, the notebook cell outputs may no longer represent a true output from the modified code cells.
It would be possible to run the jupytext action with an --execute switch, although the step would need to be run in an environmental context configured to support the execution of the notebook(s).
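A minimal sketch of that variation, run inside a suitable container (as in the earlier nbval example), might convert and execute the changed markdown files rather than simply synchronising them:
      - name: Convert and execute changed files
        if: ${{ steps.filter.outputs.notebooks == 'true' }}
        run: |
          # Regenerate the paired notebooks from the changed markdown files,
          # executing them so the output cells are refreshed
          jupytext --to notebook --execute ${{ steps.filter.outputs.notebooks_files }}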
Creating Releases¶
When a release is created, the state of the repository at a particular point in time is tagged as a release and referenced via the releases page. This provides a convenient way of publishing snapshots of the repository contents at a very specific version of the files.
In addition, releases can also be annotated with downloadable file bundles that can be accessed from the release note on the releases page.
When creating bundles of files for release to students, this mechanism provides a convenient way of publishing such artefacts.
The following action provides an example of how to generate a release, triggered by a push tagged with a release- prefixed tag. The tm351zip command (from the innovationOUtside/nb_workflow_tools Python package) is used to create the downloadable zip file containing a set of specified notebooks with their output cells cleared. We could alternatively create a downloadable bundle in which the notebook cells are all freshly run by using the -r runWithErrors switch.
Having created the zip file, we save a listing of the zip file contents as a file that we can then use as the body of the release post announcement.
The release itself is published using the third-party softprops/action-gh-release GitHub Action.
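The release- prefixed tag can also be created and pushed from the command line; for example (the tag name is illustrative):
# Tag the current commit and push the tag to trigger the release workflow
git tag release-2024-01
git push origin release-2024-01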
Show GitHub Action: release tagged push to trigger release
Manual action, or push action with a release tag, that triggers a release.
name: example-release
on:
  push:
    # Sequence of patterns matched against refs/tags
    tags:
      - 'release-*'
  workflow_dispatch:
jobs:
  release-demo:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@master
      - name: Package files
        run: |
          # Package all files we want in the release
          # We also ensure notebook outputs are cleared
          tm351zip -r clearOutput -a notebooks/*.ipynb release.zip
          # Add a listing of the zip file to the release note
          echo "Released files:" > release-files.txt
          tm351zipview release.zip >> release-files.txt
      - name: Create Release
        id: create_release
        uses: softprops/action-gh-release@v1
        # The commit must be tagged for a release to happen
        # Tags can be added via the GitHub Desktop app
        # https://docs.github.com/en/desktop/contributing-and-collaborating-using-github-desktop/managing-commits/managing-tags#creating-a-tag
        with:
          tag_name: ${{ github.ref }}
          name: Release ${{ github.ref }}
          # Use the release note file that includes the zipped file listing
          # as the body of the release post announcement
          body_path: release-files.txt
          files: |
            release.zip
The bundled files can then be downloaded from the releases page (file listing not included in the following release note):
As well as simply generating a release, we could also run tests prior to the release to check that we are not releasing a broken set of notebooks, as demonstrated in the following action.
The action is triggered using a GitHub release form. The form body is used to specify the paths to files (comma separated, no spaces) that should be included in the release (originally referenced in the action as github.event.release.body).
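For example, a release body along the following lines (the paths are illustrative) would request two notebook directories for testing and bundling:
notebooks/Part_01,notebooks/Part_02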
Note
This action needs improving: currently, we create a release using the release form, and then another release is generated containing the bundled files. Ideally, the tests and the file bundling should be part of the original release creation process, perhaps in the form of a “pre-release” action.
Notebooks along the specified path are tested and then zipped for release. A release note itemising the contents of the zip file is then generated and issued as part of the release.
Show GitHub Action: release tagged push to trigger release with testing
Release-creation triggered action that runs tests and a pre-release notebook preparation process before publishing the bundled files.
name: example-test-release
on:
  release:
    types:
      - created
jobs:
  test-release-demo:
    runs-on: ubuntu-latest
    # Specify the environment we are going to test our notebooks against
    container:
      image: ouvocl/vce-tm351-monolith
    env:
      RELEASE_PATHS: "${{ github.event.release.body }}"
      RELEASE_NAME: "${{ github.event.release.name }}"
    steps:
      - uses: actions/checkout@master
      # Install test dependencies
      - name: Install nbval (TH edition) and workflow tools
        run: |
          python3 -m pip install --upgrade https://github.com/ouseful-PR/nbval/archive/table-test.zip
          python3 -m pip install https://github.com/innovationOUtside/nb_workflow_tools/archive/master.zip
      # Ensure test environment services are running
      - name: Restart postgres
        run: |
          sudo service postgresql restart
      - name: Start mongo
        run: |
          sudo mongod --fork --logpath /dev/stdout --dbpath ${MONGO_DB_PATH}
      # Run tests over files to be released
      - name: Test files in release path
        run: |
          IFS="," read -a file_paths <<< "${{ github.event.release.body }}"
          ls
          # Test all directories
          for file_path in "${file_paths[@]}"; do
            pwd
            py.test --nbval "$file_path" || continue
          done
        shell: bash
        # Uncomment the following to ignore errors and continue running the Action
        #continue-on-error: true
      - name: Create zipped files
        run: |
          IFS="," read -a file_paths <<< "${{ github.event.release.body }}"
          for file_path in "${file_paths[@]}"; do
            tm351zip -r clearOutput -a "$file_path" release.zip
          done
          echo "Release paths:" > release-files.txt
          echo "${RELEASE_PATHS}" >> release-files.txt
          tm351zipview release.zip >> release-files.txt
        shell: bash
      - name: Create Release
        id: create_release
        uses: softprops/action-gh-release@v1
        # The commit must be tagged for a release to happen
        # Tags can be added via the GitHub Desktop app
        # https://docs.github.com/en/desktop/contributing-and-collaborating-using-github-desktop/managing-commits/managing-tags#creating-a-tag
        with:
          tag_name: ${{ github.ref }}-files
          name: ${{ github.event.release.name }} files
          #body: "Release files/directories: ${RELEASE_NOTE}"
          body_path: release-files.txt
          files: |
            release.zip
Inspection of the action log shows the directories being tested and bundled:
We can also inspect the test outputs (not shown) and a report identifying directories as they are added to the zip file:
The release note displays a listing of the files inside the zip file.
Using GitHub Actions to Process OU-XML Documents¶
As well as processing notebooks and text documents, we can use GitHub Actions to automate quality checks against OU-XML source documents.
The process makes use of a crude, informal document conversion utility provided by the innovationOUtside/open-ouxml-tools package that attempts to convert OU-XML documents to MyST flavoured extended markdown, which can then be converted to notebooks using jupytext.
Currently, there is no return conversion path from MyST Markdown to OU-XML.
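Assuming the OU-XML conversion step has already produced a MyST markdown document (the filename below is hypothetical), the jupytext half of the pipeline is a straightforward conversion:
# Convert a MyST markdown document (hypothetical filename) into a notebook
jupytext --to notebook converted-unit.md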
Warning
A proposal for an official bidirectional OU-XML2Markdown pandoc converter was submitted to the OU Test and Learn workstream but abandoned due to an inability to treat the proposal as a JFDI activity.