GitHub Actions

GitHub Actions are automation scripts that can be configured to run in response to triggered events in online GitHub repositories.

A wide range of triggering events is available, including manual triggers, pushes and pull requests, releases, issue-related events, and schedule-based cron triggers.
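
As a rough illustration, each of the trigger types listed above corresponds to an entry under the on: key of a workflow file. The following fragment is a sketch rather than a recommended configuration: the branch name, the event types and the cron schedule are all assumptions chosen for illustration.

```yaml
# Illustrative trigger block for a workflow file (for example,
# .github/workflows/example.yml). Branch name, event types and
# schedule are assumptions, not settings taken from this handbook.
on:
  workflow_dispatch:        # manual trigger from the Actions page
  push:
    branches: [main]        # pushes to the main branch
  pull_request:             # pull requests opened or updated
  release:
    types: [published]      # a release is published
  issues:
    types: [opened]         # an issue is opened
  schedule:
    - cron: '0 6 * * 1'     # 06:00 UTC every Monday
```

In practice a workflow would normally declare only the one or two triggers it actually needs.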

Whilst actions can be constructed from scratch, a wide range of predefined, free, third-party actions is available on the GitHub Marketplace. These can be combined with your own action scripts or commands to define an automated workflow.

For example, an off-the-shelf action might be used to identify whether files of a particular type have been modified in a particular push or pull request, before deciding whether to run the rest of the action. This can be useful if you want to limit an action to only run if .ipynb notebook documents have been recently modified.

In this section, we will review how GitHub Actions can be used as part of a notebook quality process, as well as part of an automated notebook release process. A variety of examples will demonstrate:

  • how to manually trigger an action;

  • how to define an action to run inside a pre-defined, Docker containerised environment;

  • how to trigger actions against push events and pull requests;

  • how to run commands only over changed files.

In terms of automated behaviours, this section will cover:

  • automated spell-checking;

  • automated notebook testing using nbval;

  • automated notebook and markdown paired document synchronisation using jupytext.

Manually Triggering Spell-Checking Actions and Generating Downloadable Action Artefacts

Let’s start with a simple example of a manually triggered action.

Such an action might be used to trigger a spell checker, for example, that is run over all files, or a specified collection of files, in a repository.

In the following example, the action can be started from the Actions page of a GitHub repository by a user with appropriate permissions on the repository.

We can follow the progress of the GitHub Action as the various jobs and steps are run:

The Action report status page maintains a log of outputs generated as the action runs. We can use this to review the output report:

Since this particular action also generated and uploaded an Action artefact (a zip file containing the spelling report), we can access that asset and download a copy of the report to work with offline.
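
A minimal sketch of a workflow along these lines is shown below. It is not the workflow used to produce the report described above: it assumes the codespell package as a stand-in spell checker, and the report filename and artefact name are hypothetical.

```yaml
# Sketch of a manually triggered spell-check workflow.
# Assumptions: codespell stands in for the spell checker, and the
# report/artefact names are hypothetical.
name: Spell check

on:
  workflow_dispatch:   # adds a "Run workflow" button to the Actions page

jobs:
  spellcheck:
    runs-on: ubuntu-latest
    steps:
      - name: Check out the repository
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.x'

      - name: Install the spell checker
        run: python -m pip install codespell

      - name: Run the spell checker and save a report
        # codespell returns a non-zero exit code when it finds
        # misspellings; "|| true" stops that from failing the step.
        run: codespell . > spelling-report.txt || true

      - name: Show the report in the action log
        run: cat spelling-report.txt

      - name: Upload the report as a downloadable artefact
        uses: actions/upload-artifact@v4
        with:
          name: spelling-report
          path: spelling-report.txt
```

The uploaded artefact is offered for download from the run summary page as a zip file containing the report.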

For additional spell-checking actions, see for example: rojopolis/spellcheck-github-actions (a pyspelling based spell-checking action).

Checking Notebook Execution Against a Particular Containerised Environment

We can use the nbval package to check the cell outputs of one or more notebooks against an already run reference copy of each tested notebook. This allows us to check that environment updates have not jeopardised the intended notebook execution, for example.

Typically, the notebooks might be designed to work within a particular environment. We could use requirements-style files to specify a list of packages that need to be installed, or we can set up the action to run against a particular containerised environment.

The following GitHub Action definition script shows how we can test the execution of a set of notebooks against a pre-built Docker container, such as a container created to support a particular student module.
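
As a hedged sketch of what such a definition might look like, the workflow below runs its steps inside a named container image and tests a directory of notebooks with nbval. The image name and the notebooks/ path are hypothetical, and the sketch assumes that the image already provides Python and pip.

```yaml
# Sketch: run nbval tests inside a pre-built Docker container.
# Assumptions: the image name and notebook path are hypothetical,
# and the image is assumed to include Python and pip.
name: Test notebooks in container

on:
  workflow_dispatch:

jobs:
  nbval-test:
    runs-on: ubuntu-latest
    container:
      image: ghcr.io/example-org/module-environment:latest
    steps:
      - name: Check out the repository
        uses: actions/checkout@v4

      - name: Install the test tools into the container
        run: python -m pip install pytest nbval

      - name: Test notebooks against their saved outputs
        run: pytest --nbval notebooks/
```

Because the steps run inside the container, the notebooks are tested against the packages installed in the image rather than those on the default runner.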

The action report shows each step has completed, as well as how long each step took to complete. We can see in this case that the installation test notebook cells all passed correctly.

If we have a test that fails, but have qualified a step with continue-on-error: true, we can review the failed tests without breaking the flow of the action:

We can further see the step generated an error code, but that was trapped by the continue-on-error: true setting and the action continued:

If we remove the continue-on-error setting, or set continue-on-error: false (the default), the action fails:

If we check the report, we see the failure was blocking and prevented execution of the next step. However, certain other (housekeeping) steps that are defined to always run did execute.
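
The behaviour described above can be sketched as follows. The step names and the notebooks/ path are assumptions; continue-on-error and the if: always() condition are standard workflow settings.

```yaml
# Sketch: a non-blocking test step plus a housekeeping step that
# always runs. Step names and the notebook path are hypothetical.
name: Non-blocking notebook tests

on:
  workflow_dispatch:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: '3.x'

      - run: python -m pip install pytest nbval

      - name: Test notebooks
        # A failing test is reported in the log, but later steps still run.
        # Remove this line (or set it to false) to make failures blocking.
        continue-on-error: true
        run: pytest --nbval notebooks/

      - name: Step that depends on the tests
        # Skipped if an earlier step fails without continue-on-error.
        run: echo "Tests did not block the workflow"

      - name: Housekeeping
        # Runs whether or not earlier steps failed.
        if: always()
        run: echo "Housekeeping complete"
```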

Triggering Actions Based on the Presence of Particular Modified File Types

The dorny/paths-filter GitHub Action “enables conditional execution of workflow steps and jobs, based on the files modified by pull request, on a feature branch, or by the recently pushed commits”.

We can also use the action in association with other jobs, creating the guard in one job and then referencing it in another.
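
One way this might look is sketched below: a first job runs the filter and exposes its result as a job output, and a second job runs only if that output is 'true'. The job names, the filter name (notebooks) and the glob pattern are assumptions for illustration.

```yaml
# Sketch: create a guard in one job and reference it in another.
# Job names, filter name and glob pattern are hypothetical.
name: Guarded notebook job

on:
  push:
  pull_request:

jobs:
  changes:
    runs-on: ubuntu-latest
    outputs:
      # Expose the filter result so that other jobs can read it.
      notebooks: ${{ steps.filter.outputs.notebooks }}
    steps:
      - uses: actions/checkout@v4
      - uses: dorny/paths-filter@v3
        id: filter
        with:
          filters: |
            notebooks:
              - '**/*.ipynb'

  notebook-checks:
    needs: changes
    # Only run when at least one .ipynb file was changed.
    if: ${{ needs.changes.outputs.notebooks == 'true' }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "Notebook files were modified"
```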

Running Commands Against Changed Files

As well as allowing us to create a guard that checks whether files of a particular type have been modified, the dorny/paths-filter action can also pass the names of changed files, allowing us to act on them directly:
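
A sketch of this pattern is given below, assuming a filter named notebooks. With list-files: shell, the action provides a notebooks_files output containing the matching filenames as a shell-escaped, space-separated list. Running nbval over the changed files is an illustrative choice of command.

```yaml
# Sketch: run a command over only the changed notebook files.
# The filter name and the command run over the files are assumptions.
name: Test changed notebooks

on:
  push:
  pull_request:

jobs:
  test-changed:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: dorny/paths-filter@v3
        id: filter
        with:
          list-files: shell
          filters: |
            notebooks:
              - added|modified: '**/*.ipynb'

      - uses: actions/setup-python@v5
        if: ${{ steps.filter.outputs.notebooks == 'true' }}
        with:
          python-version: '3.x'

      - name: Test only the changed notebooks
        if: ${{ steps.filter.outputs.notebooks == 'true' }}
        run: |
          python -m pip install pytest nbval
          pytest --nbval ${{ steps.filter.outputs.notebooks_files }}
```

Restricting the filter to added or modified files avoids passing the names of deleted files to the test command.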

Synchronising Markdown and Notebook Files Using jupytext

An editor making amendments directly to a markdown text document that is intended to be paired with an .ipynb notebook file may well make such changes in an arbitrary text editing environment. Since such an editor is likely to be operating outside the context of a Jupyter server process running the jupytext server extension, it is quite likely that the text and .ipynb versions will be out of sync. (The use of git pre-commit hooks could help keep files in sync at the commit level for editors working under local git version control.)

We can define an action that will attempt to synchronise changed markdown files if they form part of a push or pull request. Specifically, if a markdown file is added or modified, and if it contains jupytext pairing metadata, we can ensure that any other paired documents are synchronised. (If a changed markdown document is not recognised by jupytext as a paired document, it will not be synchronised.)

Using the stefanzweifel/git-auto-commit-action, we can then automatically commit any paired notebook files that are updated as a result of the synchronisation.
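
The following sketch shows how these pieces might fit together for push events. It is an illustration under stated assumptions rather than the original workflow: the filter name and commit message are hypothetical, and pull request handling (which needs extra checkout settings) is left out.

```yaml
# Sketch: synchronise changed markdown files with their paired
# notebooks, then commit any updated files back to the branch.
# Filter name and commit message are hypothetical; push events only.
name: Sync paired notebooks

on:
  push:
    paths:
      - '**.md'

permissions:
  contents: write   # needed so the workflow can push the sync commit

jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: dorny/paths-filter@v3
        id: filter
        with:
          list-files: shell
          filters: |
            markdown:
              - added|modified: '**/*.md'

      - uses: actions/setup-python@v5
        if: ${{ steps.filter.outputs.markdown == 'true' }}
        with:
          python-version: '3.x'

      - name: Synchronise paired documents
        if: ${{ steps.filter.outputs.markdown == 'true' }}
        run: |
          python -m pip install jupytext
          jupytext --sync ${{ steps.filter.outputs.markdown_files }}

      - name: Commit any updated paired files
        uses: stefanzweifel/git-auto-commit-action@v5
        with:
          commit_message: Synchronise paired notebooks with jupytext
```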

Note that as it currently stands, the action will not update notebook output cells. This means that if the content of any code cell in the markdown document is modified, the notebook cell outputs may no longer represent a true output from the modified code cells.

It would be possible to run the jupytext action with an --execute switch, although the step would need to be run in an environment configured to support the execution of the notebook(s).

Creating Releases

When a release is created, the state of the repository at a particular point in time is tagged as a release and referenced via the releases page. This provides a convenient way of publishing a snapshot of the repository contents at a specific version of the files.

Releases can also be annotated with downloadable file bundles that can be accessed from the release note on the releases page.

This provides a convenient way of publishing bundles of files for release to students.

The following action provides an example of how to generate a release, triggered by a push tagged with a release- prefixed tag. The tm351zip command (from the innovationOUtside/nb_workflow_tools Python package) is used to create the downloadable zip file containing a set of specified notebooks with their output cells cleared. We could alternatively create a downloadable bundle where the notebook cells are all freshly run using the -r runWithErrors switch.

Having created the zip file, we save a listing of the zip file contents as a file that we can then use as the body of the release post announcement.

The release itself is published using the third-party softprops/action-gh-release GitHub Action.
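
The overall shape of such a workflow can be sketched as below. Because the tm351zip command line options are not reproduced here, the sketch substitutes the standard zip and unzip commands, so unlike the action described above it does not clear notebook output cells. The notebooks/ path and the filenames are hypothetical.

```yaml
# Sketch: publish a release with a zipped bundle when a tag with a
# "release-" prefix is pushed.
# Assumptions: plain zip/unzip stand in for tm351zip (so outputs are
# NOT cleared); the notebooks/ path and filenames are hypothetical.
name: Release notebook bundle

on:
  push:
    tags:
      - 'release-*'

permissions:
  contents: write   # needed to create the release

jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Create the zip file
        run: zip -r notebooks.zip notebooks

      - name: Save a listing of the zip file contents
        run: unzip -l notebooks.zip > release-note.txt

      - name: Publish the release
        uses: softprops/action-gh-release@v2
        with:
          files: notebooks.zip
          body_path: release-note.txt
```

Pushing a tag such as release-1.0 would then trigger the workflow, with the zip listing used as the body of the release note.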

The bundled files can then be downloaded from the releases page (file listing not included in the following release note):

As well as simply generating a release, we could also run tests prior to the release to check that we are not releasing a broken set of notebooks, as demonstrated in the following action.

The action is triggered using a GitHub release form. The form body is used to specify the paths to files (comma separated, no spaces) that should be included in the release (originally referenced in the action as github.event.release.body).

Note

This action needs improving: currently, we create a release using the release form, and then another release is generated containing the bundled files. Ideally, the tests and the file bundling would be part of the original release creation process, perhaps in the form of a “pre-release” action.

Notebooks along the specified path are tested and then zipped for release. A release note itemising the contents of the zip file is then generated and issued as part of the release.
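
The test and bundle stages might be sketched as follows. This is an illustration only: it assumes the release form body holds a single line of comma separated paths, uses plain zip in place of the bundling tool, and omits the step that publishes the bundle. The filenames are hypothetical.

```yaml
# Sketch: test and bundle the paths named in a release form body.
# Assumptions: the body is a single line of comma separated paths
# (no spaces); zip stands in for the bundling tool; publishing the
# bundle is omitted. Filenames are hypothetical.
name: Test and bundle on release

on:
  release:
    types: [published]

jobs:
  test-and-bundle:
    runs-on: ubuntu-latest
    env:
      RELEASE_PATHS: ${{ github.event.release.body }}
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: '3.x'

      - run: python -m pip install pytest nbval

      - name: Test notebooks on each path
        run: |
          # Strip carriage returns and keep the first line only.
          paths_csv="$(printf '%s' "$RELEASE_PATHS" | tr -d '\r' | head -n 1)"
          IFS=',' read -ra paths <<< "$paths_csv"
          for p in "${paths[@]}"; do
            echo "Testing notebooks in: $p"
            pytest --nbval "$p"
          done

      - name: Bundle the tested paths and list the contents
        run: |
          paths_csv="$(printf '%s' "$RELEASE_PATHS" | tr -d '\r' | head -n 1)"
          IFS=',' read -ra paths <<< "$paths_csv"
          zip -r notebooks.zip "${paths[@]}"
          unzip -l notebooks.zip > release-note.txt
```

Passing the form body through an environment variable, rather than pasting the expression directly into the script, keeps free text from the release form out of the shell command itself.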

Inspection of the action log shows the directories under inspection for the tests and file bundling:

We can also inspect the test outputs (not shown) and a report identifying directories as they are added to the zip file:

The release note displays a listing of the files inside the zip file.

Using GitHub Actions to Process OU-XML Documents

As well as processing notebooks and text documents, we can use GitHub Actions to automate quality checks against OU-XML source documents.

The process makes use of a crude, informal document conversion utility provided by the innovationOUtside/open-ouxml-tools package. The utility attempts to convert OU-XML documents to MyST-flavoured extended markdown, which can then be converted to notebooks using jupytext.

Currently, there is no return conversion path from MyST Markdown to OU-XML.

Warning

A proposal for an official bidirectional OU-XML2Markdown pandoc converter was submitted to the OU Test and Learn workstream but abandoned due to an inability to treat the proposal as a JFDI activity.