This online interactive book reviews some of the wide range of Python packages in particular, but also some R packages, that can be used to create rich and interactive media outputs as part of a single piece generative document publishing system based on Jupyter.
Whilst Python (and R) code may be used to generate the various outputs, the code is not required in order to view the outputs.
In this way, the publishing system may be used to provide an environment in which asset generation is scripted and reproducible, as well as providing “source code” documents that can be updated and modified in order to update original assets or generate further assets derived from or inspired by them.
The publishing system is at its most powerful when scripts are used inline in a document to create document assets (even if the scripts are hidden or removed from the final document). This allows assets to be created in context as part of the execution of the original document. In this way, we can produce publications where there is a very high degree of coupling between each component of the original document. Updating one part of the source code in the document and then reflowing the document will cause all dependent outputs in the document to be updated automatically.
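As a minimal sketch of this coupling, the following cell derives both a table and a sentence from a single data definition. The `rainfall_mm` data and the two helper functions are hypothetical, invented purely for illustration. Editing the data and re-running the document updates both outputs together.

```python
# Hypothetical source data, defined once in the document.
rainfall_mm = {"Jan": 82, "Feb": 61, "Mar": 47}


def as_markdown_table(data):
    """Render a mapping as a two-column Markdown table."""
    rows = ["| Month | Rainfall (mm) |", "| --- | --- |"]
    rows += [f"| {month} | {value} |" for month, value in data.items()]
    return "\n".join(rows)


def summary_sentence(data):
    """Describe the wettest month found in the same data."""
    wettest = max(data, key=data.get)
    return f"The wettest month was {wettest} ({data[wettest]} mm)."


# Both assets depend on the one data definition above.
table = as_markdown_table(rainfall_mm)
summary = summary_sentence(rainfall_mm)
print(table)
print(summary)
```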
Note that there is no need for the originating code to be included in the final rendered published work. If the originating code is retained in the final work, it may be revealed and made explicit to help explain how the asset was created, or to contextualise it, or the code might be hidden but still available in a collapsed view that can be revealed by the user if they so desire. However, it is also quite possible for the asset originating code to be removed entirely from the final rendered page view.
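In Jupyter Book, this choice is typically made with cell tags stored in each cell's metadata. The sketch below builds a tiny notebook structure by hand, using only the standard library, to show where such tags live. The cell contents and the `make_code_cell` helper are illustrative assumptions; the tag names `hide-input` and `remove-input` are the ones described in the Jupyter Book documentation, and should be checked against the version in use.

```python
import json


def make_code_cell(source, tags=()):
    """Return a minimal nbformat 4 style code cell (hypothetical helper)."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {"tags": list(tags)},
        "outputs": [],
        "source": source,
    }


notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        # Code shown as usual alongside its output.
        make_code_cell("print('code shown')"),
        # Code collapsed in the page, revealable by the reader.
        make_code_cell("print('code collapsed')", tags=["hide-input"]),
        # Code removed from the rendered page; only the output remains.
        make_code_cell("print('code removed')", tags=["remove-input"]),
    ],
}

# Serialise as it would be stored in an .ipynb file.
notebook_json = json.dumps(notebook, indent=1)
tags_per_cell = [cell["metadata"]["tags"] for cell in notebook["cells"]]
print(tags_per_cell)
```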
Generating Alternative Output Formats
Not only can the publishing process produce standalone HTML pages: static print style document formats, such as .docx or PDF documents, can also be generated. Note that not all output formats will support interactive outputs.
A wide range of examples of how to achieve particular effects are provided in the pages that follow. However, it is important to note that this is not a statement of how things MUST or even SHOULD be done.
As Norman Bodeck remarked in the publisher’s foreword to the English translation of Taiichi Ohno’s Toyota Production System: Beyond Large-Scale Production:
[F]or many years, [Taiichi Ohno] would not allow anything to be recorded about [the Toyota Production System]. He claimed it was because improvement is never-ending — and by writing it down, the process would become crystallized.
The same is true of the examples herein: they often show how not to do it. In many cases, a block of code provided to demonstrate an example should be further abstracted into a single line callable function so that the next time a similar effect is required, it can be achieved more simply with just one line, or the invocation of a simple bit of (new) magic.
If anyone were to comment that “the code gets in the way” of the presentation, they have missed two foundational points about this book.
First, the code in code cells need not appear in the output anyway - it exists primarily to produce the output.
Second, the code demonstrated is a “live” working out and demonstration of how to achieve a production effect. In the next iteration, the code should be callable via a simpler API or magic that wraps the original working out.
And so on. The code that appears should not have to appear in that form again. If it is useful, it should be wrapped into a package that allows it to be invoked more straightforwardly and more directly.
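The progression from working out to reusable call might look like the following sketch. The bar chart example, the `svg_bars` function name and its parameters are all hypothetical, chosen only to illustrate the pattern rather than to describe any particular package.

```python
# First iteration: the effect is worked out inline, step by step.
values = [3, 7, 5]
bars = []
for i, v in enumerate(values):
    bars.append(
        f'<rect x="{i * 30}" y="{100 - v * 10}" width="20" height="{v * 10}"/>'
    )
svg_inline = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="90" height="100">'
    + "".join(bars)
    + "</svg>"
)


# Next iteration: the same working out wrapped behind a one-line call.
def svg_bars(values, bar_width=20, gap=10, scale=10, height=100):
    """Return a minimal SVG bar chart as a string (hypothetical helper)."""
    step = bar_width + gap
    rects = "".join(
        f'<rect x="{i * step}" y="{height - v * scale}" '
        f'width="{bar_width}" height="{v * scale}"/>'
        for i, v in enumerate(values)
    )
    width = step * len(values)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" '
        f'height="{height}">{rects}</svg>'
    )


svg_wrapped = svg_bars([3, 7, 5])
```

In a notebook, the returned string could be rendered with `IPython.display.SVG(data=svg_wrapped)`. A further step, not shown here, would be to place `svg_bars` in a package or expose it as an IPython magic.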
Single Piece Generative Document Workflow
Many of the ideas suggested herein can be considered in terms of lean manufacturing systems with a few twists. The aim is to provide a self-contained, self-generating, single piece document flow in which a single document contains all the information required to produce an output document embedding a range of rich media outputs generated by the execution of scripts contained within the source document. (It is also possible for execution of the source document to generate media assets in a separate file that are then embedded back in the document. The downside then is that you have multiple objects rather than a single piece.)
Support for Arbitrary Computer Languages
Whilst the notebook format is fixed, the programming language used for the code that is executed from within a code cell is a matter of choice. Any of the languages supported by an off-the-shelf Jupyter kernel can be written within a notebook associated with that kernel.
If the appropriate kernel is not installed and available to a Jupyter server that executes the notebooks as the Jupyter Book is produced, code outputs can still be made available in the final publication by pre-running the notebooks in an environment where the kernel is available and then publishing the notebooks with cells run, rather than running them as part of the publishing process.
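When publishing pre-run notebooks in this way, it can help to confirm before the build that every code cell has actually been run. The following is a minimal sketch, assuming the standard notebook JSON layout; the `unexecuted_cells` helper and the example notebook fragment are hypothetical.

```python
def unexecuted_cells(notebook):
    """Return indices of non-empty code cells with no recorded execution count."""
    missing = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format allows source as a string or a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        if text.strip() and cell.get("execution_count") is None:
            missing.append(index)
    return missing


# Illustrative notebook fragment: one code cell run, one not yet run.
example = {
    "cells": [
        {"cell_type": "markdown", "source": "# Title"},
        {"cell_type": "code", "source": "1 + 1",
         "execution_count": 1, "outputs": []},
        {"cell_type": "code", "source": ["x = 2\n", "x"],
         "execution_count": None, "outputs": []},
    ]
}

pending = unexecuted_cells(example)
print(pending)
```

A real notebook file could be loaded with `json.load` and passed to the same function before the book is built with execution switched off.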
How This Online Book Was Produced
This book was generated from a set of Jupyter notebooks (available at https://github.com/OpenComputingLab/SubjectMatterNotebooks/), executed and converted to HTML using Jupyter Book. To minimise unnecessary re-execution of code cells, the build step uses jupyter-cache to cache cell outputs.
The HTML book site is published using GitHub Pages from the gh-pages branch of the source repository. A manually executed GitHub Action copies the HTML files from the main repository src/_build/html/ directory into the