Regression tests and unlatex
- Date: 2 March 2023
- Time: 6:30 to 7:30pm (UK Time)
Jonathan Fine
Regression tests are a good thing. They check that software still performs as expected after a change, and they are a very important part of test automation. This TeX Hour is about improved regression tests.
And when there are enough automated tests, it becomes much easier to refactor style files to provide improved implementation and additional functionality, and to do this without unknowingly changing the typesetting of existing documents.
For more information about the TeX Hour, including Zoom URL, see the About page.
By chance, today Ulrike Fischer of the LaTeX Project reported that an
l3build test broke suddenly. This was because something appeared in
a different location in the
pdftex log file. The development of the
tagged PDF functionality of LaTeX will benefit from more detailed and
less noisy regression tests.
TeX produces identical outputs from identical inputs. One of TeX’s
outputs is a dvi file, or for PDFTeX a pdf file. But there is another
possible output, which is much better for regression testing.
TeX and PDFTeX produce their output file by applying the
\shipout primitive to a \box (such as a typeset page). With just a
little work, we can instead use \showbox to produce a text
representation of the \box.
The text representation is not useful for human readers, but it is much better for regression tests. It provides both more detail about changes and less noise about irrelevant changes.
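As a minimal sketch of why a text representation suits regression testing, the Python below compares two box dumps line by line using the standard difflib module. The function name diff_showbox and the two sample dumps are hypothetical: the dumps are hand-written in the general style of \showbox output, not taken from a real TeX run.

```python
import difflib


def diff_showbox(expected: str, actual: str) -> list[str]:
    """Return unified-diff lines between two box dumps (empty if identical)."""
    return list(
        difflib.unified_diff(
            expected.splitlines(),
            actual.splitlines(),
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
    )


# Hand-written, illustrative dumps in the style of \showbox output
# (assumption: not produced by an actual TeX run).
OLD = r"""\hbox(6.83331+0.0)x18.6667
.\tenrm T
.\kern-1.66702
.\tenrm e
"""
# Simulate a refactoring that silently changed one kern.
NEW = OLD.replace(r"\kern-1.66702", r"\kern-1.5")

if __name__ == "__main__":
    for line in diff_showbox(OLD, NEW):
        print(line)
```

An empty result means the typesetting is unchanged. A non-empty result points at the exact node that moved, which a byte comparison of two dvi or pdf files would not do.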
What is unlatex?
Here’s how I see large-scale refactoring of LaTeX. Because LaTeX is so mature, there are many existing documents. So there are many possibilities for regression tests.
So this is what we can do. Create a large inventory of \showbox
outputs. The task then is to reproduce these \showbox outputs. In
other words, produce TeX inputs that will produce exactly these
outputs.
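One way such an inventory might be kept is sketched below, assuming each document's box dump is already available as a string. A fingerprint is stored per document, and after a refactoring the documents whose dumps no longer match are reported. The names fingerprint, build_inventory and changed_documents are hypothetical, and stripping trailing whitespace is an illustrative choice, not something the TeX Hour prescribes.

```python
import hashlib


def fingerprint(dump: str) -> str:
    """SHA-256 of a box dump, ignoring trailing whitespace on each line."""
    normalised = "\n".join(line.rstrip() for line in dump.splitlines())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def build_inventory(dumps: dict[str, str]) -> dict[str, str]:
    """Map each document name to the fingerprint of its baseline dump."""
    return {name: fingerprint(text) for name, text in dumps.items()}


def changed_documents(inventory: dict[str, str], new_dumps: dict[str, str]) -> list[str]:
    """Names whose new dump no longer matches the stored fingerprint."""
    return sorted(
        name
        for name, text in new_dumps.items()
        if inventory.get(name) != fingerprint(text)
    )


if __name__ == "__main__":
    # Illustrative dumps only (assumption: hand-written, not real TeX output).
    baseline = {"a": r"\hbox(1.0+0.0)x2.0", "b": r"\vbox(3.0+0.0)x4.0"}
    inventory = build_inventory(baseline)
    after = dict(baseline, b=r"\vbox(3.5+0.0)x4.0")
    print(changed_documents(inventory, after))
```

Storing only fingerprints keeps the inventory small. Keeping the full dumps as well would allow a line-by-line comparison for any document reported as changed.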
Experience will show how useful this is. The success of Lwarp gives
me optimism. It produces quite good HTML from LaTeX, going via
dvi. Going via \showbox will give more information than via dvi,
and so further opportunities to recover the syntax of the original
document.
In short, perhaps with
unlatex we can both extensively refactor
LaTeX and convert LaTeX source documents to HTML/XML.
- Regression testing (wikipedia)
- Ulrike Fischer (texlive pretest)
- The Lwarp converter (Brian Dunn)
- unlatex - Rust binding to unified-latex
- unified-latex: a JS LaTeX AST package (github)