Unlatex: results and prospects


Jonathan Fine

latex is a program that compiles a source document to a PDF document. I’m writing unlatex as a program that goes in the other direction. I’m doing this with a view to

Well, next week, on Monday 17 April 1:00-5:00pm Eastern Time, we have the very first arXiv Access Forum. I’m looking forward to that. I’m most grateful to the organisers, and I hope they’re not overwhelmed. They have a massive responsibility. As do the esteemed presenters and panelists, and the many participants.

This TeX Hour is about my emerging unlatex tool for reprocessing TeX documents, to provide more accessible outputs. Here are some arXiv stats (in round numbers):

Why seconds in a month? Well, that figure (about 2.6 million for a 30-day month) is approximately equal to the total number of submissions. So we can make a Fermi estimate as to how long it will take to reprocess the entire arXiv to get accessible outputs (assuming suitable software).

Suppose we have a desktop PC with 12 cores, so 24 threads, so about 20 threads doing useful work. With 20 threads each finishing one item every 20 seconds, the machine completes about one item per second. So on such a machine, if not bottlenecked, we could do the whole lot in a month provided each item takes only 20 seconds. The download might take a while, and the electricity would be about £150 (or $150).
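Here is the same arithmetic as a small Python sketch, so the inputs can be varied. It is only a back-of-envelope calculation under the assumptions above: the item count is taken to equal the number of seconds in a 30-day month, and the function and parameter names are made up for this illustration rather than taken from unlatex.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_in_month(days=30):
    """Seconds in a month of the given length (30 days is an assumption)."""
    return days * SECONDS_PER_DAY


def months_to_reprocess(items, seconds_per_item, workers, days_per_month=30):
    """Fermi estimate of wall-clock months to process every item.

    Assumes the workers run in parallel with no bottleneck (disk, memory,
    download), which is the optimistic case described in the text.
    """
    total_cpu_seconds = items * seconds_per_item
    wall_clock_seconds = total_cpu_seconds / workers
    return wall_clock_seconds / seconds_in_month(days_per_month)


if __name__ == "__main__":
    # Assumption from the text: submissions roughly equal seconds in a month.
    items = seconds_in_month()
    # 20 useful threads, 20 seconds per item.
    print(months_to_reprocess(items, seconds_per_item=20, workers=20))  # 1.0
```

Doubling the time per item to 40 seconds doubles the estimate to two months, and halving the number of useful threads has the same effect.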

Harder is to make a Fermi estimate for creating suitable software, and harder still is writing and testing the software and its (we hope) accessible outputs.

