Why This Edition?
TL;DR: Most primary sources never get published. The grant-funded model cannot fix this. A minimal, AI-assisted, independently maintained digital edition can. That is what this critical edition sets out to demonstrate.
Cavriana's letters have real historical value. But that alone does not explain why they are published here, in this form, using this method. Plenty of valuable sources never get published at all. This edition exists as much for all those sources still in the drawer as for the letters themselves.
Sources That Stay in the Drawer
Applying for grants to fund a critical edition takes months and rarely succeeds. Academic publishers want broad appeal, not specialist letter collections in sixteenth-century Italian. And there is a subtler obstacle: the assumption, common among colleagues, that edited primary sources are less prestigious than critical monographs. Transcription is treated as labour, not scholarship. The result is that enormous quantities of historical evidence remain effectively inaccessible: present in the archive, absent from the historiography.
This matters more now than it did a generation ago. Unpublished sources have always been invisible to scholarship; the consequences now extend further. The large language models reshaping knowledge work are trained on digitised text, and earlier periods, non-English sources, and non-canonical traditions are vastly underrepresented in that corpus. Carefully encoded historical documents are precisely the kind of high-quality, human-produced data that is increasingly in demand, well beyond language models alone. If the historical record is not digitised at a pace that keeps up with the rest of our digital lives, we will find ourselves with a very shallow past.
The Grant Trap
The conventional response is to apply for funding. But this model cannot scale to the problem. It concentrates resources on projects already visible enough to attract committees, privileges well-known figures and well-resourced institutions, and creates a perverse incentive: digital humanities has become an instrument for bringing money into departments, with the digital component routinely outsourced to companies that overcharge for bespoke infrastructure nobody on the research team can maintain once the grant ends.
This is an expensive way to fail. The money spent on unmaintainable custom platforms could have funded years of archival research. It also makes digital editing the exclusive domain of wealthy institutions, while the majority of researchers on this planet work with very limited budgets. We cannot simultaneously complain about the underfunding of the humanities and waste what funding we have on infrastructure that serves no one beyond the grant period.
A Different Model
Alex Gil and others associated with the Global Outlook::Digital Humanities group have argued for minimal computing: building with the least complexity necessary. The argument is sometimes framed in moral terms, but the underlying logic is economic. Complexity has a cost, and that cost falls unevenly. A static website built on open standards requires no server, no database, and no contractor. It costs nothing to host and next to nothing to maintain. It can be replicated by a researcher anywhere, with any budget. That is the only model capable of genuinely democratising the digital editing of primary sources.
This project follows that model deliberately. I did not apply for grants. I opened a manuscript image and a text editor. Letters are encoded in TEI and published incrementally, following the logic of modern software development: release early, iterate in public, invite correction. A staged edition is honest about its incompleteness. It makes work available years before a conventional edition would appear, and it treats each published letter as a standing invitation for others to identify errors, suggest readings, or challenge editorial decisions.
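Concretely, the workflow starts from files like the following. This is a schematic skeleton, not a transcription from this edition: the element choices follow common TEI practice for correspondence and stand in for the project's actual encoding conventions, and the comments mark where real content would go.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title><!-- letter identifier --></title></titleStmt>
      <publicationStmt><p>Published incrementally as part of the edition.</p></publicationStmt>
      <sourceDesc><p><!-- archive, shelfmark, folios --></p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="letter">
        <opener><dateline><!-- place and date --></dateline></opener>
        <p><!-- transcribed text of the letter --></p>
        <closer><signed><!-- signature --></signed></closer>
      </div>
    </body>
  </text>
</TEI>
```

Because each letter is a self-contained file like this, releasing one letter at a time is trivial: publishing early means committing a new file, not rebuilding a platform.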
AI as Collaborator, Not Replacement
One further development has made independent projects like this more feasible. AI assistance has substantially reduced the cost of routine maintenance: normalising metadata, checking consistency across files, and flagging anomalies — tasks that would otherwise consume time better spent on archival and editorial work.
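A sketch of what "checking consistency across files" means in practice. Everything here is illustrative rather than this edition's actual tooling: the sample letter, its date string, and the rule being enforced (every date in a `correspAction` should carry a machine-readable `@when` attribute) are hypothetical but typical of the checks involved.

```python
# Illustrative sketch, not this edition's actual tooling: flag TEI dates
# inside <correspAction> that lack a machine-readable @when attribute.
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"

def missing_when(tei_xml: str) -> list[str]:
    """Return the text of every correspAction date lacking @when."""
    root = ET.fromstring(tei_xml)
    flagged = []
    for action in root.iter(f"{TEI_NS}correspAction"):
        for date in action.iter(f"{TEI_NS}date"):
            if "when" not in date.attrib:
                flagged.append((date.text or "").strip())
    return flagged

# Hypothetical sample: one date normalised, one still only as written.
sample = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><profileDesc><correspDesc>
    <correspAction type="sent"><date>adi 12 di marzo 1570</date></correspAction>
    <correspAction type="received"><date when="1570-04-02"/></correspAction>
  </correspDesc></profileDesc></teiHeader>
</TEI>"""

print(missing_when(sample))  # → ['adi 12 di marzo 1570']
```

Run across a whole directory of letters, a check like this takes seconds; writing and supervising such checks is exactly the kind of routine work that AI assistance makes cheap.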
But this introduces a genuine risk. Language models hallucinate. In a scholarly context, a plausible-looking but incorrect transcription or encoding is worse than a gap, because it is harder to detect. The problem is not that AI is unreliable in general; it is that reliability in textual editing demands specific, verifiable constraints that general-purpose models do not have.
Part of this edition's purpose is to test whether AI tools can be made reliable enough for scholarly encoding. To that end, I have developed MCP-TEI, a protocol that connects AI tools directly to the TEI guidelines and to this project's own encoding documentation. Rather than leaving models to infer correct practice from general training data, the protocol ensures that every encoding decision can be checked against the relevant specification. The results are already tangible: the model becomes a fast but supervised assistant, not an autonomous editor, and the letters published here were encoded with its help.
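The principle behind MCP-TEI can be reduced to a very small sketch. The names and data below are hypothetical stand-ins (the real protocol queries the TEI guidelines and this project's documentation, not a hard-coded dictionary); the point is the shape of the constraint: an element the model proposes is either found in the specification or rejected, never silently invented.

```python
# Hypothetical stand-in for a spec lookup; the real MCP-TEI protocol
# queries the TEI guidelines themselves, not a local dictionary.
GUIDELINES = {
    "opener": "Groups dateline, salute, and similar material at the start of a letter.",
    "closer": "Groups salute, dateline, and signed material at the end of a letter.",
}

def check_element(name: str) -> str:
    """Return the spec entry for an element, or refuse an unknown one."""
    if name not in GUIDELINES:
        raise ValueError(f"<{name}> not found in the guidelines; do not encode it.")
    return GUIDELINES[name]

print(check_element("opener"))  # a known element returns its spec entry
```

A model wired to a check like this can still propose encodings quickly, but a hallucinated element name fails loudly instead of slipping into the edition.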
For those who want to do something similar, I am publishing two tutorials in The Programming Historian on precisely this workflow: how to build a minimal digital critical edition, and how to integrate AI assistance without sacrificing rigour. The argument of this edition is partly made in the encoding itself, but also by the fact that it is possible at all.