Journal article

Restoring ancient text using deep learning: a case study on Greek epigraphy

Abstract:
Ancient History relies on disciplines such as Epigraphy, the study of ancient inscribed texts, for evidence of the recorded past. However, these texts, “inscriptions”, are often damaged over the centuries, and illegible parts of the text must be restored by specialists, known as epigraphists. This work presents Pythia, the first ancient text restoration model that recovers missing characters from a damaged text input using deep neural networks. Its architecture is carefully designed to handle long-term context information, and deal efficiently with missing or corrupted character and word representations. To train it, we wrote a non-trivial pipeline to convert PHI, the largest digital corpus of ancient Greek inscriptions, to machine-actionable text, which we call PHI-ML. On PHI-ML, Pythia’s predictions achieve a 30.1% character error rate, compared to the 57.3% of human epigraphists. Moreover, in 73.5% of cases the ground-truth sequence was among the Top-20 hypotheses of Pythia, which effectively demonstrates the impact of this assistive method on the field of digital epigraphy, and sets the state-of-the-art in ancient text restoration.
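The two figures quoted in the abstract, character error rate and Top-20 accuracy, can be illustrated with a minimal sketch. This is not the authors' evaluation code: it assumes that character error rate is the Levenshtein edit distance between the predicted and ground-truth strings divided by the ground-truth length, and that a Top-k hit means the ground truth appears among the first k ranked hypotheses. The function names `edit_distance`, `character_error_rate`, and `top_k_hit` are hypothetical.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (unit-cost edits)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(prediction: str, truth: str) -> float:
    """Assumed definition: edit distance normalised by ground-truth length."""
    if not truth:
        raise ValueError("ground-truth string must be non-empty")
    return edit_distance(prediction, truth) / len(truth)


def top_k_hit(hypotheses: Sequence[str], truth: str, k: int = 20) -> bool:
    """True if the ground truth is among the first k ranked hypotheses."""
    return truth in hypotheses[:k]
```

Averaging `character_error_rate` over a test set, and taking the fraction of test cases for which `top_k_hit` is true, would give corpus-level figures of the kind the abstract reports, under the assumptions stated above.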
Publication status:
Published
Peer review status:
Peer reviewed

Files:
Publisher copy:
10.18653/v1/D19-1668

Authors


Division:
Humanities Division
Department:
Classics
Sub department:
Ancient History and Classical Archaeology
Oxford college:
Wolfson College
Role:
Author
Institution:
University of Oxford
Division:
HUMS
Department:
Classics Faculty
Sub department:
Ancient History & Classical Arch
Oxford college:
Merton College
Role:
Author
ORCID:
0000-0003-3819-8537


Pubs id:
pubs:1063713
UUID:
uuid:6b344f53-8bcf-40bb-91ca-2e1a08f89a88
Local pid:
pubs:1063713
Source identifiers:
1063713
Deposit date:
2019-11-21
