Dataset Augmentation in Papyrology with Generative Models: A Study of Synthetic Ancient Greek Character Images

Matthew I. Swindall, Timothy Player, Ben Keener, Alex C. Williams, James H. Brusuelas, Federica Nicolardi, Marzia D'Angelo, Claudio Vergara, Michael McOsker, John F. Wallin

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

Abstract

Character recognition models rely substantially on image datasets that maintain a balance of class samples. However, achieving a balance of classes is particularly challenging in ancient manuscript contexts, where character instances may be significantly limited. In this paper, we present findings from a study that assesses the efficacy of using synthetically generated character instances to augment an existing dataset of ancient Greek character images for use in machine learning models. We complement our model exploration by engaging professional papyrologists to better understand the practical opportunities afforded by synthetic instances. Our results suggest that synthetic instances improve model performance for limited character classes and may have unexplored effects on character classes more generally. We also find that trained papyrologists are unable to distinguish between synthetic and non-synthetic images and regard synthetic instances as valuable assets for professional and educational contexts. We conclude by discussing the practical implications of our research.

Original language: English
Title of host publication: Proceedings of the 31st International Joint Conference on Artificial Intelligence, IJCAI 2022
Editors: Luc De Raedt
Pages: 4973-4979
Number of pages: 7
ISBN (Electronic): 9781956792003
State: Published - 2022
Event: 31st International Joint Conference on Artificial Intelligence, IJCAI 2022 - Vienna, Austria
Duration: Jul 23 2022 to Jul 29 2022

Publication series

Name: IJCAI International Joint Conference on Artificial Intelligence
ISSN (Print): 1045-0823

Conference

Conference: 31st International Joint Conference on Artificial Intelligence, IJCAI 2022
Country/Territory: Austria
City: Vienna
Period: 7/23/22 to 7/29/22

Bibliographical note

Publisher Copyright:
© 2022 International Joint Conferences on Artificial Intelligence. All rights reserved.

ASJC Scopus subject areas

  • Artificial Intelligence
