Abstract
Character recognition models rely substantially on image datasets that maintain a balance of class samples. However, achieving a balance of classes is particularly challenging for ancient manuscript contexts as character instances may be significantly limited. In this paper, we present findings from a study that assesses the efficacy of using synthetically generated character instances to augment an existing dataset of ancient Greek character images for use in machine learning models. We complement our model exploration by engaging professional papyrologists to better understand the practical opportunities afforded by synthetic instances. Our results suggest that synthetic instances improve model performance for limited character classes, and may have unexplored effects on character classes more generally. We also find that trained papyrologists are unable to distinguish between synthetic and non-synthetic images and regard synthetic instances as valuable assets for professional and educational contexts. We conclude by discussing the practical implications of our research.
Original language | English |
---|---|
Title of host publication | Proceedings of the 31st International Joint Conference on Artificial Intelligence, IJCAI 2022 |
Editors | Luc De Raedt |
Pages | 4973-4979 |
Number of pages | 7 |
ISBN (Electronic) | 9781956792003 |
State | Published - 2022 |
Event | 31st International Joint Conference on Artificial Intelligence, IJCAI 2022 - Vienna, Austria Duration: Jul 23 2022 → Jul 29 2022 |
Publication series
Name | IJCAI International Joint Conference on Artificial Intelligence |
---|---|
ISSN (Print) | 1045-0823 |
Conference
Conference | 31st International Joint Conference on Artificial Intelligence, IJCAI 2022 |
---|---|
Country/Territory | Austria |
City | Vienna |
Period | 7/23/22 → 7/29/22 |
Bibliographical note
Publisher Copyright: © 2022 International Joint Conferences on Artificial Intelligence. All rights reserved.
ASJC Scopus subject areas
- Artificial Intelligence