Abstract
In this paper, we explore how multi-modal video representations can be applied in an end-to-end fashion for automatically generating game commentary from Let's Play videos using deep learning. We introduce a comprehensive pipeline that takes videos directly from YouTube and then uses a sequence-to-sequence strategy to learn to generate appropriate commentary. We evaluate our framework on Let's Play commentaries for the game Getting Over It with Bennett Foddy. To assess commentary quality, we measure the perplexity of our language models under different input video representations, highlighting aspects of gameplay that might influence commentary.
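The perplexity metric mentioned in the abstract can be sketched as follows; this is a minimal, generic illustration (not the paper's implementation), assuming token-level log-probabilities from a language model are already available:

```python
import math

def perplexity(token_log_probs):
    """Perplexity is the exponential of the average negative
    log-likelihood the model assigns to the observed tokens.
    Lower perplexity indicates the model finds the text less surprising."""
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)

# Example: a model that assigns probability 0.25 to each of four tokens
# behaves like a uniform distribution over four outcomes.
log_probs = [math.log(0.25)] * 4
print(perplexity(log_probs))  # 4.0
```

In a sequence-to-sequence setting, the log-probabilities would come from the decoder's softmax outputs over the ground-truth commentary tokens, conditioned on the chosen video representation.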
| | |
|---|---|
| Original language | English |
| Title of host publication | Proceedings of the 14th International Conference on the Foundations of Digital Games, FDG 2019 |
| Editors | Foaad Khosmood, Johanna Pirker, Thomas Apperley, Sebastian Deterding |
| ISBN (Electronic) | 9781450372176 |
| State | Published - Aug 26 2019 |
| Event | 14th International Conference on the Foundations of Digital Games, FDG 2019 - San Luis Obispo, United States (Aug 26 2019 → Aug 30 2019) |
Publication series
| | |
|---|---|
| Name | ACM International Conference Proceeding Series |
Conference
| | |
|---|---|
| Conference | 14th International Conference on the Foundations of Digital Games, FDG 2019 |
| Country/Territory | United States |
| City | San Luis Obispo |
| Period | 8/26/19 → 8/30/19 |
Bibliographical note
Publisher Copyright: © 2019 ACM.
Keywords
- Commentary generation
- Multi-modality
- Sequence to sequence
ASJC Scopus subject areas
- Software
- Human-Computer Interaction
- Computer Vision and Pattern Recognition
- Computer Networks and Communications