Using Combinatorial Multi-Armed Bandits to Dynamically Update Player Models in an Experience Managed Environment

Anton Vinogradov, Brent Harrison

Research output: Contribution to journal › Conference article › peer-review

Abstract

Designers often treat players as having static play styles, but prior work has shown this is not always the case. This is not an issue for games that create a relatively static experience, but it can cause problems for games that attempt to model the player and adapt to better suit them, such as those with Experience Managers (ExpMs). When an ExpM makes changes to the world, it necessarily biases the game environment toward what it believes the player wants. This process limits the observations the ExpM can make and leads to problems when a player suddenly shifts their preferences, leaving an outdated player model that can be slow to recover. Previous work has shown that such a preference shift can be detected and that the Multi-Armed Bandit (MAB) framework can be used to recover the player model, but that method was limited in how much information it could gather about the player. In this paper, we offer an improved method for recovering a player model after a preference shift is detected by using Combinatorial MABs (CMABs). To evaluate these claims, we test our method on artificial agents in a text-based game environment and find that CMABs yield a significant gain in how well the model can be recovered. We also validate that our artificial agents perform similarly to humans by testing the same task on human subjects.
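To make the setting concrete, below is a minimal sketch of a combinatorial bandit of the kind the abstract describes, using a CUCB-style confidence-bound update. The record does not specify the paper's actual algorithm, arm definitions, or reward signal, so the CUCB class, the top-k selection oracle, and the simulated preference-shift loop are illustrative assumptions, not the authors' implementation. Base arms stand in for content categories an ExpM might offer; each round it selects a super-arm of k categories, observes per-arm (semi-bandit) feedback, and updates its estimate of the player's preferences.

import math
import random

class CUCB:
    """Illustrative CUCB-style combinatorial bandit (not the paper's code)."""

    def __init__(self, n_arms, k):
        self.n_arms = n_arms          # number of content categories (base arms)
        self.k = k                    # size of each offered super-arm
        self.counts = [0] * n_arms    # times each base arm has been played
        self.means = [0.0] * n_arms   # empirical mean reward per base arm
        self.t = 0                    # round counter

    def select(self):
        self.t += 1
        # Play untried arms first so every estimate gets initialized;
        # this may return fewer than k arms for the first few rounds.
        untried = [a for a in range(self.n_arms) if self.counts[a] == 0]
        if untried:
            return untried[: self.k]
        # Upper confidence bound per base arm, then take the top k.
        # For additive rewards over disjoint arms, top-k is an exact oracle.
        ucb = [
            self.means[a] + math.sqrt(1.5 * math.log(self.t) / self.counts[a])
            for a in range(self.n_arms)
        ]
        return sorted(range(self.n_arms), key=lambda a: ucb[a], reverse=True)[: self.k]

    def update(self, super_arm, rewards):
        # Semi-bandit feedback: one observed reward per played base arm.
        for arm, r in zip(super_arm, rewards):
            self.counts[arm] += 1
            self.means[arm] += (r - self.means[arm]) / self.counts[arm]

# Hypothetical usage: a simulated player whose preferences shift mid-run.
true_prefs = [0.9, 0.1, 0.2, 0.8, 0.3]
bandit = CUCB(n_arms=5, k=2)
for step in range(500):
    if step == 250:                   # sudden preference shift
        true_prefs = [0.1, 0.9, 0.8, 0.2, 0.3]
    offered = bandit.select()
    feedback = [1.0 if random.random() < true_prefs[a] else 0.0 for a in offered]
    bandit.update(offered, feedback)

A stationary CUCB like this one adapts to the shift only slowly, since its confidence intervals have already contracted; in a shift-detection setting such as the one the abstract describes, one would typically reset or discount the counts once a shift is detected so the bounds widen and exploration resumes. The sketch omits that step for brevity.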

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 3847
State: Published - 2024
Event: 2024 AIIDE Workshop on Intelligent Narrative Technologies, INT 2024 - Lexington, United States
Duration: Nov 18 2024 → …

Bibliographical note

Publisher Copyright:
© 2024 Copyright for this paper by its authors.

ASJC Scopus subject areas

  • General Computer Science
