Abstract
In many domains, there exist multiple ways for an agent to achieve optimal performance. Feedback may be provided along one or more of them to aid learning. In this work, we investigate whether humans prefer to provide feedback along one optimal policy over another in two gridworld domains. We find that for the domain with significant risk to exploration, 60% of our participants prefer to discourage the agent's exploration along the risky portion of the state space, while 40% state that they have no preference. We also use the interactive reinforcement learning algorithm Policy Shaping to evaluate the performance of simulated oracles with a number of feedback strategies. We find that certain domain traits, such as risk during exploration and the number of optimal policies, play an important role in determining the best-performing feedback strategy.
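For readers unfamiliar with Policy Shaping, the following is a minimal sketch of how critique is commonly combined with a learned policy. It is not taken from this paper. It assumes the formulation usually attributed to the original Policy Shaping work, in which the probability that an action is optimal is C^Δ / (C^Δ + (1 − C)^Δ), where Δ is the number of positive minus negative feedback signals for that action and C is the assumed feedback consistency. The function names, the default `consistency` value, and the Boltzmann (softmax) policy over Q-values are illustrative assumptions; the experiments described above may use a different base learner and different parameters.

```python
import math


def feedback_policy(delta, consistency=0.8):
    """Probability that an action is optimal given net feedback `delta`.

    Assumes 0 < consistency < 1 (hypothetical default of 0.8).
    """
    c = consistency
    return c ** delta / (c ** delta + (1 - c) ** delta)


def shape_policy(q_values, deltas, consistency=0.8, temperature=1.0):
    """Combine a softmax policy over Q-values with the feedback policy.

    The softmax base policy is an assumption made for this sketch.
    The two distributions are multiplied per action and renormalized.
    """
    # Softmax over Q-values, shifted by the maximum for numerical stability.
    q_max = max(q_values)
    exp_q = [math.exp((q - q_max) / temperature) for q in q_values]
    z = sum(exp_q)
    pi_q = [e / z for e in exp_q]

    # Per-action probability of optimality implied by the feedback counts.
    pi_f = [feedback_policy(d, consistency) for d in deltas]

    # Multiply the two policies and renormalize.
    combined = [a * b for a, b in zip(pi_q, pi_f)]
    total = sum(combined)
    return [p / total for p in combined]
```

With equal Q-values, the feedback term alone decides the outcome: `shape_policy([0.0, 0.0], [1, -1])` favors the first action, which received net positive feedback. A simulated oracle with a particular feedback strategy would correspond to a rule for which state-action pairs have their `deltas` updated.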
Original language | English |
---|---|
Title of host publication | AAMAS 2016 - Proceedings of the 2016 International Conference on Autonomous Agents and Multiagent Systems |
Pages | 1455-1456 |
Number of pages | 2 |
ISBN (Electronic) | 9781450342391 |
State | Published - 2016 |
Event | 15th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2016 - Singapore, Singapore. Duration: May 9 2016 → May 13 2016 |
Publication series
Name | Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS |
---|---|
ISSN (Print) | 1548-8403 |
ISSN (Electronic) | 1558-2914 |
Conference
Conference | 15th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2016 |
---|---|
Country/Territory | Singapore |
City | Singapore |
Period | 5/9/16 → 5/13/16 |
Bibliographical note
Publisher Copyright: Copyright © 2016, International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org). All rights reserved.
Keywords
- Interactive machine learning
- Learning from critique
- Policy shaping
- Reinforcement learning
ASJC Scopus subject areas
- Artificial Intelligence
- Software
- Control and Systems Engineering