
Policy shaping in domains with multiple optimal policies

  • Himanshu Sahni
  • Brent Harrison
  • Kaushik Subramanian
  • Thomas Cederborg
  • Charles Isbell
  • Andrea Thomaz

Research output: Conference contribution, peer-reviewed

5 Citations (Scopus)

Abstract

In many domains, there exist multiple ways for an agent to achieve optimal performance. Feedback may be provided along one or more of them to aid learning. In this work, we investigate whether humans have a preference towards providing feedback along one optimal policy over the other in two gridworld domains. We find that for the domain with significant risk to exploration, 60% of our participants prefer to discourage the agent's exploration along the risky portion of the state space, while 40% state that they have no preference. We also use the interactive reinforcement learning algorithm Policy Shaping to evaluate the performance of simulated oracles with a number of feedback strategies. We find that certain domain traits, such as risk during exploration and the number of optimal policies, play an important role in determining the best-performing feedback strategy.

Original language: English
Title of host publication: AAMAS 2016 - Proceedings of the 2016 International Conference on Autonomous Agents and Multiagent Systems
Pages: 1455-1456
Number of pages: 2
ISBN (electronic): 9781450342391
Status: Published - 2016
Event: 15th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2016 - Singapore, Singapore
Duration: May 9, 2016 to May 13, 2016

Publication series

Name: Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
ISSN (print): 1548-8403
ISSN (electronic): 1558-2914

Conference

Conference: 15th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2016
Country/Territory: Singapore
City: Singapore
Period: 5/9/16 to 5/13/16

Bibliographical note

Publisher Copyright:
Copyright © 2016, International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org). All rights reserved.

Funding

This work was funded under ONR grant number N000141410003.

Funder: Office of Naval Research
Funder number: N000141410003

    ASJC Scopus subject areas

    • Artificial Intelligence
    • Software
    • Control and Systems Engineering
