Influencing Reinforcement Learning through Natural Language Guidance

Tasmia Tasrin, Md Sultan Al Nahian, Habarakadage Perera, Brent Harrison

Research output: Contribution to journal; Conference article; peer-review

Abstract

Interactive reinforcement learning (IRL) agents use human feedback or instruction to help them learn in complex environments. Often, this feedback comes in the form of a discrete signal that’s either positive or negative. While informative, this signal can be difficult to generalize on its own. In this work, we explore how natural language advice can be used to provide a richer feedback signal to a reinforcement learning agent by extending policy shaping, a well-known IRL technique. Policy shaping usually employs a human feedback policy to help an agent learn more about how to achieve its goal. In our case, we replace this human feedback policy with a policy generated from natural language advice. We aim to examine whether the generated natural language reasoning helps a deep RL agent choose its actions successfully in any given environment. To that end, we design our model with three networks: an experience-driven network, an advice generator, and an advice-driven network. The experience-driven RL agent chooses its actions under the influence of the environmental reward, while the advice-driven network, using the feedback that the advice generator produces for any new state, selects actions that assist the RL agent in shaping a better policy.
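
As a rough illustration of the idea, the sketch below shows one common way policy shaping merges two action distributions: multiply them elementwise and renormalize. The abstract does not state how the experience-driven and advice-driven outputs are actually combined, so this rule, the function names (softmax, shape_policy), and all numbers are assumptions made for illustration only, not the authors' implementation.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw action scores (e.g. Q-values) into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def shape_policy(experience_probs: Sequence[float],
                 advice_probs: Sequence[float]) -> List[float]:
    """Combine two action distributions by elementwise product, then renormalize.

    Assumed combination rule (hypothetical for this paper): actions that both
    sources favor gain probability, actions that either source dislikes lose it.
    """
    if len(experience_probs) != len(advice_probs):
        raise ValueError("both policies must cover the same action set")
    product = [p * q for p, q in zip(experience_probs, advice_probs)]
    total = sum(product)
    if total == 0.0:
        # The two sources fully disagree; fall back to the experience-driven policy.
        return list(experience_probs)
    return [x / total for x in product]


# Made-up outputs for a single state with three actions.
experience_q = [1.0, 1.2, 0.2]   # stand-in for the experience-driven network
advice_probs = [0.7, 0.2, 0.1]   # stand-in for the advice-driven network

experience_probs = softmax(experience_q)
combined = shape_policy(experience_probs, advice_probs)

experience_choice = max(range(len(experience_probs)), key=experience_probs.__getitem__)
shaped_choice = max(range(len(combined)), key=combined.__getitem__)
print(experience_choice, shaped_choice)
```

In this toy state the experience-driven scores alone slightly prefer the second action, while the advice strongly prefers the first, so the shaped policy switches to the first action. A product rule like this lets confident advice override a weak preference without discarding what the agent has learned from reward.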

Original language: English
Journal: Proceedings of the International Florida Artificial Intelligence Research Society Conference, FLAIRS
Volume: 34
State: Published - 2021
Event: 34th International Florida Artificial Intelligence Research Society Conference, FLAIRS-34 2021 - North Miami Beach, United States
Duration: May 16 2021 to May 19 2021

Bibliographical note

Publisher Copyright:
© 2021 by the authors. All rights reserved.

Funding

This material is based upon work supported by the National Science Foundation under Grant No. 1849231.

Funder: National Science Foundation (NSF)
Funder number: 1849231

    ASJC Scopus subject areas

    • Artificial Intelligence
    • Software
