Abstract
For robots to learn from people with no machine learning expertise, they should learn from natural human instruction. Most machine learning techniques that incorporate explanations require people to use a limited vocabulary and provide state information, even if it is not intuitive. This paper discusses a software agent that learned to play the Mario Bros. game using explanations. Our goals to improve learning from explanations were twofold: 1) to filter explanations into advice and warnings and 2) to learn policies from sentences without state information. We used sentiment analysis to filter explanations into advice of what to do and warnings of what to avoid. We developed object-focused advice to represent what actions the agent should take when dealing with objects. A reinforcement learning agent used object-focused advice to learn policies that maximized its reward. After mitigating false negatives, using sentiment as a filter was approximately 85% accurate. With object-focused advice, the agent performed better than when no advice was given, learned where to apply the advice, and could recover from adversarial advice. We also found that the method of interaction should be designed to ease the cognitive load of the human teacher; otherwise, the advice may be of poor quality.
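The filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the word lists, the object and action vocabularies, and the function names (`tokenize`, `filter_explanation`, `object_focused`) are all hypothetical, and a simple word-count score stands in for whatever sentiment analyzer the paper used. The sketch only shows the idea of labeling a sentence as advice or a warning by its sentiment and attaching it to an object rather than to a game state.

```python
# Toy illustration only: all lexicons and vocabularies below are assumptions,
# not taken from the paper.
POSITIVE = {"good", "great", "helpful", "safe"}
NEGATIVE = {"avoid", "bad", "never", "dangerous", "kills", "don't"}
OBJECTS = {"coin", "goomba", "pit"}   # hypothetical game objects
ACTIONS = {"jump", "run", "collect"}  # hypothetical agent actions


def tokenize(sentence):
    """Lowercase the sentence and strip basic punctuation from each word."""
    return [w.strip(".,!?").lower() for w in sentence.split()]


def filter_explanation(sentence):
    """Label a sentence 'warning' if its sentiment score is negative, else 'advice'."""
    words = tokenize(sentence)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "warning" if score < 0 else "advice"


def object_focused(sentence):
    """Attach the label to the first object and action mentioned, with no state information."""
    words = tokenize(sentence)
    return {
        "kind": filter_explanation(sentence),
        "object": next((w for w in words if w in OBJECTS), None),
        "action": next((w for w in words if w in ACTIONS), None),
    }


print(object_focused("Jump on the goomba, that is good"))
print(object_focused("Never run into the pit"))
```

In this sketch a warning records an action to avoid near an object, while advice records an action to prefer; how a reinforcement learning agent would weigh either signal against its reward is left out because the abstract does not specify it.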
Original language | English |
---|---|
Article number | 7742965 |
Pages (from-to) | 44-55 |
Number of pages | 12 |
Journal | IEEE Transactions on Cognitive and Developmental Systems |
Volume | 9 |
Issue number | 1 |
State | Published - Mar 2017 |
Bibliographical note
Publisher Copyright: © 2017 IEEE.
Keywords
- Advice
- reinforcement learning (RL)
- sentiment
ASJC Scopus subject areas
- Software
- Artificial Intelligence