Correntropy kernel temporal differences for reinforcement learning brain machine interfaces

Jihye Bae, Luis G. Sanchez Giraldo, Jose C. Principe, Joseph T. Francis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

This paper introduces a novel temporal difference algorithm for estimating a value function in reinforcement learning. The algorithm is a kernel adaptive system that uses a robust cost function called correntropy; we call this system correntropy kernel temporal differences (CKTD). The algorithm is integrated with Q-learning to find a proper policy (Q-learning via correntropy kernel temporal differences). The proposed method was tested on a synthetic problem, and its robustness under a changing policy was quantified. The same algorithm was applied to decode a monkey's neural states in a reinforcement learning brain machine interface (RLBMI) during a center-out reaching task. The results show the potential advantage of the proposed algorithm in the RLBMI framework.
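The abstract only outlines the method at a high level. The sketch below illustrates the general idea of a correntropy-weighted kernel temporal difference update: the value function is kept as a kernel expansion over visited states, and each TD update is scaled by a Gaussian of the TD error so that outlier errors are down-weighted. This is an assumption-laden illustration, not the authors' implementation; the class name `CorrentropyKTD`, the parameters (`step_size`, `corr_width`, `state_width`), and the specific growth rule for the expansion are all hypothetical.

```python
import numpy as np

def gauss_kernel(x, y, width):
    """Gaussian kernel between two state feature vectors."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return np.exp(-np.dot(diff, diff) / (2.0 * width ** 2))

class CorrentropyKTD:
    """Minimal sketch of a correntropy-weighted kernel TD value estimator.

    The value estimate is a kernel expansion over stored states; each TD
    update is scaled by a Gaussian of the TD error (the correntropy
    weight), which damps the influence of large, outlier-like errors.
    Names and update details are illustrative assumptions, not the
    published CKTD algorithm.
    """

    def __init__(self, step_size=0.1, gamma=0.9, state_width=1.0, corr_width=1.0):
        self.eta = step_size            # learning rate
        self.gamma = gamma              # discount factor
        self.state_width = state_width  # kernel width over states
        self.corr_width = corr_width    # correntropy (error) kernel width
        self.centers = []               # stored state feature vectors
        self.alphas = []                # expansion coefficients

    def value(self, x):
        """Value estimate: weighted sum of kernels to stored centers."""
        return sum(a * gauss_kernel(x, c, self.state_width)
                   for a, c in zip(self.alphas, self.centers))

    def update(self, x, reward, x_next):
        """One transition (x, reward, x_next): grow the expansion by one term."""
        # Temporal-difference error for this transition.
        td_error = reward + self.gamma * self.value(x_next) - self.value(x)
        # Correntropy weighting: exponentially down-weight large errors.
        corr_weight = np.exp(-td_error ** 2 / (2.0 * self.corr_width ** 2))
        # Add the current state as a new kernel center with a scaled coefficient.
        self.centers.append(np.asarray(x, dtype=float))
        self.alphas.append(self.eta * corr_weight * td_error)
        return td_error
```

Under these assumptions, a plain mean-square-error TD update corresponds to `corr_weight = 1`; the correntropy weight is what gives the update its robustness to occasional large errors, such as those caused by a changing policy.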

Original language: English
Title of host publication: Proceedings of the International Joint Conference on Neural Networks
Pages: 2713-2717
Number of pages: 5
ISBN (Electronic): 9781479914845
DOIs
State: Published - Sep 3 2014
Event: 2014 International Joint Conference on Neural Networks, IJCNN 2014 - Beijing, China
Duration: Jul 6 2014 - Jul 11 2014

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks

Conference

Conference: 2014 International Joint Conference on Neural Networks, IJCNN 2014
Country/Territory: China
City: Beijing
Period: 7/6/14 - 7/11/14

Bibliographical note

Publisher Copyright:
© 2014 IEEE.

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
