
Correntropy kernel temporal differences for reinforcement learning brain machine interfaces

Research output: Conference contribution › Peer review

2 Citations (Scopus)

Abstract

This paper introduces a novel temporal difference algorithm for estimating a value function in reinforcement learning. It is a kernel adaptive system built on correntropy, a robust cost function, and we call it correntropy kernel temporal differences (CKTD). The algorithm is integrated with Q-learning to find a proper policy (Q-learning via correntropy kernel temporal differences). The proposed method was tested on a synthetic problem, and its robustness under a changing policy was quantified. The same algorithm was then applied to decoding a monkey's neural states in a reinforcement learning brain machine interface (RLBMI) during a center-out reaching task. The results show the potential advantage of the proposed algorithm in the RLBMI framework.
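
The abstract gives no implementation details, but the core idea can be illustrated. Below is a minimal Python sketch of a correntropy-weighted kernel TD update, assuming a growing Gaussian kernel expansion for the value function and a Gaussian correntropy kernel applied to the TD error; the class name, parameter choices, and growth strategy are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

class CKTD:
    """Minimal sketch of a correntropy kernel temporal difference learner.

    Value estimate: V(x) = sum_i alpha_i * k(c_i, x) with a Gaussian kernel.
    The TD error is passed through a correntropy-induced weight
    exp(-delta^2 / (2 * sigma_c^2)) before the kernel-TD update, which
    down-weights outlier errors. All names and defaults are assumptions
    for illustration, not the paper's exact implementation.
    """

    def __init__(self, eta=0.5, gamma=0.9, sigma_k=1.0, sigma_c=1.0):
        self.eta = eta          # learning rate
        self.gamma = gamma      # discount factor
        self.sigma_k = sigma_k  # kernel bandwidth for the value expansion
        self.sigma_c = sigma_c  # correntropy bandwidth on the TD error
        self.centers = []       # stored states (kernel centers)
        self.alphas = []        # corresponding expansion weights

    def _kernel(self, c, x):
        d = np.asarray(c, dtype=float) - np.asarray(x, dtype=float)
        return np.exp(-np.dot(d, d) / (2.0 * self.sigma_k ** 2))

    def value(self, x):
        # Kernel expansion; returns 0.0 before any updates have been made.
        return sum(a * self._kernel(c, x)
                   for c, a in zip(self.centers, self.alphas))

    def update(self, x, reward, x_next, terminal=False):
        v_next = 0.0 if terminal else self.value(x_next)
        delta = reward + self.gamma * v_next - self.value(x)  # TD error
        # Correntropy-induced weighting: large errors contribute less.
        w = np.exp(-delta ** 2 / (2.0 * self.sigma_c ** 2))
        # Grow the expansion with the current state as a new center.
        self.centers.append(np.asarray(x, dtype=float))
        self.alphas.append(self.eta * w * delta)
        return delta
```

As the correntropy bandwidth sigma_c grows, the weight approaches 1 and the update reduces to a plain kernel TD step; a small bandwidth suppresses updates driven by outlier TD errors, which is consistent with the robustness under a changing policy that the paper quantifies.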

Original language: English
Host publication title: Proceedings of the International Joint Conference on Neural Networks
Pages: 2713-2717
Number of pages: 5
ISBN (electronic): 9781479914845
DOI
Status: Published - Sep 3, 2014
Event: 2014 International Joint Conference on Neural Networks, IJCNN 2014 - Beijing, China
Duration: Jul 6, 2014 - Jul 11, 2014

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks

Conference

Conference: 2014 International Joint Conference on Neural Networks, IJCNN 2014
Country/Territory: China
City: Beijing
Period: 7/6/14 - 7/11/14

Bibliographical note

Publisher Copyright:
© 2014 IEEE.

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
