Two Time-Scale Stochastic Approximation with Controlled Markov Noise and Off-Policy Temporal-Difference Learning

A spike-timing-dependent Hebbian mechanism governs the plasticity of recurrent excitatory synapses in the neocortex: synapses that are activated a few milliseconds before a postsynaptic spike are potentiated, while those that are activated a few milliseconds after are depressed. We show that such a mechanism can implement a form of temporal difference learning for prediction of input sequences. Using a biophysical model of a cortical neuron, we show that a temporal difference rule used in conjunction with dendritic backpropagating action potentials reproduces the temporally asymmetric window of Hebbian plasticity observed physiologically. Furthermore, the size and shape of the window vary with the distance of the synapse from the soma. Using a simple example, we show how a spike-timing-based temporal difference learning rule can allow a network of neocortical neurons to predict an input a few milliseconds before the input's expected arrival.

Policy Evaluation and Temporal-Difference Learning in Continuous Time and Space: A Martingale Approach

SSRN Electronic Journal ◽ 10.2139/ssrn.3905379 ◽ 2021

Author(s): Yanwei Jia ◽ Xunyu Zhou

Keyword(s): Policy Evaluation ◽ Continuous Time ◽ Temporal Difference ◽ Temporal Difference Learning ◽ Time And Space ◽ Martingale Approach

Using temporal-difference learning for multi-agent bargaining

Electronic Commerce Research and Applications ◽ 10.1016/j.elerap.2007.04.001 ◽ 2008 ◽ Vol 7 (4) ◽ pp. 432-442 ◽ Cited By ~ 4

Author(s): Shiu-li Huang ◽ Fu-ren Lin

Keyword(s): Temporal Difference ◽ Temporal Difference Learning ◽ Multi Agent

Double-State-Temporal Difference Learning for Resource Provisioning in Uncertain Fog Computing Environment

10.1109/iemcon53756.2021.9623085 ◽ 2021

Author(s): Bhargavi Krishna Murthy ◽ Sajjan G Shiva

Keyword(s): Fog Computing ◽ Resource Provisioning ◽ Temporal Difference ◽ Temporal Difference Learning ◽ Computing Environment

On the Analysis of Parameter Convergence for Temporal Difference Learning of an Exemplar Balance Problem

Towards Autonomous Robotic Systems - Lecture Notes in Computer Science ◽ 10.1007/978-3-642-23232-9_49 ◽ 2011 ◽ pp. 404-405 ◽ Cited By ~ 1

Author(s): Martin Brown ◽ Onder Tutsoy

Keyword(s): Temporal Difference ◽ Temporal Difference Learning ◽ Balance Problem

The serial blocking effect: a testbed for the neural mechanisms of temporal-difference learning

Scientific Reports ◽ 10.1038/s41598-019-42244-4 ◽ 2019 ◽ Vol 9 (1) ◽ Cited By ~ 4

Author(s): Ashraf Mahmud ◽ Petio Petrov ◽ Guillem R. Esber ◽ Mihaela D. Iordanova

Keyword(s): Blocking Effect ◽ Neural Mechanisms ◽ Temporal Difference ◽ Temporal Difference Learning

A parallel architecture for temporal difference learning with eligibility traces

2007 50th Midwest Symposium on Circuits and Systems ◽ 10.1109/mwscas.2007.4488705 ◽ 2007

Author(s): J. Turnmire ◽ I. Elhanany

Keyword(s): Parallel Architecture ◽ Temporal Difference ◽ Temporal Difference Learning

On the Almost Sure Rate of Convergence of Temporal-Difference Learning Algorithms

IFAC Proceedings Volumes ◽ 10.3182/20020721-6-es-1901.01147 ◽ 2002 ◽ Vol 35 (1) ◽ pp. 455-460

Author(s): Vladislav B. Tadić

Keyword(s): Rate Of Convergence ◽ Learning Algorithms ◽ Temporal Difference ◽ Temporal Difference Learning
