Multi-agent temporal-difference learning with linear function approximation: Weak convergence under time-varying network topologies

Temporal-difference (TD) learning is a simple iterative algorithm widely used for policy evaluation in Markov reward processes. Bhandari et al. prove finite-time convergence rates for TD learning with linear function approximation. The analysis rests on a key insight that rigorously connects TD updates to those of online gradient descent. In a model where observations are corrupted by i.i.d. noise, convergence results for TD follow by essentially mirroring the analysis for online gradient descent. Using an information-theoretic technique, the authors also provide results for the case where TD is applied to a single Markovian data stream, in which the algorithm’s updates can be severely biased. Their analysis extends to TD learning with eligibility traces and to Q-learning for high-dimensional optimal stopping problems.
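As a rough illustration of the update the abstract refers to, the sketch below runs TD(0) with a linear value estimate V(s) = theta · phi(s) on a toy two-state process. The step theta + alpha * td_error * phi(s) has the same shape as a gradient step, which is the resemblance the analysis exploits. Everything here is an assumption made for illustration and is not taken from the paper: the function names `td0_update` and `evaluate_two_state_cycle`, the deterministic two-state cycle, the one-hot features, and the values of `gamma` and `alpha`.

```python
def td0_update(theta, phi, reward, phi_next, gamma, alpha):
    """One TD(0) step for a linear value estimate V(s) = theta . phi(s)."""
    v = sum(t * f for t, f in zip(theta, phi))
    v_next = sum(t * f for t, f in zip(theta, phi_next))
    # Temporal-difference error: bootstrapped target minus current estimate.
    td_error = reward + gamma * v_next - v
    # Move theta along the feature vector of the current state.
    return [t + alpha * td_error * f for t, f in zip(theta, phi)]


def evaluate_two_state_cycle(steps=400, gamma=0.5, alpha=0.5):
    """Run TD(0) on a hypothetical deterministic cycle 0 -> 1 -> 0 -> ..."""
    features = [[1.0, 0.0], [0.0, 1.0]]  # one-hot features (assumed)
    rewards = [1.0, 0.0]                 # reward collected on leaving each state
    theta = [0.0, 0.0]
    state = 0
    for _ in range(steps):
        next_state = 1 - state
        theta = td0_update(theta, features[state], rewards[state],
                           features[next_state], gamma, alpha)
        state = next_state
    return theta
```

With these assumed settings the Bellman equations V(0) = 1 + 0.5 V(1) and V(1) = 0.5 V(0) give V(0) = 4/3 and V(1) = 2/3, and `evaluate_two_state_cycle()` approaches those values. The toy process is noiseless, so it shows only the form of the update, not the noisy or Markovian settings the abstract discusses.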

Finite-Time Performance of Distributed Temporal-Difference Learning with Linear Function Approximation

SIAM Journal on Mathematics of Data Science ◽ 10.1137/20m1311971 ◽ 2021 ◽ Vol 3 (1) ◽ pp. 298-320

Author(s): Thinh T. Doan ◽ Siva Theja Maguluri ◽ Justin Romberg

Keyword(s): Linear Function ◽ Finite Time ◽ Function Approximation ◽ Temporal Difference ◽ Temporal Difference Learning ◽ Time Performance ◽ Linear Function Approximation

Asymptotic analysis of temporal-difference learning algorithms with linear function approximation

Proceedings of the 38th IEEE Conference on Decision and Control (Cat. No.99CH36304) ◽ 10.1109/cdc.1999.833350 ◽ 2003

Author(s): V. Tadic

Keyword(s): Asymptotic Analysis ◽ Linear Function ◽ Function Approximation ◽ Learning Algorithms ◽ Temporal Difference ◽ Temporal Difference Learning ◽ Linear Function Approximation

Fast gradient-descent methods for temporal-difference learning with linear function approximation

Proceedings of the 26th Annual International Conference on Machine Learning - ICML '09 ◽ 10.1145/1553374.1553501 ◽ 2009 ◽ Cited By ~ 103

Author(s): Richard S. Sutton ◽ Hamid Reza Maei ◽ Doina Precup ◽ Shalabh Bhatnagar ◽ David Silver ◽ ...

Keyword(s): Linear Function ◽ Function Approximation ◽ Gradient Descent ◽ Descent Methods ◽ Temporal Difference ◽ Temporal Difference Learning ◽ Gradient Descent Methods ◽ Linear Function Approximation ◽ Fast Gradient

Improved Temporal Difference Methods with Linear Function Approximation

Handbook of Learning and Approximate Dynamic Programming ◽ 10.1109/9780470544785.ch9 ◽ 2009

Keyword(s): Linear Function ◽ Function Approximation ◽ Temporal Difference ◽ Linear Function Approximation ◽ Difference Methods ◽ Temporal Difference Methods

A worst-case comparison between temporal difference and residual gradient with linear function approximation

Proceedings of the 25th international conference on Machine learning - ICML '08 ◽ 10.1145/1390156.1390227 ◽ 2008 ◽ Cited By ~ 5

Author(s): Lihong Li

Keyword(s): Linear Function ◽ Function Approximation ◽ Temporal Difference ◽ Worst Case ◽ Linear Function Approximation ◽ Case Comparison

Diffusion gradient temporal difference for cooperative reinforcement learning with linear function approximation

2012 3rd International Workshop on Cognitive Information Processing (CIP) ◽ 10.1109/cip.2012.6232901 ◽ 2012 ◽ Cited By ~ 1

Author(s): Sergio Valcarcel Macua ◽ Pavle Belanovic ◽ Santiago Zazo

Keyword(s): Reinforcement Learning ◽ Linear Function ◽ Function Approximation ◽ Temporal Difference ◽ Linear Function Approximation

Using temporal-difference learning for multi-agent bargaining

Electronic Commerce Research and Applications ◽ 10.1016/j.elerap.2007.04.001 ◽ 2008 ◽ Vol 7 (4) ◽ pp. 432-442 ◽ Cited By ~ 4

Author(s): Shiu-li Huang ◽ Fu-ren Lin

Keyword(s): Temporal Difference ◽ Temporal Difference Learning ◽ Multi Agent
