On convergence rates of game theoretic reinforcement learning algorithms

Learning to Coordinate Efficiently: A Model-based Approach

Journal of Artificial Intelligence Research ◽

10.1613/jair.1154 ◽

2003 ◽

Vol 19 ◽

pp. 11-23 ◽

Cited By ~ 15

Author(s):

R. I. Brafman ◽

M. Tennenholtz

Keyword(s):

Reinforcement Learning ◽

Simple Model ◽

Stochastic Games ◽

Convergence Rates ◽

Learning Algorithms ◽

Common Interest ◽

Model Based ◽

Optimal Value ◽

To Receive

In common-interest stochastic games all players receive an identical payoff. Players participating in such games must learn to coordinate with each other in order to receive the highest-possible value. A number of reinforcement learning algorithms have been proposed for this problem, and some have been shown to converge to good solutions in the limit. In this paper we show that using very simple model-based algorithms, much better (i.e., polynomial) convergence rates can be attained. Moreover, our model-based algorithms are guaranteed to converge to the optimal value, unlike many of the existing algorithms.

Cognitive Radio Networks with Reinforcement Learning Algorithms for Spectrum Allocation: A Survey

International Journal of Advanced Trends in Computer Science and Engineering ◽

10.30534/ijatcse/2020/211952020 ◽

2020 ◽

Vol 9 (5) ◽

pp. 8371-8384

Keyword(s):

Reinforcement Learning ◽

Cognitive Radio ◽

Cognitive Radio Networks ◽

Learning Algorithms ◽

Radio Networks ◽

Spectrum Allocation

Multi-Agent Reinforcement Learning: A Review of Challenges and Applications

Applied Sciences ◽

10.3390/app11114948 ◽

2021 ◽

Vol 11 (11) ◽

pp. 4948

Author(s):

Lorenzo Canese ◽

Gian Carlo Cardarilli ◽

Luca Di Nunzio ◽

Rocco Fazzolari ◽

Daniele Giardino ◽

...

Keyword(s):

Reinforcement Learning ◽

Mathematical Models ◽

Learning Algorithms ◽

Single Agent ◽

Critical Issues ◽

Multi Agent ◽

Pros And Cons ◽

Application Fields

In this review, we present an analysis of the most used multi-agent reinforcement learning algorithms. Starting with the single-agent reinforcement learning algorithms, we focus on the most critical issues that must be taken into account in their extension to multi-agent scenarios. The analyzed algorithms were grouped according to their features. We present a detailed taxonomy of the main multi-agent approaches proposed in the literature, focusing on their related mathematical models. For each algorithm, we describe the possible application fields, while pointing out its pros and cons. The described multi-agent algorithms are compared in terms of the most important characteristics for multi-agent reinforcement learning applications—namely, nonstationarity, scalability, and observability. We also describe the most common benchmark environments used to evaluate the performances of the considered methods.

Benchmarking reinforcement learning algorithms for demand response applications

2020 IEEE PES Innovative Smart Grid Technologies Europe (ISGT-Europe) ◽

10.1109/isgt-europe47291.2020.9248800 ◽

2020 ◽

Author(s):

Brida V. Mbuwir ◽

Carlo Manna ◽

Fred Spiessens ◽

Geert Deconinck

Keyword(s):

Reinforcement Learning ◽

Demand Response ◽

Learning Algorithms

Reinforcement Learning Algorithms: Analysis and Applications

10.1007/978-3-030-41188-6 ◽

2021 ◽

Keyword(s):

Reinforcement Learning ◽

Learning Algorithms

Experimental evaluation of model-free reinforcement learning algorithms for continuous HVAC control

Applied Energy ◽

10.1016/j.apenergy.2021.117164 ◽

2021 ◽

Vol 298 ◽

pp. 117164

Author(s):

Marco Biemann ◽

Fabian Scheller ◽

Xiufeng Liu ◽

Lizhen Huang

Keyword(s):

Reinforcement Learning ◽

Experimental Evaluation ◽

Learning Algorithms ◽

Model Free ◽

Hvac Control

Synthetic Experiences for Accelerating DQN Performance in Discrete Non-Deterministic Environments

Algorithms ◽

10.3390/a14080226 ◽

2021 ◽

Vol 14 (8) ◽

pp. 226

Author(s):

Wenzel Pilar von Pilchau ◽

Anthony Stein ◽

Jörg Hähner

Keyword(s):

Reinforcement Learning ◽

State Of The Art ◽

Learning Algorithms ◽

Weighted Average ◽

Up States ◽

Experience Replay

State-of-the-art Deep Reinforcement Learning Algorithms such as DQN and DDPG use the concept of a replay buffer called Experience Replay. The default usage contains only the experiences that have been gathered over the runtime. We propose a method called Interpolated Experience Replay that uses stored (real) transitions to create synthetic ones to assist the learner. In this first approach to this field, we limit ourselves to discrete and non-deterministic environments and use a simple equally weighted average of the reward in combination with observed follow-up states. We could demonstrate a significantly improved overall mean average in comparison to a DQN network with vanilla Experience Replay on the discrete and non-deterministic FrozenLake8x8-v0 environment.
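The idea in this abstract, building synthetic transitions from stored real ones by averaging rewards and reusing observed follow-up states, can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `InterpolatedReplaySketch`, its methods, and the choice to emit one synthetic transition per observed follow-up state are assumptions made for illustration, and the sketch assumes hashable, sortable discrete states.

```python
from collections import defaultdict
from statistics import mean


class InterpolatedReplaySketch:
    """Hypothetical buffer that derives synthetic transitions from real ones."""

    def __init__(self):
        # (state, action) -> list of rewards observed for that pair
        self.rewards = defaultdict(list)
        # (state, action) -> set of (next_state, done) pairs observed for that pair
        self.next_states = defaultdict(set)

    def store(self, state, action, reward, next_state, done):
        """Record one real transition."""
        key = (state, action)
        self.rewards[key].append(reward)
        self.next_states[key].add((next_state, done))

    def synthetic(self, state, action):
        """Return one synthetic transition per observed follow-up state.

        Each synthetic transition carries the equally weighted average of
        all rewards seen for (state, action).
        """
        key = (state, action)
        if key not in self.rewards:
            return []
        avg_reward = mean(self.rewards[key])
        return [
            (state, action, avg_reward, nxt, done)
            for nxt, done in sorted(self.next_states[key])
        ]


# Usage: two real transitions from the same (state, action) pair
buf = InterpolatedReplaySketch()
buf.store(0, 1, 0.0, 1, False)
buf.store(0, 1, 1.0, 2, True)
print(buf.synthetic(0, 1))
```

In a full agent, these synthetic tuples would presumably be sampled alongside the real ones when training the Q-network; how the two are mixed is not stated in the abstract and is left out here.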

Comparative Analysis of Reinforcement Learning Algorithms on TORCS Environment

2020 28th Signal Processing and Communications Applications Conference (SIU) ◽

10.1109/siu49456.2020.9302358 ◽

2020 ◽

Keyword(s):

Reinforcement Learning ◽

Comparative Analysis ◽

Learning Algorithms

Reinforcement learning versus swarm intelligence for autonomous multi-HAPS coordination

SN Applied Sciences ◽

10.1007/s42452-021-04658-6 ◽

2021 ◽

Vol 3 (6) ◽

Author(s):

Ogbonnaya Anicho ◽

Philip B. Charlesworth ◽

Gurvinder S. Baicher ◽

Atulya K. Nagar

Keyword(s):

Reinforcement Learning ◽

State Space ◽

Swarm Intelligence ◽

Performance Indicators ◽

Convergence Rates ◽

Tuning Parameters ◽

Continuous State Space ◽

Continuous State ◽

User Coverage ◽

Better Than

This work analyses the performance of Reinforcement Learning (RL) versus Swarm Intelligence (SI) for coordinating multiple unmanned High Altitude Platform Stations (HAPS) for communications area coverage. It builds upon previous work which looked at various elements of both algorithms. The main aim of this paper is to address the continuous state-space challenge within this work by using partitioning to manage the high dimensionality problem. This enabled comparing the performance of the classical cases of both RL and SI, establishing a baseline for future comparisons of improved versions. From previous work, SI was observed to perform better across various key performance indicators. However, after tuning parameters and empirically choosing a suitable partitioning ratio for the RL state space, it was observed that the SI algorithm still maintained superior coordination capability by achieving higher mean overall user coverage (about 20% better than the RL algorithm), in addition to faster convergence rates. Though the RL technique showed better average peak user coverage, the unpredictable coverage dip was a key weakness, making SI a more suitable algorithm within the context of this work.
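As a rough illustration of partitioning a continuous state space for tabular RL, the sketch below maps each continuous state component to a bin index on a uniform grid. The abstract does not describe the partitioning scheme the authors used, so the uniform grid, the function names `partition_index` and `discretize`, and the example bounds are all assumptions.

```python
def partition_index(value, low, high, n_bins):
    """Map a continuous value in [low, high] to a bin index in 0..n_bins-1."""
    if n_bins < 1 or high <= low:
        raise ValueError("need n_bins >= 1 and high > low")
    clipped = min(max(value, low), high)  # clamp out-of-range values
    idx = int((clipped - low) / (high - low) * n_bins)
    return min(idx, n_bins - 1)  # the upper bound falls in the last bin


def discretize(state, bounds, n_bins):
    """Discretize every component of a continuous state vector."""
    return tuple(
        partition_index(v, lo, hi, n_bins) for v, (lo, hi) in zip(state, bounds)
    )


# Usage: a hypothetical 2-D state with 4 partitions per dimension
print(discretize((0.25, -5.0), [(0.0, 1.0), (-10.0, 10.0)], 4))
```

The number of bins per dimension plays the role of a partitioning ratio: more bins give a finer table at the cost of a larger state space to explore.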

Comparison of deep reinforcement learning algorithms: Path Search in Grid World

2021 International Conference on Electronics, Information, and Communication (ICEIC) ◽

10.1109/iceic51217.2021.9369800 ◽

2021 ◽

Author(s):

YungMin SunWoo ◽

WonChang Lee

Keyword(s):

Reinforcement Learning ◽

Learning Algorithms ◽

Path Search
