Optimal Energy Operation Strategy for We-Energy of Energy Internet Based on Hybrid Reinforcement Learning With Human-in-the-Loop

Author(s): Lingxiao Yang, Qiuye Sun, Ning Zhang, Zhenwei Liu

2019 · Vol 9 (3) · pp. 520
Author(s): Dan-Lu Wang, Qiu-Ye Sun, Yu-Yang Li, Xin-Rui Liu

To cope with the energy crisis, the concept of the energy internet (EI) has been proposed as a highly efficient energy structure that fully exploits the advantages of multi-energy coupling. To adapt to this multi-energy coupled structure and achieve flexible conversion and interaction among multiple energy carriers, the concept of energy routing centers (ERCs) is proposed, and a two-layered ERC structure is established. Multi-energy conversion devices and connection ports with monitoring functions are integrated into the physical layer, which allows highly flexible multi-energy flow. For an EI with several interconnected ERCs, energy flows among them are managed by an energy routing controller located in the information layer. To improve efficiency and reduce the operating and environmental costs of the proposed EI, an optimal multi-energy-management-based energy routing design problem is studied. Specifically, the voltages of the ERC ports are managed to regulate the power flow on the connection lines and are constrained to ensure secure operation. An artificial neural network (ANN)-based reinforcement learning algorithm is proposed to determine the optimal energy routing path, and simulations verify the effectiveness of the proposed method.
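The port-voltage mechanism described in the abstract can be illustrated with a toy sketch: under a simple resistive (DC-approximation) line model, the power a port pushes onto a connection line grows with the voltage difference across the line, while the port voltages must stay inside security limits. The resistance, limits, and function name below are illustrative assumptions, not values from the paper.

```python
# Toy sketch of the physical-layer mechanism: power exchanged on a connection line
# between two ERC ports under a resistive DC approximation, with port voltages
# constrained to a secure per-unit range. All values are illustrative assumptions.
R_LINE = 0.5                 # line resistance (ohm), assumed
V_MIN, V_MAX = 0.95, 1.05    # per-unit security limits on port voltages, assumed

def line_power(v_i, v_j, r=R_LINE):
    """Power injected by port i into the line: P = V_i * (V_i - V_j) / R."""
    for v in (v_i, v_j):
        if not V_MIN <= v <= V_MAX:
            raise ValueError("port voltage outside secure operating range")
    return v_i * (v_i - v_j) / r

# Raising the sending-end voltage increases the power pushed onto the line.
p_low = line_power(1.00, 0.98)   # 1.00 * 0.02 / 0.5 = 0.04 p.u.
p_high = line_power(1.03, 0.98)  # 1.03 * 0.05 / 0.5 = 0.103 p.u.
```

Managing the routing path then amounts to choosing which port voltages to raise or lower, subject to these security bounds.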


Energies · 2021 · Vol 14 (9) · pp. 2700
Author(s): Grace Muriithi, Sunetra Chowdhury

In the near future, microgrids will become more prevalent as they play a critical role in integrating distributed renewable energy resources into the main grid. Nevertheless, renewable energy sources such as solar and wind can be extremely volatile, as they are weather dependent. These resources, coupled with fluctuating demand, can lead to random variations on both the generation and load sides, thus complicating optimal energy management. In this article, a reinforcement learning approach is proposed to deal with this non-stationary scenario, in which the energy management system (EMS) is modelled as a Markov decision process (MDP). A novel modification of the control problem is presented that improves the use of energy stored in the battery so that the dynamic demand is not exposed to future high grid tariffs. A comprehensive reward function is also developed that reduces infeasible action explorations, thus improving the performance of the data-driven technique. A Q-learning algorithm is then proposed to minimize the operational cost of the microgrid under unknown future information. To assess the performance of the proposed EMS, a comparison between a trading EMS model and a non-trading case is performed using a typical commercial load curve and PV profile over a 24-h horizon. Numerical simulation results indicate that the agent learns to select an optimized energy schedule that minimizes energy cost (the cost of power purchased from the utility plus battery wear cost) in all the studied cases. Comparing the operational costs of the non-trading and trading EMS models, the latter was found to reduce costs by 4.033% in the summer season and 2.199% in the winter season.
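The MDP formulation sketched in this abstract — battery level and hour of day as state, charge/idle/discharge as actions, and a reward equal to the negative of purchase cost plus battery wear cost with a penalty discouraging infeasible actions — can be illustrated with a minimal tabular Q-learning sketch. The discretization, tariff schedule, load curve, and wear cost below are all illustrative assumptions, not the paper's data.

```python
import numpy as np

# Minimal tabular Q-learning sketch of an EMS modelled as an MDP. State = (hour,
# battery level); actions = discharge / idle / charge; reward = negative of
# grid-purchase cost plus battery wear cost, with a penalty that discourages
# infeasible (over/under-charging) actions. All numbers are assumptions.
rng = np.random.default_rng(0)
HOURS, LEVELS = 24, 5
ACTIONS = (-1, 0, 1)                       # discharge, idle, charge (one level/step)
hours = np.arange(HOURS)
tariff = np.where((hours >= 17) & (hours < 21), 0.30, 0.10)   # assumed peak pricing
demand = 2.0 + np.sin(hours / HOURS * 2 * np.pi)              # toy load curve (kW)
WEAR = 0.02                                # assumed wear cost per level moved
Q = np.zeros((HOURS, LEVELS, len(ACTIONS)))
alpha, gamma, eps = 0.1, 0.95, 0.1

def step(hour, level, a_idx):
    """Apply an action; infeasible moves are clipped and penalized."""
    new_level = min(max(level + ACTIONS[a_idx], 0), LEVELS - 1)
    penalty = 1.0 if new_level == level and ACTIONS[a_idx] != 0 else 0.0
    grid_power = demand[hour] + (new_level - level)   # charging buys extra power
    cost = tariff[hour] * max(grid_power, 0.0) + WEAR * abs(new_level - level)
    return new_level, -(cost + penalty)               # reward = negative cost

for _ in range(2000):                                 # training episodes (24 h each)
    level = LEVELS // 2
    for hour in range(HOURS):
        a_idx = int(rng.integers(len(ACTIONS))) if rng.random() < eps \
            else int(np.argmax(Q[hour, level]))
        new_level, r = step(hour, level, a_idx)
        nxt = 0.0 if hour == HOURS - 1 else float(np.max(Q[hour + 1, new_level]))
        Q[hour, level, a_idx] += alpha * (r + gamma * nxt - Q[hour, level, a_idx])
        level = new_level
```

The infeasibility penalty plays the role the abstract attributes to the comprehensive reward function: it steers exploration away from actions the battery cannot physically execute.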


Author(s):  
Maximilian Moll ◽  
Leonhard Kunczik

In recent history, reinforcement learning (RL) has proved its capability on complex decision problems by mastering several games. Increased computational power and advances in approximation with neural networks (NNs) paved the way for RL's successful applications. Even though RL can tackle increasingly complex problems, it remains constrained by computational power and runtime. Quantum computing promises to address these issues through its capability to encode information compactly and its potential quadratic speedup in runtime. We compare tabular Q-learning and Q-learning using either a quantum or a classical approximation architecture on the frozen lake problem. The three algorithms are analyzed in terms of iterations until convergence to the optimal behavior, memory usage, and runtime. NNs are utilized for approximation in the classical domain, while variational quantum circuits, a hybrid quantum approximation method, are used in the quantum domain. Our simulations show that the quantum approximator is beneficial in terms of memory usage and provides better sample complexity than NNs; however, it still lacks the computational speed to be competitive.


Sensors · 2020 · Vol 20 (16) · pp. 4468
Author(s): Ao Xi, Chao Chen

In this work, we introduced a novel hybrid reinforcement learning scheme to balance a biped robot (NAO) on an oscillating platform, where the rotation of the platform is treated as an external disturbance to the robot. The platform had two rotational degrees of freedom, pitch and roll. The state space comprised the position of the center of pressure and the joint angles and joint velocities of the two legs; the action space consisted of the joint angles of the ankles, knees, and hips. By incorporating inverse kinematics, the dimension of the action space was significantly reduced. A model-based system estimator was then employed during the offline training procedure to estimate the dynamics model of the system using novel hierarchical Gaussian processes and to provide initial control inputs, after which the reduced action space of each joint was obtained by minimizing the cost of reaching the desired stable state. Finally, a model-free optimizer based on DQN(λ) was introduced to fine-tune the initial control inputs, yielding the optimal control input for each joint at any state. The proposed scheme not only avoided the distribution-mismatch problem but also improved sample efficiency. Simulation results showed that the proposed hybrid reinforcement learning mechanism enabled the NAO robot to balance on platforms oscillating at different frequencies and magnitudes, and both control performance and robustness were maintained throughout the experiments.
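The (λ) mechanism behind DQN(λ) can be illustrated in miniature with Watkins's Q(λ) on a toy chain task: eligibility traces spread each temporal-difference error back over recently visited state-action pairs and are cut after exploratory actions. Everything below (the chain environment, the step cost, the hyperparameters) is an illustrative tabular analogue, not the paper's deep-network variant.

```python
import numpy as np

# Watkins's Q(lambda) on a toy 1-D chain: reaching the right end yields +1, every
# other step costs -0.01. The trace matrix E spreads each TD error back along the
# trajectory; traces are cut after exploratory actions. All numbers are assumed.
N_STATES, N_ACTIONS = 8, 2                 # actions: 0 = left, 1 = right
rng = np.random.default_rng(2)
Q = np.zeros((N_STATES, N_ACTIONS))
alpha, gamma, lam, eps = 0.1, 0.95, 0.8, 0.1

for _ in range(1000):
    E = np.zeros_like(Q)                   # eligibility traces, reset each episode
    s = 0
    for _t in range(100):                  # step cap keeps early episodes bounded
        greedy_a = int(np.argmax(Q[s]))
        a = greedy_a if rng.random() >= eps else int(rng.integers(N_ACTIONS))
        s2 = min(max(s + (1 if a == 1 else -1), 0), N_STATES - 1)
        done = s2 == N_STATES - 1
        r = 1.0 if done else -0.01         # small step cost encourages progress
        delta = r + (0.0 if done else gamma * float(np.max(Q[s2]))) - Q[s, a]
        E[s, a] += 1.0                     # accumulating trace
        Q += alpha * delta * E             # update every traced state-action pair
        E *= gamma * lam if a == greedy_a else 0.0   # cut traces after exploration
        s = s2
        if done:
            break
```

Compared with one-step Q-learning, the trace lets a single reward update the whole recent trajectory at once, which is the sample-efficiency benefit the (λ) variant targets.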


2021 · Vol 5 (2) · pp. 505-510
Author(s): Jaehyun Yoo, Dohyun Jang, H. Jin Kim, Karl H. Johansson
