Adaptive Volt–Var Control in Smart PV Inverter for Mitigating Voltage Unbalance at PCC Using Multiagent Deep Reinforcement Learning

2021 ◽  
Vol 11 (19) ◽  
pp. 8979
Author(s):  
Yoongun Jung ◽  
Changhee Han ◽  
Dongwon Lee ◽  
Sungyoon Song ◽  
Gilsoo Jang

Modern distribution networks face an increasing number of challenges in maintaining balanced grid voltages because of the rapid increase in single-phase distributed generators. Because of the proliferation of inverter-based resources, such as photovoltaic (PV) resources, in distribution networks, a novel method is proposed for mitigating voltage unbalance at the point of common coupling by tuning the volt–var curve of each PV inverter through a day-ahead deep reinforcement learning training platform with forecast data in a digital twin grid. The proposed strategy uses proximal policy optimization, which can effectively search for a global optimal solution. Deep reinforcement learning has a major advantage in that the calculation time required to derive an optimal action in the smart inverter can be significantly reduced. In the proposed framework, multiple agents with multiple inverters require information on the load consumption and active power output of each PV inverter. The results demonstrate the effectiveness of the proposed control strategy on the modified IEEE 13 standard bus systems with time-varying load and PV profiles. A comparison of the effect on voltage unbalance mitigation shows that the proposed inverter can address voltage unbalance issues more efficiently than a fixed droop inverter.
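As a rough illustration of what "tuning the volt–var curve" means, the sketch below evaluates a piecewise-linear volt–var curve: given a measured voltage, it returns a reactive power set-point. The function name `volt_var` and the breakpoints in `CURVE` are assumptions introduced here for illustration; they are not taken from the paper, where the learning agents would adjust such breakpoints.

```python
from bisect import bisect_right


def volt_var(v_pu, points):
    """Return the reactive power set-point (p.u.) for a voltage (p.u.).

    `points` is a list of (voltage, reactive_power) breakpoints sorted by
    voltage. Outside the breakpoint range the output saturates at the
    first or last value; in between it is linearly interpolated.
    """
    volts = [v for v, _ in points]
    if v_pu <= volts[0]:
        return points[0][1]
    if v_pu >= volts[-1]:
        return points[-1][1]
    i = bisect_right(volts, v_pu) - 1
    (v0, q0), (v1, q1) = points[i], points[i + 1]
    return q0 + (q1 - q0) * (v_pu - v0) / (v1 - v0)


# Hypothetical breakpoints: inject vars at low voltage, absorb at high
# voltage, with a dead band around 1.0 p.u. Illustrative values only.
CURVE = [(0.92, 0.44), (0.98, 0.0), (1.02, 0.0), (1.08, -0.44)]

if __name__ == "__main__":
    for v in (0.90, 0.95, 1.00, 1.05, 1.10):
        print(f"V = {v:.2f} p.u. -> Q = {volt_var(v, CURVE):+.3f} p.u.")
```

A fixed droop inverter would keep `CURVE` constant, whereas an adaptive scheme changes the breakpoints over time.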

Electronics ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 176
Author(s):  
Federico Molina-Martin ◽  
Oscar Danilo Montoya ◽  
Luis Fernando Grisales-Noreña ◽  
Jesus C. Hernández

The problem of the optimal placement and dimensioning of constant power sources (i.e., distributed generators) in electrical direct current (DC) distribution networks has been addressed in this research from the point of view of convex optimization. The original mixed-integer nonlinear programming (MINLP) model has been transformed into a mixed-integer conic equivalent via second-order cone programming, which produces a MI-SOCP approximation. The main advantage of the proposed MI-SOCP model is that it can ensure that the global optimum is found by combining the branch and bound method, which addresses the integer part of the problem (i.e., the location of the power sources), with the interior-point method, which solves the dimensioning problem. Numerical results in the 21- and 69-node test feeders demonstrated its efficiency and robustness compared to an exact MINLP method available in GAMS: in the case of the 69-node test feeder, the exact MINLP solvers become stuck in locally optimal solutions, while the proposed MI-SOCP model finds the global optimal solution. Additional simulations with daily load curves and photovoltaic sources confirmed the effectiveness of the proposed MI-SOCP methodology in locating and sizing distributed generators in DC grids; it also had low processing times, since locating three photovoltaic sources requires only 233.16 s, which is 3.7 times faster than the time required by the SOCP model in the absence of power sources.
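For readers unfamiliar with conic reformulations, the generic identity below is the usual building block behind models of this kind. It is a sketch and not necessarily the paper's exact formulation: the symbols x, y, and w are placeholders introduced here (for example, two nonnegative voltage-related variables and the term that couples them).

```latex
% Nonconvex equality and its convex (second-order cone) relaxation,
% assuming x >= 0 and y >= 0:
\begin{align}
  w^{2} = x\,y
  \quad &\longrightarrow \quad
  w^{2} \le x\,y, \\
  w^{2} \le x\,y
  \quad &\Longleftrightarrow \quad
  \left\lVert \begin{pmatrix} 2w \\ x - y \end{pmatrix} \right\rVert_{2} \le x + y .
\end{align}
```

The equivalence follows from the identity (x + y)^2 - (x - y)^2 = 4xy; the right-hand form is a standard second-order cone constraint that interior-point solvers handle directly.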


2020 ◽  
Vol 10 (23) ◽  
pp. 8616 ◽  
Author(s):  
Oscar Danilo Montoya ◽  
Walter Gil-González ◽  
Luis Fernando Grisales-Noreña

This research addresses the problem of the optimal location and sizing of distributed generators (DGs) in direct current (DC) distribution networks from a combinatorial optimization perspective. A master–slave optimization approach is proposed to solve the problems of location and sizing of DGs, respectively. The master stage applies the classical Chu & Beasley genetic algorithm (GA), while the slave stage solves a second-order cone programming reformulation of the optimal power flow problem for DC grids. This master–slave approach generates a hybrid optimization approach, named GA-SOCP. The main advantage of optimal dimensioning of DGs via SOCP is that this method belongs to exact mathematical optimization, which guarantees the possibility of finding the global optimal solution due to the solution space’s convex structure; this is a clear improvement over classical metaheuristic optimization methodologies. Numerical comparisons with hybrid and exact optimization approaches reported in the literature demonstrate the effectiveness and robustness of the proposed hybrid GA-SOCP approach in achieving the global optimal solution. Two test feeders composed of 21 and 69 nodes, in which three distributed generators can be located, are considered. All of the computational validations have been carried out in the MATLAB software and the CVX tool for convex optimization.


Electronics ◽  
2020 ◽  
Vol 10 (1) ◽  
pp. 26 ◽  
Author(s):  
Oscar Danilo Montoya ◽  
Alexander Molina-Cabrera ◽  
Harold R. Chamorro ◽  
Lazaro Alvarado-Barrios ◽  
Edwin Rivas-Trujillo

This paper deals with the problem of the optimal placement and sizing of distributed generators (DGs) in alternating current (AC) distribution networks by proposing a hybrid master–slave optimization procedure. In the master stage, the discrete version of the sine–cosine algorithm (SCA) determines the optimal location of the DGs, i.e., the nodes where these must be located, by using an integer codification. In the slave stage, the problem of the optimal sizing of the DGs is solved through the implementation of the second-order cone programming (SOCP) equivalent model to obtain solutions for the resulting optimal power flow problem. As the main advantage, the proposed approach allows converting the original mixed-integer nonlinear programming formulation into a mixed-integer SOCP equivalent. That is, each combination of nodes provided by the master-level SCA to locate distributed generators yields an optimal solution in terms of its sizing, since SOCP is a convex optimization model that ensures that the global optimum is found. Numerical validations of the proposed hybrid SCA-SOCP for the optimal placement and sizing of DGs in AC distribution networks show its capacity to find global optimal solutions. Some classical distribution networks (33 and 69 nodes) were tested, and some comparisons were made using results reported in the literature. In addition, simulation cases with unity and variable power factor were run, including the possibility of locating photovoltaic sources considering daily load and generation curves. All the simulations were carried out in the MATLAB software using the CVX optimization tool.


2012 ◽  
Vol 5 (3) ◽  
pp. 16-22
Author(s):  
P. Harshavardhan Reddy ◽ 
J.N. Chandra Sekhar ◽ 
M. Padma Lalitha ◽ 
...  

Author(s):  
Anastasios C. Papachristou ◽  
Ahmed S. A. Awad ◽  
Dave Turcotte ◽  
Steven Wong ◽  
Alexandre Prieur

2021 ◽  
Author(s):  
Srivatsan Krishnan ◽  
Behzad Boroujerdian ◽  
William Fu ◽  
Aleksandra Faust ◽  
Vijay Janapa Reddi

We introduce Air Learning, an open-source simulator, and a gym environment for deep reinforcement learning research on resource-constrained aerial robots. Equipped with domain randomization, Air Learning exposes a UAV agent to a diverse set of challenging scenarios. We seed the toolset with point-to-point obstacle avoidance tasks in three different environments and Deep Q Networks (DQN) and Proximal Policy Optimization (PPO) trainers. Air Learning assesses the policies’ performance under various quality-of-flight (QoF) metrics, such as the energy consumed, endurance, and the average trajectory length, on resource-constrained embedded platforms like a Raspberry Pi. We find that the trajectories on an embedded Ras-Pi are vastly different from those predicted on a high-end desktop system, resulting in up to 40% longer trajectories in one of the environments. To understand the source of such discrepancies, we use Air Learning to artificially degrade high-end desktop performance to mimic what happens on a low-end embedded system. We then propose a mitigation technique that uses the hardware-in-the-loop to determine the latency distribution of running the policy on the target platform (onboard compute on aerial robot). A randomly sampled latency from the latency distribution is then added as an artificial delay within the training loop. Training the policy with artificial delays allows us to minimize the hardware gap (discrepancy in the flight time metric reduced from 37.73% to 0.5%). Thus, Air Learning with hardware-in-the-loop characterizes those differences and exposes how the onboard compute’s choice affects the aerial robot’s performance. We also conduct reliability studies to assess the effect of sensor failures on the learned policies. All put together, Air Learning enables a broad class of deep RL research on UAVs. The source code is available at: https://github.com/harvard-edge/AirLearning.
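The latency-injection idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Air Learning implementation: `sample_latency`, `delayed_step`, and the list of measured latencies are hypothetical names and data introduced here, and the empirical distribution is represented simply as a list of recorded inference times.

```python
import random
import time


def sample_latency(latencies_s, rng):
    """Draw one latency (seconds) from an empirical list of measurements."""
    return rng.choice(latencies_s)


def delayed_step(policy, observation, latencies_s, rng, sleep=time.sleep):
    """Run the policy once, preceded by an artificial delay.

    The delay stands in for the inference time that was measured on the
    target platform. `sleep` is injectable so the delay can be simulated
    instead of actually waited for.
    """
    delay = sample_latency(latencies_s, rng)
    sleep(delay)
    return policy(observation), delay


if __name__ == "__main__":
    measured = [0.010, 0.012, 0.030]   # hypothetical on-board timings (s)
    rng = random.Random(0)
    action, delay = delayed_step(lambda obs: -obs, 1.0, measured, rng)
    print(action, delay)
```

In a simulator-driven training loop the delay would more likely advance simulated time than block the process, which is why the sleep function is left as a parameter.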


2021 ◽  
Vol 11 (4) ◽  
pp. 1514 ◽  
Author(s):  
Quang-Duy Tran ◽  
Sang-Hoon Bae

To reduce the impact of congestion, it is necessary to improve our overall understanding of the influence of the autonomous vehicle. Recently, deep reinforcement learning has become an effective means of solving complex control tasks. Accordingly, we present an advanced deep reinforcement learning approach that investigates how the leading autonomous vehicles affect the urban network under a mixed-traffic environment. We also suggest a set of hyperparameters for achieving better performance. Firstly, we feed a set of hyperparameters into our deep reinforcement learning agents. Secondly, we investigate the leading autonomous vehicle experiment in the urban network with different autonomous vehicle penetration rates. Thirdly, the advantage of leading autonomous vehicles is evaluated using entire manual vehicle and leading manual vehicle experiments. Finally, the proximal policy optimization with a clipped objective is compared to the proximal policy optimization with an adaptive Kullback–Leibler penalty to verify the superiority of the proposed hyperparameters. We demonstrate that full automation traffic increased the average speed by a factor of 1.27 compared with the entire manual vehicle experiment. Our proposed method becomes significantly more effective at a higher autonomous vehicle penetration rate. Furthermore, the leading autonomous vehicles could help to mitigate traffic congestion.
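The two proximal policy optimization variants compared above differ only in how they discourage large policy updates. The sketch below shows both per-sample objectives in their commonly published form; the function names are introduced here for illustration, and the clip range 0.2 and the 1.5 and 2 factors are the commonly used defaults, not necessarily the hyperparameters proposed in this work.

```python
def clipped_surrogate(ratio, advantage, eps=0.2):
    """PPO clipped objective for one sample.

    `ratio` is new_policy_prob / old_policy_prob for the taken action.
    The ratio is clipped to [1 - eps, 1 + eps] and the more pessimistic
    of the clipped and unclipped terms is kept.
    """
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped * advantage)


def kl_penalized_surrogate(ratio, advantage, kl, beta):
    """PPO objective with a KL penalty weighted by `beta`."""
    return ratio * advantage - beta * kl


def adapt_beta(beta, kl, kl_target):
    """Adaptive rule: halve beta if KL is well below target, double if above."""
    if kl < kl_target / 1.5:
        return beta / 2.0
    if kl > kl_target * 1.5:
        return beta * 2.0
    return beta


if __name__ == "__main__":
    print(clipped_surrogate(1.5, 1.0))
    print(kl_penalized_surrogate(1.5, 1.0, kl=0.05, beta=1.0))
    print(adapt_beta(1.0, kl=0.05, kl_target=0.01))
```

With a positive advantage, the clipped form stops rewarding ratio increases beyond 1 + eps, while the penalty form keeps rewarding them but charges for the resulting divergence.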


2021 ◽  
Vol 35 (2) ◽  
Author(s):  
Pallavi Bagga ◽  
Nicola Paoletti ◽  
Bedour Alrayes ◽  
Kostas Stathis

We present a novel negotiation model that allows an agent to learn how to negotiate during concurrent bilateral negotiations in unknown and dynamic e-markets. The agent uses an actor-critic architecture with model-free reinforcement learning to learn a strategy expressed as a deep neural network. We pre-train the strategy by supervision from synthetic market data, thereby decreasing the exploration time required for learning during negotiation. As a result, we can build automated agents for concurrent negotiations that can adapt to different e-market settings without the need to be pre-programmed. Our experimental evaluation shows that our deep reinforcement learning based agents outperform two existing well-known negotiation strategies in one-to-many concurrent bilateral negotiations for a range of e-market settings.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Abu Quwsar Ohi ◽  
M. F. Mridha ◽  
Muhammad Mostafa Monowar ◽  
Md. Abdul Hamid

A pandemic is the global outbreak of a disease with a high transmission rate. The impact of a pandemic can be lessened by restricting the movement of the population. However, one of its concomitant consequences is an economic crisis. In this article, we demonstrate what actions an agent (trained using reinforcement learning) may take in different possible scenarios of a pandemic depending on the spread of disease and economic factors. To train the agent, we design a virtual pandemic scenario closely related to the present COVID-19 crisis. Then, we apply reinforcement learning, a branch of artificial intelligence that deals with how an individual (human/machine) should interact with an environment (real/virtual) to achieve a desired goal. Finally, we demonstrate what optimal actions the agent performs to reduce the spread of disease while considering the economic factors. In our experiment, we let the agent find an optimal solution without providing any prior knowledge. After training, we observed that the agent imposes a long lockdown to reduce the first surge of a disease. Furthermore, the agent imposes a combination of cyclic lockdowns and short lockdowns to halt the resurgence of the disease. Analyzing the agent’s actions, we discover that the agent decides on movement restrictions based not only on the size of the infectious population but also on the reproduction rate of the disease. The estimation and policy of the agent may improve human strategies for imposing lockdowns so that an economic crisis may be avoided while mitigating an infectious disease.

