Evaluation of Reinforcement Learning for Optimal Control of Building Active and Passive Thermal Storage Inventory

2006 ◽  
Vol 129 (2) ◽  
pp. 215-225 ◽  
Author(s):  
Simeng Liu ◽  
Gregor P. Henze

This paper describes an investigation of machine learning for supervisory control of active and passive thermal storage capacity in buildings. Previous studies show that the utilization of active or passive thermal storage, or both, can yield significant peak cooling load reduction and associated electrical demand and operational cost savings. In this study, a model-free learning control is investigated for the operation of electrically driven chilled water systems in heavy-mass commercial buildings. The reinforcement learning controller learns to operate the building and cooling plant based on the reinforcement feedback (monetary cost of each action, in this study) it receives for past control actions. The learning agent interacts with its environment by commanding the global zone temperature setpoints and thermal energy storage charging/discharging rate. The controller extracts information about the environment based solely on the reinforcement signal; the controller does not contain a predictive or system model. Over time and by exploring the environment, the reinforcement learning controller establishes a statistical summary of plant operation, which is continuously updated as operation continues. The present analysis shows that learning control is a feasible methodology to find a near-optimal control strategy for exploiting the active and passive building thermal storage capacity, and also shows that the learning performance is affected by the dimensionality of the action and state space, the learning rate and several other factors. It is found that it takes a long time to learn control strategies for tasks associated with large state and action spaces.
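The model-free scheme the abstract describes can be sketched as a minimal tabular Q-learning loop: the agent picks a zone setpoint and a thermal-energy-storage (TES) command each hour and receives the negative monetary cost as its reinforcement. The cost model, tariff, and discretization below are invented placeholders for illustration, not the authors' building or plant model.

```python
import random

SETPOINTS = [22.0, 24.0, 26.0]            # candidate zone setpoints (deg C)
TES_RATES = [-1.0, 0.0, 1.0]              # discharge / idle / charge
ACTIONS = [(s, r) for s in SETPOINTS for r in TES_RATES]
HOURS = 24                                # state = hour of day (coarse)
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1        # learning rate, discount, exploration

def energy_cost(hour, setpoint, tes_rate):
    """Assumed cost model: on-peak power is expensive; TES discharge
    offsets the cooling load, charging adds load plus a cycling penalty."""
    price = 0.30 if 12 <= hour < 18 else 0.10       # $/kWh, assumed tariff
    cooling_load = max(0.0, 30.0 - setpoint)        # crude load proxy (kW)
    net_load = max(cooling_load + 5.0 * tes_rate, 0.0)
    return price * net_load + 0.10 * abs(tes_rate)

random.seed(0)
Q = {(h, a): 0.0 for h in range(HOURS) for a in range(len(ACTIONS))}
for episode in range(5000):               # repeated "days" of operation
    for h in range(HOURS):
        if random.random() < EPS:         # explore
            a = random.randrange(len(ACTIONS))
        else:                             # exploit the current estimate
            a = max(range(len(ACTIONS)), key=lambda i: Q[(h, i)])
        cost = energy_cost(h, *ACTIONS[a])
        nxt = (h + 1) % HOURS
        best_next = max(Q[(nxt, i)] for i in range(len(ACTIONS)))
        # Q-learning update with reinforcement = -cost
        Q[(h, a)] += ALPHA * (-cost + GAMMA * best_next - Q[(h, a)])

# Greedy policy after learning: high setpoint plus TES discharge.
policy = {h: ACTIONS[max(range(len(ACTIONS)), key=lambda i: Q[(h, i)])]
          for h in range(HOURS)}
```

Because this toy omits the storage level from the state, the learned policy is trivially to discharge whenever pricing makes it worthwhile; the paper's state space is much richer, which is exactly the dimensionality issue the abstract notes.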



Author(s):  
Ernst Moritz Hahn ◽  
Mateo Perez ◽  
Sven Schewe ◽  
Fabio Somenzi ◽  
Ashutosh Trivedi ◽  
...  

We study reinforcement learning for the optimal control of Branching Markov Decision Processes (BMDPs), a natural extension of (multitype) Branching Markov Chains (BMCs). The state of a (discrete-time) BMC is a collection of entities of various types that, while spawning other entities, generate a payoff. In comparison with BMCs, where the evolution of each entity of the same type follows the same probabilistic pattern, BMDPs allow an external controller to pick from a range of options. This permits us to study the best/worst behaviour of the system. We generalise model-free reinforcement learning techniques to compute an optimal control strategy of an unknown BMDP in the limit. We present results of an implementation that demonstrate the practicality of the approach.
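The entity-spawning dynamics described above can be illustrated with a toy branching MDP: each entity yields a payoff and then spawns offspring according to the action chosen for its type. The single type, actions, payoffs, and offspring distributions below are invented, and the paper's model-free learning is replaced by a plain Monte Carlo comparison of two fixed strategies.

```python
import random

RULES = {  # type -> action -> (payoff, [(children, probability), ...])
    "A": {
        "grow":   (1.0, [([], 0.6), (["A", "A"], 0.4)]),   # subcritical branching
        "settle": (2.0, [([], 1.0)]),                      # stop spawning
    },
}

def episode(strategy, max_steps=60):
    """Total payoff of one run starting from a single type-A entity."""
    entities, total = ["A"], 0.0
    for _ in range(max_steps):
        if not entities:
            break
        nxt = []
        for e in entities:
            payoff, offspring = RULES[e][strategy[e]]
            total += payoff
            r, acc = random.random(), 0.0
            for children, p in offspring:      # sample offspring
                acc += p
                if r < acc:
                    nxt.extend(children)
                    break
        entities = nxt
    return total

random.seed(1)
runs = 20000
mean_grow = sum(episode({"A": "grow"}) for _ in range(runs)) / runs
mean_settle = sum(episode({"A": "settle"}) for _ in range(runs)) / runs
print(round(mean_grow, 2), round(mean_settle, 2))
```

A learner receiving these payoffs would, in the limit, prefer "grow" here, since the expected total progeny payoff 1/(1 - 0.8) = 5 exceeds the one-shot payoff of 2.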


2013 ◽  
Vol 671-674 ◽  
pp. 2515-2519
Author(s):  
Xue Mei Wang ◽  
Zhen Hai Wang ◽  
Xing Long Wu

This project studies an optimal control model for ice-storage systems that is theoretically close to true optimal control and also applicable to actual engineering. Using the energy simulation software EnergyPlus and a simplified solution method for optimal control, the researchers analyze and compare the annual operating costs of the ice-storage air-conditioning system of a project in Beijing under different control strategies. They obtained the power bills of the office building's air-conditioning system under chiller-priority and optimal control throughout the cooling season. The analysis and comparison show that, after the implementation of optimal control, the annual power-bill savings result mainly from non-design conditions, especially in the transitional seasons.
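The cost comparison at the heart of this study can be sketched as a back-of-the-envelope time-of-use calculation: chiller-priority control versus a load-shifting strategy under a two-tier tariff. The tariff, load profile, COPs, and storage size below are invented placeholders, not the Beijing project's data or EnergyPlus results.

```python
PEAK_HOURS = range(10, 22)                  # assumed on-peak window
PRICE = {h: 1.2 if h in PEAK_HOURS else 0.4 for h in range(24)}  # yuan/kWh
LOAD = {h: 500.0 if 8 <= h < 20 else 0.0 for h in range(24)}     # kWh cooling
COP_CHILLER, COP_ICEMAKING = 5.0, 3.5       # ice making is less efficient

def chiller_priority_cost():
    """Chiller meets the cooling load directly whenever it occurs."""
    return sum(PRICE[h] * LOAD[h] / COP_CHILLER for h in range(24))

def load_shift_cost(ice_capacity=4000.0, make_rate=1000.0):
    """Make ice overnight, melt it to cover as much on-peak load as possible."""
    ice, cost = 0.0, 0.0
    for h in range(24):
        if h < 8 and ice < ice_capacity:            # night ice-making window
            make = min(ice_capacity - ice, make_rate)
            ice += make
            cost += PRICE[h] * make / COP_ICEMAKING
        load = LOAD[h]
        if h in PEAK_HOURS and ice > 0.0:
            melt = min(ice, load)                   # serve on-peak load from ice
            ice -= melt
            load -= melt
        cost += PRICE[h] * load / COP_CHILLER       # remainder on the chiller
    return cost

print(round(chiller_priority_cost(), 1), round(load_shift_cost(), 1))
```

Even with the less efficient ice-making COP, shifting the purchase of cooling energy to off-peak hours cuts the daily bill substantially in this toy tariff, which is the mechanism behind the savings the study quantifies.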


2014 ◽  
Vol 2014 ◽  
pp. 1-9 ◽  
Author(s):  
Shuo Zhang ◽  
Chengning Zhang ◽  
Guangwei Han ◽  
Qinghui Wang

A dual-motor coupling-propulsion electric bus (DMCPEB) is modeled, and its optimal control strategy is studied in this paper. The dynamic features of energy loss needed for each subsystem are modeled. A dynamic programming (DP) technique is applied to find the optimal control strategy, including the upshift threshold, downshift threshold, and power split ratio between the main motor and auxiliary motor. Improved control rules are extracted from the DP-based control solution, forming near-optimal control strategies. Simulation results demonstrate that a significant reduction in the running energy loss of the dual-motor coupling-propulsion system (DMCPS) is realized without increasing the frequency of mode switches.
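The DP idea in this abstract can be sketched minimally: choose the drive mode (main motor alone vs. dual-motor coupled) at each step of a speed profile to minimize total energy loss, with a penalty on mode switches. The speed trace, loss curves, and penalty value below are invented for illustration, not the paper's vehicle model.

```python
SPEED = [10, 20, 35, 50, 55, 40, 25, 15]      # km/h, toy drive cycle
MODES = ("single", "dual")
SWITCH_PENALTY = 0.5                           # kJ per mode change, assumed

def loss(mode, v):
    """Assumed loss curves: single-motor mode is efficient at low speed,
    the coupled dual-motor mode at high speed."""
    return 0.002 * v**2 if mode == "single" else 1.5 + 0.0005 * v**2

def optimal_modes():
    n = len(SPEED)
    # cost_to_go[m]: best total loss from this step onward, entering in mode m
    cost_to_go = {m: 0.0 for m in MODES}
    choice = [{} for _ in range(n)]
    for t in reversed(range(n)):               # backward DP over the cycle
        new = {}
        for prev in MODES:
            best = None
            for m in MODES:
                c = (loss(m, SPEED[t])
                     + (SWITCH_PENALTY if m != prev else 0.0)
                     + cost_to_go[m])
                if best is None or c < best[0]:
                    best = (c, m)
            new[prev] = best[0]
            choice[t][prev] = best[1]
        cost_to_go = new
    # roll the optimal decisions forward from an initial single-motor mode
    modes, m = [], "single"
    for t in range(n):
        m = choice[t][m]
        modes.append(m)
    return modes

print(optimal_modes())
```

The DP naturally produces threshold-like behaviour, switching to the coupled mode above roughly 32 km/h here; extracting such thresholds as rules is what turns the DP solution into the near-optimal strategies the paper describes.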


Author(s):  
Philip Odonkor ◽  
Kemper Lewis

In the wake of increasing proliferation of renewable energy and distributed energy resources (DERs), grid designers and operators alike are faced with several emerging challenges in curbing allocative grid inefficiencies and maintaining operational stability. One such challenge relates to the increased price volatility within real-time electricity markets, a result of the inherent intermittency of renewable energy. With this challenge, however, comes heightened economic interest in exploiting the arbitrage potential of price volatility towards demand-side energy cost savings. To this end, this paper aims to maximize the arbitrage value of electricity through the optimal design of control strategies for DERs. Formulated as an arbitrage maximization problem using design optimization, and solved using reinforcement learning, the proposed approach is applied towards shared DERs within multi-building residential clusters. We demonstrate its feasibility across three unique building cluster demand profiles, observing notable energy cost reductions over baseline values. This highlights a capability for generalized learning across multiple building clusters and the ability to design efficient arbitrage policies towards energy cost minimization. Finally, the approach is shown to be computationally tractable, designing efficient strategies in approximately 5 hours of training over a simulation time horizon of 1 month.
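The arbitrage objective can be illustrated with a toy storage problem: operate a shared battery against a volatile real-time price series so it charges when electricity is cheap and discharges when it is expensive. The prices, battery size, and efficiency below are invented placeholders, and the paper's reinforcement-learning formulation is replaced here by an exact dynamic program for clarity.

```python
PRICES = [0.05, 0.04, 0.08, 0.20, 0.30, 0.25, 0.10, 0.06]  # $/kWh, assumed
CAPACITY = 2                   # battery holds 2 units, moves 1 unit per hour
EFFICIENCY = 0.9               # round-trip losses charged on discharge

def best_profit():
    """Backward DP: value[soc] = best profit from now to the end at this SoC."""
    value = [0.0] * (CAPACITY + 1)
    for t in reversed(range(len(PRICES))):
        new = [0.0] * (CAPACITY + 1)
        for soc in range(CAPACITY + 1):
            options = [value[soc]]                        # idle
            if soc < CAPACITY:                            # charge: buy 1 unit
                options.append(-PRICES[t] + value[soc + 1])
            if soc > 0:                                   # discharge: sell 1 unit
                options.append(EFFICIENCY * PRICES[t] + value[soc - 1])
            new[soc] = max(options)
        value = new
    return value[0]    # start with an empty battery

print(round(best_profit(), 3))
```

Here the optimum buys in the two cheapest hours and sells in the two most expensive ones; an RL agent facing an unknown price process has to discover the same buy-low/sell-high structure from experience.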


Author(s):  
Ilan Zohar ◽  
Amit Ailon

This paper presents a simple approach to solving optimal control problems for wheeled mobile robots with bounded inputs. The control objective is to minimize a quadratic performance index subject to differential constraints (the mobile robot's equations of motion). The solution is obtained by utilizing an explicit trajectory parametrization method, which allows a sub-optimal control strategy to be established by minimizing a multivariable function subject to a set of algebraic constraints. The approach is based on the flatness property, which makes it possible to represent the flat output by a polynomial. The bounds on the input signals are taken into account in the analysis.
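The flatness-based parametrization can be sketched as follows: for a differential-drive robot the position (x, y) is a flat output, so a polynomial curve fixing the boundary conditions determines the inputs v and omega algebraically from its derivatives. The particular cubic flat outputs, duration, and input bounds below are illustrative assumptions, not the paper's optimization.

```python
import math

T = 5.0                         # maneuver duration (s), assumed

def flat_outputs(t):
    """Cubic flat outputs: straight progress in x, smoothstep in y."""
    s = t / T
    x = 4.0 * s
    y = 2.0 * (3 * s**2 - 2 * s**3)
    dx = 4.0 / T
    dy = 12.0 * s * (1 - s) / T
    ddx = 0.0
    ddy = 12.0 * (1 - 2 * s) / T**2
    return x, y, dx, dy, ddx, ddy

def inputs(t):
    """Recover v and omega algebraically from the flat-output derivatives."""
    _, _, dx, dy, ddx, ddy = flat_outputs(t)
    v = math.hypot(dx, dy)
    omega = (dx * ddy - dy * ddx) / (dx**2 + dy**2)
    return v, omega

# Check the input bounds over the trajectory, as the analysis requires.
V_MAX, W_MAX = 2.0, 1.0         # input bounds, assumed
samples = [inputs(i * T / 200) for i in range(201)]
v_peak = max(v for v, _ in samples)
w_peak = max(abs(w) for _, w in samples)
print(round(v_peak, 3), round(w_peak, 3), v_peak <= V_MAX and w_peak <= W_MAX)
```

Once the trajectory lives in the polynomial coefficients, the optimal control problem reduces to choosing those coefficients subject to algebraic bound checks like the one above, which is the reduction the paper exploits.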


Author(s):  
Atokolo William ◽  
Akpa Johnson ◽  
Daniel Musa Alih ◽  
Olayemi Kehinde Samuel ◽  
C. E. Mbah Godwin

This work formulates a mathematical model for the control of Zika virus infection using the sterile insect technique (SIT). The model is extended to incorporate an optimal control strategy by introducing three control measures. The optimal control aims to minimize the numbers of exposed humans, infected humans, and mosquitoes in a population, thereby reducing mosquito-to-human and human-to-human contacts and, above all, eliminating the mosquito population. Pontryagin's maximum principle was used to obtain the necessary conditions, derive the optimality system of the model, and solve the control problem. Numerical simulation results show that reductions in the exposed human population, the infected human population, and the entire mosquito population are best achieved using the optimal control strategy.
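The SIT mechanism underlying the model can be illustrated with a one-equation toy: released sterile males dilute wild matings, scaling the effective birth rate by M/(M + S). The logistic growth form and all parameter values below are invented for illustration; this is not the paper's model or its Pontryagin-based optimal control.

```python
def simulate(sterile, b=0.6, d=0.3, k=5000.0, m0=1000.0, days=200, dt=0.1):
    """Euler-integrate dM/dt = b*M*(M/(M+S))*(1 - M/k) - d*M for a
    wild mosquito population M with a maintained sterile stock S."""
    m = m0
    for _ in range(int(days / dt)):
        mating = m / (m + sterile) if m + sterile > 0 else 0.0
        births = b * m * mating * (1.0 - m / k)
        m += dt * (births - d * m)
    return m

no_control = simulate(sterile=0.0)      # settles near its natural level
with_sit = simulate(sterile=3000.0)     # no positive equilibrium: collapse
print(round(no_control), round(with_sit))
```

With a large enough sterile stock the effective birth rate can never balance the death rate, so the wild population is driven to extinction, which is the elimination outcome the optimal control targets.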

