Adaptive aggregation methods for infinite horizon dynamic programming

1989, Vol 34 (6), pp. 589-598
Author(s): D.P. Bertsekas, D.A. Castanon

Author(s): Tohid Sardarmehni, Ali Heydari

Approximate dynamic programming, also known as reinforcement learning, is applied to the optimal control of antilock brake systems (ABS) in ground vehicles. A quarter-vehicle model with a hydraulic brake system is selected as an accurate, control-oriented model of the brake system. Because the hydraulic brake system of an ABS is switching in nature, an optimal switching solution is generated by minimizing a performance index that penalizes the braking distance and drives the vehicle velocity to zero while preventing wheel lock-up. To this end, a value iteration algorithm is selected for 'learning' the infinite horizon solution. Artificial neural networks, as powerful function approximators, are used to approximate the value function. The training is conducted offline using least squares. Once trained, the converged neural network is used to determine optimal decisions for the actuators on the fly. Numerical simulations show that this approach is very promising and has a low real-time computational burden, and hence outperforms many existing solutions in the literature.
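The offline value iteration with a least-squares fit described in this abstract can be sketched on a toy problem. The following is a minimal illustration only, not the authors' method: it swaps the neural network for a single quadratic feature, and the scalar two-mode system, the cost weights, and every name in it (`A`, `R`, `Q`, `fitted_value_iteration`, `greedy_mode`) are assumptions made up for the example rather than the quarter-vehicle model.

```python
# Toy fitted value iteration for a scalar switched system.
# All numbers and names are illustrative assumptions, not taken from the paper.

A = [0.9, 0.5]   # hypothetical mode dynamics: x_next = A[m] * x
R = [0.0, 1.0]   # hypothetical extra cost weight for using mode m
Q = 1.0          # hypothetical state penalty weight


def stage_cost(x, m):
    """Quadratic running cost of applying mode m in state x."""
    return (Q + R[m]) * x * x


def feature(x):
    """Single quadratic feature, so the approximation is V(x) ~ w * x^2."""
    return x * x


def fit_least_squares(xs, targets):
    """Closed-form least-squares weight for a one-feature linear model."""
    num = sum(feature(x) * y for x, y in zip(xs, targets))
    den = sum(feature(x) ** 2 for x in xs)
    return num / den


def fitted_value_iteration(xs, sweeps=60):
    """Offline training: repeat Bellman backups on samples, then refit w."""
    w = 0.0
    for _ in range(sweeps):
        targets = [
            min(stage_cost(x, m) + w * feature(A[m] * x) for m in range(len(A)))
            for x in xs
        ]
        w = fit_least_squares(xs, targets)
    return w


def greedy_mode(x, w):
    """Online decision: pick the mode minimizing cost-to-go under the fitted V."""
    return min(
        range(len(A)),
        key=lambda m: stage_cost(x, m) + w * feature(A[m] * x),
    )


xs = [i / 10 for i in range(-20, 21)]   # sampled training states
w = fitted_value_iteration(xs)
```

After training, `greedy_mode(x, w)` needs only one cost evaluation per mode, which is the sense in which the online computational burden stays low in a scheme of this kind.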


Author(s): Hans Fehr, Fabian Kindermann

In this chapter we apply the principles of dynamic programming to some standard macroeconomic models. For now we stay in the world of infinite horizon models, which, as in the previous chapter, are populated by one or several households with an infinite planning horizon. There are several justifications for such an assumption. Besides simplicity, altruism is probably the most famous argument in favour of infinite horizon models. Assume that in a period t there is one generation that dies with certainty after this period. The utility of this generation from its own consumption is u(·). Yet, each generation is altruistic towards its descendants. Consequently, total utility of the generation is U_t = u(·) + βU_{t+1}, where β ≤ 1 can be interpreted as the degree of altruism. All generations together then form a dynasty.
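To see why the altruism recursion yields an infinite planning horizon, it can be unrolled by repeated substitution. The sketch below writes the argument of u as c_{t+s}, the consumption of the generation alive in period t+s; this notation is an assumption, since the abstract only writes u(·). The limit condition in the second line is likewise an assumption: it holds, for example, when β < 1 and utility is bounded, and it is not guaranteed at β = 1.

```latex
\begin{align*}
U_t &= u(c_t) + \beta U_{t+1}
     = u(c_t) + \beta u(c_{t+1}) + \beta^2 U_{t+2}
     = \sum_{s=0}^{T-1} \beta^s u(c_{t+s}) + \beta^T U_{t+T}, \\
U_t &= \sum_{s=0}^{\infty} \beta^s u(c_{t+s})
     \quad \text{if } \lim_{T \to \infty} \beta^T U_{t+T} = 0 .
\end{align*}
```

Under that condition, the dynasty behaves like a single household that discounts future utility at rate β over an infinite horizon.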

