Approximate Dynamic Programming for Military Medical Evacuation Dispatching Policies

Author(s):  
Phillip R. Jenkins ◽  
Matthew J. Robbins ◽  
Brian J. Lunday

Military medical planners must consider how aerial medical evacuation (MEDEVAC) assets will be dispatched when preparing for and supporting high-intensity combat operations. The dispatching authority seeks to dispatch MEDEVAC assets to prioritized requests for service such that battlefield casualties are effectively and efficiently transported to nearby medical-treatment facilities. We formulate and solve a discounted, infinite-horizon Markov decision process (MDP) model of the MEDEVAC dispatching problem. Because the high dimensionality and uncountable state space of our MDP model render classical dynamic programming solution methods intractable, we instead apply approximate dynamic programming (ADP) solution methods to produce high-quality dispatching policies relative to the currently practiced closest-available dispatching policy. We develop, test, and compare two distinct ADP solution techniques, both of which utilize an approximate policy iteration (API) algorithmic framework. The first algorithm uses least-squares temporal differences (LSTD) learning for policy evaluation, whereas the second algorithm uses neural network (NN) learning. We construct a notional yet representative planning scenario based on high-intensity combat operations in southern Azerbaijan to demonstrate the applicability of our MDP model and to compare the efficacies of our proposed ADP solution techniques. We generate 30 problem instances via a designed experiment to examine how selected problem and algorithmic features affect the quality of the solutions attained by our ADP policies. Results show that the policies determined by the NN-API and LSTD-API algorithms significantly outperform the closest-available benchmark policy in 27 (90%) and 24 (80%) of the problem instances examined, respectively. Moreover, the NN-API policies significantly outperform the LSTD-API policies in every problem instance examined. Compared with the closest-available policy for the baseline problem instance, the NN-API policy decreases the average response time for urgent (i.e., life-threatening) requests by 39 minutes. These research models, methodologies, and results inform the implementation and modification of current and future MEDEVAC tactics, techniques, and procedures, as well as the design and purchase of future aerial MEDEVAC assets.
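The LSTD policy-evaluation step named in the abstract can be sketched as follows. This is a minimal illustration of generic LSTD, not the authors' implementation: the function name `lstd_weights`, the feature map `phi`, the sample format, and the small ridge term are all assumptions made for the sketch.

```python
import numpy as np

def lstd_weights(samples, phi, gamma=0.95):
    """LSTD policy evaluation: fit weights w with V(s) ~= phi(s) @ w
    from transitions (s, r, s_next) collected under a fixed policy."""
    k = phi(samples[0][0]).shape[0]
    A = np.zeros((k, k))
    b = np.zeros(k)
    for s, r, s_next in samples:
        f, f_next = phi(s), phi(s_next)
        A += np.outer(f, f - gamma * f_next)  # accumulate phi (phi - gamma phi')^T
        b += r * f
    # Tiny ridge term keeps the solve well-posed when features are sparse.
    return np.linalg.solve(A + 1e-8 * np.eye(k), b)
```

Within an API loop, the fitted value function would then drive a policy-improvement step over the dispatching actions.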

Author(s):  
Tohid Sardarmehni ◽  
Ali Heydari

Approximate dynamic programming, also known as reinforcement learning, is applied to the optimal control of antilock brake systems (ABS) in ground vehicles. As an accurate and control-oriented model of the brake system, a quarter-vehicle model with a hydraulic brake system is selected. Because of the switching nature of the ABS hydraulic brake system, an optimal switching solution is generated by minimizing a performance index that penalizes the braking distance and forces the vehicle velocity to zero while preventing wheel lock-up. Toward this objective, a value iteration algorithm is selected for 'learning' the infinite-horizon solution. Artificial neural networks, as powerful function approximators, are utilized for approximating the value function. The training is conducted offline using least squares. Once trained, the converged neural network is used to determine optimal decisions for the actuators on the fly. Numerical simulations show that this approach is very promising while having a low real-time computational burden; hence, it outperforms many existing solutions in the literature.
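The offline value-iteration-with-least-squares scheme described above can be sketched on a toy problem. The 1-D system below is a stand-in for the quarter-vehicle model, and the network (fixed random hidden layer, output layer fit by least squares each sweep) is an assumption chosen so that every training step is a linear solve; none of the dynamics, costs, or sizes come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D system standing in for the quarter-vehicle brake model:
# x' = clip(x + dt*u), a finite switching set of control actions,
# and a cost that penalizes distance from the target state x = 0.
DT, GAMMA = 0.1, 0.9
ACTIONS = np.array([-1.0, 0.0, 1.0])

def step(x, u):
    return np.clip(x + DT * u, -1.0, 1.0)

def cost(x, u):
    return x ** 2 + 0.01 * u ** 2

# One-hidden-layer network with fixed random hidden weights, so each
# value-iteration sweep trains only the output layer by least squares.
W_in = rng.normal(size=(1, 32))
b_in = rng.normal(size=32)

def features(x):
    return np.tanh(np.atleast_2d(x).T @ W_in + b_in)

xs = np.linspace(-1.0, 1.0, 201)   # sampled states for offline training
w_out = np.zeros(32)
for _ in range(100):               # value-iteration sweeps
    q = np.stack([cost(xs, u) + GAMMA * (features(step(xs, u)) @ w_out)
                  for u in ACTIONS])
    targets = q.min(axis=0)        # Bellman backup at each sampled state
    w_out, *_ = np.linalg.lstsq(features(xs), targets, rcond=None)

def policy(x):
    """Greedy switching decision computed on the fly from the trained net."""
    q = [cost(x, u) + GAMMA * float(features(step(x, u))[0] @ w_out)
         for u in ACTIONS]
    return float(ACTIONS[int(np.argmin(q))])
```

Once the sweeps converge, only the cheap greedy lookup in `policy` runs online, which mirrors the low real-time burden claimed in the abstract.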


2012 ◽  
Vol 26 (4) ◽  
pp. 581-591 ◽  
Author(s):  
D. Roubos ◽  
S. Bhulai

We consider the problem of dynamic multi-skill routing in call centers. Calls from different customer classes are offered to the call center according to a Poisson process. The agents are grouped into pools according to their heterogeneous skill sets that determine the calls that they can handle. Each pool of agents serves calls with independent exponentially distributed service times. Arriving calls that cannot be served directly are placed in a buffer that is dedicated to the customer class. We obtain nearly optimal dynamic routing policies that scale well with the problem instance and can be computed online. The algorithm is based on approximate dynamic programming techniques. In particular, we perform one-step policy improvement using a polynomial approximation to relative value functions. We compare the performance of this method with decomposition techniques. Numerical experiments demonstrate that our method outperforms leading routing policies and has close to optimal performance.
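The one-step policy improvement described above can be sketched generically: approximate each pool's relative value function by a polynomial, then route an arriving call to the eligible pool whose value increases the least. The pool names, skill sets, and quadratic coefficients below are illustrative placeholders, not values fitted as in the paper.

```python
# Illustrative quadratic approximations V_p(n) ~= a_p*n**2 + b_p*n to the
# relative value of having n calls at pool p under a simple base policy.
COEFFS = {"pool_A": (0.5, 1.0), "pool_B": (0.2, 2.0)}
# Hypothetical skill sets: which pools can serve each customer class.
SKILLS = {"class_1": ["pool_A", "pool_B"], "class_2": ["pool_B"]}

def v(pool, n):
    a, b = COEFFS[pool]
    return a * n * n + b * n

def route(call_class, occupancy):
    """One-step policy improvement: route an arriving call to the eligible
    pool whose approximate relative value increases the least."""
    eligible = SKILLS[call_class]
    return min(eligible, key=lambda p: v(p, occupancy[p] + 1) - v(p, occupancy[p]))
```

Because each decision only compares a handful of polynomial evaluations, the improved policy can be computed online, as the abstract notes.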


2020 ◽  
Vol 124 ◽  
pp. 105032
Author(s):  
Ying Chen ◽  
Feng Liu ◽  
Jay M. Rosenberger ◽  
Victoria C.P. Chen ◽  
Asama Kulvanitchaiyanunt ◽  
...  

Algorithms ◽  
2021 ◽  
Vol 14 (6) ◽  
pp. 187
Author(s):  
Aaron Barbosa ◽  
Elijah Pelofske ◽  
Georg Hahn ◽  
Hristo N. Djidjev

Quantum annealers, such as the device built by D-Wave Systems, Inc., offer a way to compute solutions of NP-hard problems that can be expressed in Ising or quadratic unconstrained binary optimization (QUBO) form. Although such solutions are typically of very high quality, problem instances are usually not solved to optimality due to imperfections of current-generation quantum annealers. In this contribution, we aim to understand some of the factors contributing to the hardness of a problem instance and to use machine learning models to predict the accuracy of the D-Wave 2000Q annealer for solving specific problems. We focus on the maximum clique problem, a classic NP-hard problem with important applications in network analysis, bioinformatics, and computational chemistry. By training a machine learning classification model on basic problem characteristics, such as the number of edges in the graph, and annealing parameters, such as the D-Wave chain strength, we are able to rank certain features in order of their contribution to the solution hardness, and we present a simple decision tree that predicts whether a problem will be solvable to optimality with the D-Wave 2000Q. We extend these results by training a machine learning regression model that predicts the clique size found by the D-Wave annealer.
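The kind of simple decision tree mentioned above can be sketched with a hand-rolled depth-1 tree (a decision stump) trained on a single feature. The feature choice (graph edge density), the data values, and the learned threshold are toy assumptions for illustration; they are not the paper's features or results.

```python
def fit_stump(feature, labels):
    """Depth-1 decision tree: exhaustively search for the threshold (and
    direction) that best separates solvable from unsolvable instances."""
    best = (None, True, 0.0)  # (threshold, predict_true_if_below, accuracy)
    for t in sorted(set(feature)):
        for below in (True, False):
            preds = [(x <= t) == below for x in feature]
            acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
            if acc > best[2]:
                best = (t, below, acc)
    return best

def predict(x, threshold, below):
    return (x <= threshold) == below

# Toy stand-in data: edge density of the problem graph vs. whether the
# annealer reached the optimal clique (values are illustrative only).
density = [0.10, 0.15, 0.25, 0.30, 0.55, 0.60, 0.75, 0.90]
solved = [True, True, True, True, False, False, False, False]
threshold, below, accuracy = fit_stump(density, solved)
```

A deeper tree over several features (edge count, chain strength, and so on) follows the same recipe of recursively picking the best separating split.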

