A Case Study on Air Combat Decision Using Approximated Dynamic Programming

2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
Yaofei Ma ◽  
Xiaole Ma ◽  
Xiao Song

Because air combat is a continuous state space problem, it is difficult to solve with traditional dynamic programming (DP) over a discretized state space. This paper studies the approximated dynamic programming (ADP) approach to build a high-performance decision model for one-versus-one air combat. The iterative process for policy improvement is replaced by mass sampling from historical trajectories and utility function approximation, which makes policy improvement highly efficient. A continuous reward function is also constructed to better guide the plane toward the “winner” state from any initial situation. In our experiments, the plane is more offensive when following the policy derived from the ADP approach than when following the baseline Min-Max policy: the “time to win” is greatly reduced, but the cumulative probability of being killed by the enemy is higher. The reason is analyzed in this paper.
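The idea of replacing a discretized DP sweep with sampled states and an approximated utility function can be illustrated on a toy problem. The sketch below is not the authors' air combat model: the 1-D state, the two actions, the goal position, the reward shape, and the Gaussian kernel smoother used as the approximator are all assumptions chosen to keep the example small.

```python
import numpy as np

# All constants below are illustrative assumptions, not values from the paper.
GAMMA = 0.9                      # discount factor
ACTIONS = np.array([-0.1, 0.1])  # move left or right along a 1-D state
GOAL = 0.8                       # hypothetical "winner" position
BANDWIDTH = 0.05                 # width of the Gaussian kernel smoother


def step(x, a):
    """Deterministic toy dynamics on the continuous interval [0, 1]."""
    return np.clip(x + a, 0.0, 1.0)


def reward(x):
    """Continuous reward: larger (closer to zero) nearer the goal."""
    return -np.abs(x - GOAL)


def v_hat(x, anchors, values):
    """Approximate V(x) as a kernel-weighted average of sampled values."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    k = np.exp(-0.5 * ((x[:, None] - anchors[None, :]) / BANDWIDTH) ** 2)
    return (k @ values) / k.sum(axis=1)


def q_values(x, anchors, values):
    """One-step lookahead r(s') + gamma * V(s') for every action."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    nxt = step(x[:, None], ACTIONS[None, :])              # shape (n, 2)
    v_next = v_hat(nxt.ravel(), anchors, values).reshape(nxt.shape)
    return reward(nxt) + GAMMA * v_next


def fit_value_function(n_samples=200, n_iters=100, seed=0):
    """Value iteration restricted to randomly sampled states."""
    rng = np.random.default_rng(seed)
    anchors = rng.uniform(0.0, 1.0, n_samples)  # sampled continuous states
    values = np.zeros(n_samples)
    for _ in range(n_iters):
        values = q_values(anchors, anchors, values).max(axis=1)
    return anchors, values


def greedy_action(x, anchors, values):
    """Policy derived from the approximated value function."""
    return float(ACTIONS[np.argmax(q_values(x, anchors, values)[0])])
```

A kernel averager is used here because averaging approximators keep the backup a contraction, so the iteration settles; the paper's own approximator and sampling scheme may differ.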

Author(s):  
Xiong Wang ◽  
Riheng Jia

Mean field games facilitate the analysis of multi-armed bandits (MAB) with a large number of agents by approximating the agents' interactions with an average effect. Existing mean field models for multi-agent MAB mostly assume a binary reward function, which leads to tractable analysis but is usually not applicable in practical scenarios. In this paper, we study the mean field bandit game with a continuous reward function. Specifically, we focus on deriving the existence and uniqueness of the mean field equilibrium (MFE), thereby guaranteeing the asymptotic stability of the multi-agent system. To accommodate the continuous reward function, we encode the learned reward into an agent state, which is in turn mapped to the agent's stochastic arm-playing policy and updated using realized observations. We show that the state evolution is upper semi-continuous, from which the existence of the MFE follows. Because Markov analysis applies mainly to the discrete-state case, we transform the stochastic continuous state evolution into a deterministic ordinary differential equation (ODE). On this basis, we characterize a contraction mapping for the ODE to ensure a unique MFE for the bandit game. Extensive evaluations validate our MFE characterization and exhibit tight empirical regret for the MAB problem.


2019 ◽  
Vol 2019 ◽  
pp. 1-8
Author(s):  
Xi-liang Chen ◽  
Lei Cao ◽  
Zhi-xiong Xu ◽  
Jun Lai ◽  
Chen-xi Li

Inverse reinforcement learning (IRL) assumes that demonstrations come from an agent acting optimally in an environment. Most past work on IRL needed to compute optimal policies for different reward functions, a requirement that is difficult to satisfy in tasks with large or continuous state spaces, let alone continuous action spaces. We propose a continuous maximum entropy deep inverse reinforcement learning algorithm for continuous state and continuous action spaces. It builds a deep understanding of the environment model by reconstructing the reward function from the demonstrations, and it uses a demonstration-based hot start mechanism to make training faster and better. We compare the new approach with well-known algorithms, including Maximum Entropy IRL, DDPG, and hot start DDPG. Empirical results on the classical control environment MountainCarContinuous-v0 from OpenAI Gym show that our approach learns policies faster and better.


Author(s):  
Yanan Zhou ◽  
Yaofei Ma ◽  
Xiao Song ◽  
Guanghong Gong

Value function approximation plays an important role in reinforcement learning (RL) with continuous state space, which is widely used to build decision models in practice. Many traditional approaches require experienced designers to manually specify the form of the approximating function, leading to a rigid, non-adaptive representation of the value function. To address this problem, this paper proposes a novel Q-value function approximation method named ‘Hierarchical fuzzy Adaptive Resonance Theory’ (HiART). HiART is based on the Fuzzy ART method and is an adaptive classification network that learns to segment the state space by classifying the training input automatically. HiART begins with a highly generalized structure in which the number of category nodes is limited, which speeds up learning at the early stage. The network is then refined gradually by creating attached sub-networks, forming a layered network structure in the process. With this adaptive structure, HiART reduces the dependence on expert experience when designing the network parameters. The effectiveness and adaptivity of HiART are demonstrated on the Mountain Car benchmark problem, with both fast learning speed and low computation time. Finally, a simulated one-versus-one air combat decision problem illustrates the applicability of HiART.
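How an adaptive classification network can segment a state space without a hand-specified structure is easier to see in the base Fuzzy ART method that HiART builds on. The sketch below is a minimal version of standard Fuzzy ART (complement coding, choice function, vigilance test, and weight update); it does not implement HiART's hierarchical sub-networks or its Q-value layer, and the class name and default parameter values are assumptions made for illustration.

```python
import numpy as np


class FuzzyART:
    """Minimal Fuzzy ART categorizer for inputs scaled to [0, 1].

    rho   : vigilance, higher values create more, finer categories
    alpha : small choice parameter that breaks ties toward tighter categories
    beta  : learning rate (1.0 is "fast learning")
    """

    def __init__(self, rho=0.75, alpha=0.001, beta=1.0):
        self.rho, self.alpha, self.beta = rho, alpha, beta
        self.weights = []  # one weight vector per category node

    def train(self, a):
        """Present one input, update the network, return its category index."""
        a = np.asarray(a, dtype=float)
        i = np.concatenate([a, 1.0 - a])  # complement coding
        # Choice function T_j = |I ^ w_j| / (alpha + |w_j|), ^ is element-wise min.
        scores = [np.minimum(i, w).sum() / (self.alpha + w.sum())
                  for w in self.weights]
        # Try categories from the highest choice value downward.
        for j in np.argsort(scores)[::-1]:
            w = self.weights[j]
            match = np.minimum(i, w).sum() / i.sum()  # vigilance test
            if match >= self.rho:
                # Resonance: move the winning weights toward I ^ w.
                self.weights[j] = (self.beta * np.minimum(i, w)
                                   + (1.0 - self.beta) * w)
                return int(j)
        # No existing category is close enough: create a new node.
        self.weights.append(i.copy())
        return len(self.weights) - 1
```

For example, presenting `[0.1, 0.1]`, `[0.12, 0.11]`, and `[0.9, 0.9]` in that order with the default vigilance places the first two inputs in one category and the third in a new one. Lowering `rho` gives the coarse, highly generalized partition the abstract describes for the early learning stage.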


2016 ◽  
Vol 11 (9) ◽  
pp. 764
Author(s):  
Lella Aicha Ayadi ◽  
Nihel Neji ◽  
Hassen Loukil ◽  
Mouhamed Ali Ben Ayed ◽  
Nouri Masmoudi

Processes ◽  
2021 ◽  
Vol 9 (6) ◽  
pp. 1034
Author(s):  
Ching-Chien Huang ◽  
Chin-Chieh Mo ◽  
Guan-Ming Chen ◽  
Hsiao-Hsuan Hsu ◽  
Guo-Jiun Shu

In this work, an experiment was carried out to investigate the preparation conditions of anisotropic, Fe-deficient, M-type Sr ferrite with optimum magnetic and physical properties by changing experimental parameters, such as the La substitution amount and small additive modifications during the fine milling process. The compositions of the calcined ferrites were chosen according to the stoichiometry LaxSr1-xFe12-2xO19, where M-type single-phase calcined powder was synthesized with a composition of x = 0.30. The effect of CaCO3, SiO2, and Co3O4 inter-additives on the Sr ferrite was also discussed in order to obtain low-temperature sintered magnets. The magnetic properties of Br = 4608 Gauss, bHc = 3650 Oe, iHc = 3765 Oe, and (BH)max = 5.23 MGOe were obtained for Sr ferrite hard magnets with a low cobalt content of 1.7 wt%, which will eventually be used as high-end permanent magnets for high-efficiency motor applications in automobiles requiring Br > 4600 ± 50 G and iHc > 3600 ± 50 Oe.


2021 ◽  
Vol 12 (11) ◽  
pp. 1692-1699
Author(s):  
Ji Hye Lee ◽  
Jinhyo Hwang ◽  
Chai Won Kim ◽  
Amit Kumar Harit ◽  
Han Young Woo ◽  
...  

New polystyrene-based polymers with high π-extended hole transport pendants were synthesized to obtain a low turn-on voltage and high efficiency in solution-processed green TADF-OLEDs.


Energies ◽  
2021 ◽  
Vol 14 (12) ◽  
pp. 3716
Author(s):  
Francesco Causone ◽  
Rossano Scoccia ◽  
Martina Pelle ◽  
Paola Colombo ◽  
Mario Motta ◽  
...  

Cities and nations worldwide are committing to energy and carbon neutrality objectives that imply a huge contribution from buildings. High-performance targets, either zero energy or zero carbon, are typically difficult for single buildings to reach, but groups of properly managed buildings might reach these ambitious goals. For this purpose, we need tools and experience to model, monitor, manage, and optimize buildings and their neighborhood-level systems. The paper describes the activities pursued for the deployment of an advanced energy management system for a multi-carrier energy grid in an existing neighborhood in the Milan area. The activities included: (i) development of a detailed monitoring plan, (ii) deployment of the monitoring plan, and (iii) development of a virtual model of the neighborhood and simulation of its energy performance. Comparisons against early-stage energy monitoring data proved promising, and the generation system showed high efficiency (EER equal to 5.84), to be further exploited.


2021 ◽  
Vol 7 (10) ◽  
pp. eabe8130
Author(s):  
Shangshang Chen ◽  
Xun Xiao ◽  
Hangyu Gu ◽  
Jinsong Huang

Perovskite-based electronic materials and devices such as perovskite solar cells (PSCs) have notoriously bad reproducibility, which greatly impedes both fundamental understanding of their intrinsic properties and real-world applications. Here, we report that organic iodide perovskite precursors can be oxidized to I2 even for carefully sealed precursor powders or solutions, which markedly deteriorates the performance and reproducibility of PSCs. Adding benzylhydrazine hydrochloride (BHC) as a reductant into degraded precursor solutions can effectively reduce the detrimental I2 back to I−, accompanied by a substantial reduction of I3−-induced charge traps in the films. BHC residuals in perovskite films further stabilize the PSCs under operation conditions. BHC improves the stabilized efficiency of the blade-coated p-i-n structure PSCs to a record value of 23.2% (22.62 ± 0.40% certified by National Renewable Energy Laboratory), and the high-efficiency devices have a very high yield. A stabilized aperture efficiency of 18.2% is also achieved on a 35.8-cm2 mini-module.

