Constrained Markov Decision Process: Latest Research Papers

A UoI-Optimal Policy for Timely Status Updates with Resource Constraint

Entropy ◽

10.3390/e23081084 ◽

2021 ◽

Vol 23 (8) ◽

pp. 1084

Author(s):

Lehan Wang ◽

Jingzhou Sun ◽

Yuxuan Sun ◽

Sheng Zhou ◽

Zhisheng Niu

Keyword(s):

Optimal Policy ◽

Estimation Error ◽

Autonomous Driving ◽

Resource Constraint ◽

Context Aware ◽

Scheduling Policy ◽

Industrial Internet ◽

Markov Decision ◽

Advanced Knowledge ◽

Constrained Markov Decision Process

Timely status updates are critical in remote control systems such as autonomous driving and the industrial Internet of Things, where timeliness requirements are usually context dependent. Accordingly, the Urgency of Information (UoI) has been proposed as an extension of the well-known Age of Information (AoI) that further includes context-aware weights indicating whether the monitored process is in an emergency. However, the optimal updating and scheduling strategies in terms of UoI remain open. In this paper, we propose a UoI-optimal updating policy for timely status information under a resource constraint. We first formulate the problem as a constrained Markov decision process and prove that the UoI-optimal policy has a threshold structure. When the context-aware weights are known, we propose a numerical method based on linear programming. When the weights are unknown, we further design a reinforcement learning (RL)-based scheduling policy. The simulation reveals that the threshold of the UoI-optimal policy increases as the resource constraint tightens. In addition, the UoI-optimal policy outperforms the AoI-optimal policy in terms of average squared estimation error, and the proposed RL-based updating policy achieves near-optimal performance without advance knowledge of the system model.
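The link between a threshold policy and the resource budget can be illustrated with a deliberately simplified sketch. This is not the paper's UoI model: it assumes a plain age counter that grows by one per slot, an update that always succeeds and resets the age to 1, and a budget defined as the maximum long-run fraction of slots in which an update may be sent. The function names are hypothetical.

```python
import math


def simulate_threshold_policy(threshold, steps):
    """Simulate a toy age process under a threshold policy.

    Assumption: the age grows by 1 each slot and resets to 1 right after
    an update; an update is sent whenever the age reaches `threshold`.
    Returns (average age, fraction of slots with an update).
    """
    age = 1
    total_age = 0
    updates = 0
    for _ in range(steps):
        total_age += age
        if age >= threshold:
            updates += 1
            age = 1  # update delivered, age resets
        else:
            age += 1
    return total_age / steps, updates / steps


def smallest_feasible_threshold(budget):
    """Smallest integer threshold whose update rate 1/threshold fits the budget.

    In this toy model a threshold of k sends one update every k slots,
    so the constraint 1/k <= budget gives k = ceil(1/budget).
    """
    return math.ceil(1 / budget)


if __name__ == "__main__":
    for budget in (0.5, 0.25):
        k = smallest_feasible_threshold(budget)
        avg_age, rate = simulate_threshold_policy(k, steps=8)
        print(budget, k, avg_age, rate)
```

In this toy setting, halving the budget from 0.5 to 0.25 moves the threshold from 2 to 4 and raises the average age, which mirrors the qualitative trend the abstract reports. An optimal CMDP policy may also randomize between two neighbouring thresholds, which this deterministic sketch ignores.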

Policy Learning with Constraints in Model-free Reinforcement Learning: A Survey

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/614 ◽

2021 ◽

Author(s):

Yongshuai Liu ◽

Avishai Halev ◽

Xin Liu

Keyword(s):

Reinforcement Learning ◽

Future Research ◽

Physical Systems ◽

Policy Performance ◽

Model Free ◽

Open Questions ◽

Markov Decision ◽

Pros And Cons ◽

Constrained Markov Decision Process ◽

Limit Resource

Reinforcement Learning (RL) algorithms have had tremendous success in simulated domains. These algorithms, however, often cannot be directly applied to physical systems, especially in cases where there are constraints to satisfy (e.g. to ensure safety or limit resource consumption). In standard RL, the agent is incentivized to explore any policy with the sole goal of maximizing reward; in the real world, however, ensuring satisfaction of certain constraints in the process is also necessary and essential. In this article, we overview existing approaches addressing constraints in model-free reinforcement learning. We model the problem of learning with constraints as a Constrained Markov Decision Process and consider two main types of constraints: cumulative and instantaneous. We summarize existing approaches and discuss their pros and cons. To evaluate policy performance under constraints, we introduce a set of standard benchmarks and metrics. We also summarize limitations of current methods and present open questions for future research.
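For orientation, a common way to write a CMDP with the two constraint types named above is sketched here. The notation (reward r, cost functions c_i and c_j, limits d_i and \epsilon_j, discount factor \gamma) is an assumption chosen for illustration and may differ from the survey's own.

```latex
\begin{aligned}
\max_{\pi}\quad & J_r(\pi) = \mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\Big] \\
\text{s.t.}\quad & J_{c_i}(\pi) = \mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty} \gamma^{t}\, c_i(s_t, a_t)\Big] \le d_i,
  \qquad i = 1, \dots, m \quad \text{(cumulative)} \\
& c_j(s_t, a_t) \le \epsilon_j \quad \text{for all } t \quad \text{(instantaneous)}
\end{aligned}
```

A cumulative constraint bounds an expectation over the whole trajectory, whereas an instantaneous constraint must hold at every step.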

Constrained Markov Decision Process Modeling for Optimal Sensing of Cardiac Events in Mobile Health

IEEE Transactions on Automation Science and Engineering ◽

10.1109/tase.2021.3052483 ◽

2021 ◽

pp. 1-13

Author(s):

Bing Yao ◽

Yun Chen ◽

Hui Yang

Keyword(s):

Markov Decision Process ◽

Mobile Health ◽

Process Modeling ◽

Decision Process ◽

Cardiac Events ◽

Markov Decision ◽

Constrained Markov Decision Process

Controllable Summarization with Constrained Markov Decision Process

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00423 ◽

2021 ◽

Vol 9 ◽

pp. 1213-1232

Author(s):

Hou Pong Chan ◽

Lu Wang ◽

Irwin King

Keyword(s):

Markov Decision Process ◽

Decision Process ◽

Gain Control ◽

Text Summarization ◽

Reward Function ◽

Markov Decision ◽

Constrained Markov Decision Process

We study controllable text summarization, which allows users to control a particular attribute (e.g., length limit) of the generated summaries. In this work, we propose a novel training framework based on Constrained Markov Decision Process (CMDP), which conveniently includes a reward function along with a set of constraints, to facilitate better summarization control. The reward function encourages the generation to resemble the human-written reference, while the constraints are used to explicitly prevent the generated summaries from violating user-imposed requirements. Our framework can be applied to control important attributes of summarization, including length, covered entities, and abstractiveness, as we devise specific constraints for each of these aspects. Extensive experiments on popular benchmarks show that our CMDP framework helps generate informative summaries while complying with a given attribute’s requirement.
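One generic way to combine a reward with a constraint during training is Lagrangian relaxation with a multiplier that is adjusted by dual ascent. The sketch below shows that generic technique for a length limit. It is an assumption for illustration, not the training procedure of the paper, and all names (`length_cost`, `penalized_reward`, `dual_ascent_step`) are hypothetical.

```python
def length_cost(summary_tokens, length_limit):
    """Hypothetical cost: number of tokens by which the summary exceeds the limit."""
    return max(0, len(summary_tokens) - length_limit)


def penalized_reward(reward, observed_cost, cost_limit, multiplier):
    """Lagrangian objective for one sample: reward minus the weighted constraint violation."""
    return reward - multiplier * (observed_cost - cost_limit)


def dual_ascent_step(multiplier, observed_cost, cost_limit, step_size):
    """Projected gradient step on the multiplier.

    The multiplier grows while the constraint is violated, shrinks while
    there is slack, and is clipped so that it never becomes negative.
    """
    return max(0.0, multiplier + step_size * (observed_cost - cost_limit))


if __name__ == "__main__":
    summary = ["tok"] * 12           # a 12-token summary
    cost = length_cost(summary, 10)  # 2 tokens over a limit of 10
    lam = dual_ascent_step(0.0, cost, cost_limit=0, step_size=0.5)
    print(cost, lam, penalized_reward(1.0, cost, 0, lam))
```

The multiplier acts as an automatically tuned penalty weight: it rises only as long as generated summaries keep breaking the limit, so the reward term dominates again once the constraint is met.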

Adaptive Virtual Resource Allocation in 5G Network Slicing Using Constrained Markov Decision Process

IEEE Access ◽

10.1109/access.2018.2876544 ◽

2018 ◽

Vol 6 ◽

pp. 61184-61195 ◽

Cited By ~ 10

Author(s):

Lun Tang ◽

Qi Tan ◽

Yingjie Shi ◽

Chenmeng Wang ◽

Qianbin Chen

Keyword(s):

Resource Allocation ◽

Markov Decision Process ◽

Decision Process ◽

Network Slicing ◽

5G Network ◽

Markov Decision ◽

Constrained Markov Decision Process

Constrained Markov Decision Process Modeling for Sequential Optimization of Additive Manufacturing Build Quality

IEEE Access ◽

10.1109/access.2018.2872391 ◽

2018 ◽

Vol 6 ◽

pp. 54786-54794 ◽

Cited By ~ 3

Author(s):

Bing Yao ◽

Hui Yang

Keyword(s):

Additive Manufacturing ◽

Markov Decision Process ◽

Process Modeling ◽

Decision Process ◽

Sequential Optimization ◽

Markov Decision ◽

Constrained Markov Decision Process

Improving Real-Time Bidding Using a Constrained Markov Decision Process

Advanced Data Mining and Applications - Lecture Notes in Computer Science ◽

10.1007/978-3-319-69179-4_50 ◽

2017 ◽

pp. 711-726 ◽

Cited By ~ 3

Author(s):

Manxing Du ◽

Redouane Sassioui ◽

Georgios Varisteas ◽

Radu State ◽

Mats Brorsson ◽

...

Keyword(s):

Real Time ◽

Markov Decision Process ◽

Decision Process ◽

Markov Decision ◽

Constrained Markov Decision Process

A Constrained Markov Decision Process for Flight Safety Assessment and Management

AIAA Infotech @ Aerospace ◽

10.2514/6.2015-0115 ◽

2015 ◽

Cited By ~ 4

Author(s):

Sweewarman Balachandran ◽

Ella M. Atkins

Keyword(s):

Markov Decision Process ◽

Decision Process ◽

Safety Assessment ◽

Flight Safety ◽

Markov Decision ◽

Constrained Markov Decision Process

Balancing Long Lifetime and Satisfying Fairness in WBAN Using a Constrained Markov Decision Process

International Journal of Antennas and Propagation ◽

10.1155/2015/657854 ◽

2015 ◽

Vol 2015 ◽

pp. 1-10 ◽

Cited By ~ 5

Author(s):

Yingqi Yin ◽

Fengye Hu ◽

Ling Cen ◽

Yu Du ◽

Lu Wang

Keyword(s):

Markov Decision Process ◽

Network Lifetime ◽

Decision Process ◽

Scheduling Algorithm ◽

Wireless Body Area Network ◽

Sensor Nodes ◽

Distributed Scheduling ◽

Area Network ◽

Markov Decision ◽

Constrained Markov Decision Process

As an important part of the Internet of Things (IoT) and a special case of device-to-device (D2D) communication, the wireless body area network (WBAN) has gradually become a focus of attention. Because a WBAN is a body-centered network, the energy of its sensor nodes is strictly limited, as they are powered by batteries of limited capacity. In each data collection, only one sensor node is scheduled to transmit its measurements directly to the access point (AP) through the fading channel. We formulate the problem of dynamically choosing which sensor should communicate with the AP, so as to maximize network lifetime under a fairness constraint, as a constrained Markov decision process (CMDP). The optimal lifetime and optimal policy are obtained from the Bellman equation in dynamic programming. The proposed algorithm defines the limiting performance in WBAN lifetime under different degrees of fairness constraints. Because acquiring global channel state information (CSI) incurs a large implementation overhead, we put forward a distributed scheduling algorithm that uses only local CSI, which reduces the network overhead and simplifies the algorithm. Simulations demonstrate that this scheduling algorithm allocates time slots reasonably under different channel conditions to balance network lifetime and fairness.

Interference management for cognitive radio systems exploiting primary IR-HARQ: A Constrained Markov Decision Process approach

2012 Conference Record of the Forty Sixth Asilomar Conference on Signals, Systems and Computers (ASILOMAR) ◽

10.1109/acssc.2012.6489349 ◽

2012 ◽

Cited By ~ 10

Author(s):

Romain Tajan ◽

Charly Poulliat ◽

Inbar Fijalkow

Keyword(s):

Cognitive Radio ◽

Markov Decision Process ◽

Decision Process ◽

Interference Management ◽

Process Approach ◽

Cognitive Radio Systems ◽

Markov Decision ◽

Radio Systems ◽

Constrained Markov Decision Process
