A UoI-Optimal Policy for Timely Status Updates with Resource Constraint

Entropy ◽  
2021 ◽  
Vol 23 (8) ◽  
pp. 1084
Author(s):  
Lehan Wang ◽  
Jingzhou Sun ◽  
Yuxuan Sun ◽  
Sheng Zhou ◽  
Zhisheng Niu

Timely status updates are critical in remote control systems such as autonomous driving and the industrial Internet of Things, where timeliness requirements are usually context dependent. Accordingly, the Urgency of Information (UoI) has been proposed as an extension of the well-known Age of Information (AoI), adding context-aware weights that indicate whether the monitored process is in an emergency. However, the optimal updating and scheduling strategies in terms of UoI remain open. In this paper, we propose a UoI-optimal updating policy for timely status information under a resource constraint. We first formulate the problem as a constrained Markov decision process and prove that the UoI-optimal policy has a threshold structure. When the context-aware weights are known, we propose a numerical method based on linear programming. When the weights are unknown, we further design a reinforcement learning (RL)-based scheduling policy. The simulation reveals that the threshold of the UoI-optimal policy increases as the resource constraint tightens. In addition, the UoI-optimal policy outperforms the AoI-optimal policy in terms of average squared estimation error, and the proposed RL-based updating policy achieves near-optimal performance without advance knowledge of the system model.
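The threshold structure and its interaction with the resource budget are easy to illustrate numerically. The sketch below is not the paper's linear program: it evaluates policies of the paper's "transmit whenever weight × age reaches a threshold" form by plain simulation, on a hypothetical cyclic weight pattern with an invented budget and horizon.

```python
def avg_uoi_and_rate(threshold, weights, horizon=10_000):
    """Simulate a single-source update system under the threshold policy
    'transmit whenever weight * age >= threshold'.  `weights` cycles
    through hypothetical context-aware weights.  Returns the time-average
    UoI and the update rate (fraction of slots that transmit)."""
    age, uoi_sum, updates = 0, 0.0, 0
    for t in range(horizon):
        w = weights[t % len(weights)]
        if w * age >= threshold:        # threshold structure of the policy
            age, updates = 0, updates + 1
        uoi_sum += w * age              # UoI = context weight times age
        age += 1
    return uoi_sum / horizon, updates / horizon

def best_threshold(weights, budget, candidates=range(1, 60)):
    """Smallest-UoI threshold whose update rate fits the resource budget."""
    feasible = []
    for th in candidates:
        uoi, rate = avg_uoi_and_rate(th, weights)
        if rate <= budget:
            feasible.append((uoi, th))
    return min(feasible)[1]
```

Consistent with the abstract's observation, tightening the budget forces a larger threshold: a smaller fraction of slots may transmit, so the policy waits for a higher weighted age before updating.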

2020 ◽  
Vol 12 (20) ◽  
pp. 8718 ◽  
Author(s):  
Seunghoon Lee ◽  
Yongju Cho ◽  
Young Hoon Lee

In the injection mold industry, it is important for manufacturers to meet the delivery dates of the products that customers order. Mold products are diverse, and each product requires a different manufacturing process, so mold manufacturing is a complex and dynamic environment. To meet customer delivery dates, the scheduling of mold production is important and must remain sustainable and intelligent even in such a complicated, dynamic system. To address this, this paper proposes deep reinforcement learning (RL) for injection mold production scheduling. Before presenting the RL algorithm, a mathematical model of the mold scheduling problem is formulated and cast in a Markov decision process framework. The deep Q-network, an RL algorithm, is employed to find a scheduling policy that minimizes the total weighted tardiness. Experiments demonstrate that the proposed deep RL method outperforms dispatching rules designed to minimize the total weighted tardiness.
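The MDP framing (state = set of unscheduled jobs, action = next job to run, reward = negative increment of weighted tardiness) can be exercised on a tiny scale with tabular Q-learning standing in for the paper's deep Q-network. The four-job instance, learning rate, and exploration rate below are all invented for illustration.

```python
import random
from collections import defaultdict

# Hypothetical toy instance: (processing_time, due_date, weight) per job.
JOBS = [(3, 4, 2), (2, 3, 1), (4, 12, 3), (1, 5, 2)]

def episode(q, eps):
    """One scheduling pass; Q is nudged toward minimising total weighted
    tardiness (each step's reward is the negative tardiness increment)."""
    remaining, t, total = frozenset(range(len(JOBS))), 0, 0.0
    while remaining:
        acts = sorted(remaining)
        if random.random() < eps:                       # explore
            a = random.choice(acts)
        else:                                           # exploit
            a = max(acts, key=lambda j: q[(remaining, j)])
        p, d, w = JOBS[a]
        t += p
        r = -w * max(0, t - d)          # this job's weighted tardiness
        total -= r
        nxt = remaining - {a}
        target = r + (max(q[(nxt, j)] for j in nxt) if nxt else 0.0)
        q[(remaining, a)] += 0.1 * (target - q[(remaining, a)])
        remaining = nxt
    return total

def train(episodes=3000, seed=0):
    random.seed(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        episode(q, eps=0.2)
    return episode(q, eps=0.0)          # greedy rollout after training
```

For this instance the minimum achievable total weighted tardiness is 3 (e.g. the order job 0, job 3, job 1, job 2), so a successful run should return a value at or near 3; the deep network in the paper replaces the table `q` so the same idea scales to realistic job sets.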


2018 ◽  
Vol 31 (11) ◽  
pp. 1111-1123 ◽  
Author(s):  
Kosmas Alexopoulos ◽  
Konstantinos Sipsas ◽  
Evangelos Xanthakis ◽  
Sotiris Makris ◽  
Dimitris Mourtzis

1987 ◽  
Vol 24 (01) ◽  
pp. 270-276
Author(s):  
Masami Kurano

This study is concerned with finite Markov decision processes whose dynamics and reward structure are unknown but whose state is exactly observable. We establish a learning algorithm that yields an optimal policy and construct an adaptive policy that is optimal under the average expected reward criterion.
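In the same spirit (though not the paper's construction), a certainty-equivalence learner can be sketched: estimate the unknown transition law and rewards from observed transitions, act greedily against the estimated model with a little exploration, and re-solve as data accumulate. The two-state chain, its probabilities and rewards, and the discounted criterion standing in for the average-reward one are all assumptions of this sketch.

```python
import random

# Hypothetical 2-state, 2-action MDP; the learner never reads these
# tables directly, it only observes sampled transitions and rewards.
TRUE_P = {(0, 0): [0.9, 0.1], (0, 1): [0.2, 0.8],
          (1, 0): [0.7, 0.3], (1, 1): [0.1, 0.9]}   # P(next=0), P(next=1)
TRUE_R = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 2.0}

def solve(counts, rsum, n, gamma=0.95, iters=100):
    """Value iteration on the *estimated* model; returns a greedy policy."""
    v = [0.0, 0.0]
    def qhat(x, a):
        tot = sum(counts[(x, a)])
        r = rsum[(x, a)] / n[(x, a)] if n[(x, a)] else 0.0
        return r + gamma * sum(counts[(x, a)][y] / tot * v[y] for y in (0, 1))
    for _ in range(iters):
        v = [max(qhat(x, 0), qhat(x, 1)) for x in (0, 1)]
    return [0 if qhat(x, 0) >= qhat(x, 1) else 1 for x in (0, 1)]

def adaptive_policy(steps=5000, eps=0.1, seed=1):
    random.seed(seed)
    counts = {k: [1, 1] for k in TRUE_P}     # smoothed transition counts
    rsum = {k: 0.0 for k in TRUE_P}
    n = {k: 0 for k in TRUE_P}
    s, policy = 0, [0, 0]
    for t in range(steps):
        if t % 50 == 0:                      # periodically re-solve the model
            policy = solve(counts, rsum, n)
        a = random.randrange(2) if random.random() < eps else policy[s]
        s2 = 0 if random.random() < TRUE_P[(s, a)][0] else 1
        rsum[(s, a)] += TRUE_R[(s, a)]
        n[(s, a)] += 1
        counts[(s, a)][s2] += 1
        s = s2
    return solve(counts, rsum, n)
```

For this model, favouring action 1 in both states is optimal (it steers the chain toward the high-reward state), and with enough samples the certainty-equivalence policy recovers that.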


1983 ◽  
Vol 15 (2) ◽  
pp. 274-303 ◽  
Author(s):  
Arie Hordijk ◽  
Frank A. Van Der Duyn Schouten

Recently the authors introduced the concept of Markov decision drift processes. A Markov decision drift process can be seen as a straightforward generalization of a Markov decision process with continuous time parameter. In this paper we investigate the existence of stationary average optimal policies for Markov decision drift processes. Using a well-known Abelian theorem we derive sufficient conditions, which guarantee that a 'limit point' of a sequence of discounted optimal policies with the discounting factor approaching 1 is an average optimal policy. An alternative set of sufficient conditions is obtained for the case in which the discounted optimal policies generate regenerative stochastic processes. The latter set of conditions is easier to verify in several applications. The results of this paper are also applicable to Markov decision processes with discrete or continuous time parameter and to semi-Markov decision processes. In this sense they generalize some well-known results for Markov decision processes with finite or compact action space. Applications to an M/M/1 queueing model and a maintenance replacement model are given. It is shown that under certain conditions on the model parameters the average optimal policy for the M/M/1 queueing model is monotone non-decreasing (as a function of the number of waiting customers) with respect to the service intensity and monotone non-increasing with respect to the arrival intensity. For the maintenance replacement model we prove the average optimality of a bang-bang type policy. Special attention is paid to the computation of the optimal control parameters.
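The Abelian step can be made explicit in notation of my own choosing (not the paper's): for a fixed policy \(\pi\) with reward rate \(r\), discount rate \(\alpha > 0\) (the discount factor \(e^{-\alpha}\) approaching 1 as \(\alpha \downarrow 0\)), discounted value \(V^{\alpha}_{\pi}\), and average reward \(g_{\pi}\),

```latex
\[
  \lim_{\alpha \downarrow 0} \alpha\, V^{\alpha}_{\pi}(x)
  \;=\;
  \lim_{t \to \infty} \frac{1}{t}\,
  \mathbb{E}^{\pi}_{x}\!\left[\int_{0}^{t} r(X_s)\,\mathrm{d}s\right]
  \;=\; g_{\pi}(x)
\]
```

whenever the right-hand (time-average) limit exists; this is the Abelian direction. It is why a limit point of discounted optimal policies, as the discounting vanishes, is a natural candidate for average optimality, with the paper's sufficient conditions securing the passage to the limit.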


2013 ◽  
Vol 45 (2) ◽  
pp. 490-519 ◽  
Author(s):  
Xianping Guo ◽  
Mantas Vykertas ◽  
Yi Zhang

In this paper we study absorbing continuous-time Markov decision processes in Polish state spaces with unbounded transition and cost rates, and history-dependent policies. The performance measure is the expected total undiscounted costs. For the unconstrained problem, we show the existence of a deterministic stationary optimal policy, whereas, for the constrained problems with N constraints, we show the existence of a mixed stationary optimal policy, where the mixture is over no more than N+1 deterministic stationary policies. Furthermore, the strong duality result is obtained for the associated linear programs.
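The occupation-measure linear program behind such results can be made concrete on a deliberately tiny absorbing model (one transient state, two actions, one constraint; every number below is invented): action 0 absorbs with probability 0.3 per step, action 1 with probability 0.8 but spends one unit of a constrained resource, and the cost is the expected number of steps until absorption. With occupation measures \(x_a\), minimising \(x_0 + x_1\) subject to flow balance \(0.3\,x_0 + 0.8\,x_1 = 1\) and budget \(x_1 \le B\) has a closed-form solution, so no LP solver is needed for the sketch.

```python
def solve_constrained(budget):
    """Occupation-measure LP for the toy absorbing MDP, in closed form:
    minimise x0 + x1  s.t.  0.3*x0 + 0.8*x1 = 1,  x1 <= budget,  x >= 0.
    Returns (x0, x1, expected total cost)."""
    x1 = min(budget, 1 / 0.8)              # fast action used up to the budget
    x0 = max(0.0, (1 - 0.8 * x1) / 0.3)    # remaining mass absorbed slowly
    return x0, x1, x0 + x1
```

With one constraint (N = 1) and a binding budget, both occupation measures are positive, i.e. the optimum mixes the two deterministic stationary policies, matching the abstract's "no more than N+1" mixture result.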


2006 ◽  
Vol 2006 ◽  
pp. 1-12
Author(s):  
E. G. Kyriakidis

This paper is concerned with the problem of controlling a truncated general immigration process, which represents a population of harmful individuals, by the introduction of a predator. If the parameters of the model satisfy some mild conditions, the existence of a control-limit policy that is average-cost optimal is proved. The proof is based on the uniformization technique and on the variation of a fictitious parameter over the entire real line. Furthermore, an efficient Markov decision algorithm is developed that generates a sequence of improving control-limit policies converging to the optimal policy.
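A control-limit policy is easy to exhibit on a toy discrete-time analogue of the controlled immigration process (all rates and costs below are invented, and the truncation and averaging are crude): the predator is active exactly when the population is at least some limit L, and the best limit is found by evaluating the long-run average cost of each candidate.

```python
def avg_cost(L, N=20, p=0.3, q=0.6, hold=1.0, K=2.0, iters=2000):
    """Average cost of the control-limit policy 'predator on iff n >= L'
    for a truncated immigration chain on {0, ..., N}: births with
    probability p, deaths with probability q only while the predator is
    active, holding cost `hold` per individual, predator cost K per step."""
    dist = [1.0] + [0.0] * N            # start empty; iterate to stationarity
    for _ in range(iters):
        nxt = [0.0] * (N + 1)
        for n, m in enumerate(dist):
            if m == 0.0:
                continue
            up = p if n < N else 0.0
            down = q if (n >= L and n > 0) else 0.0
            nxt[n + 1 if n < N else n] += m * up
            nxt[n - 1 if n > 0 else 0] += m * down
            nxt[n] += m * (1 - up - down)
        dist = nxt
    return sum(m * (hold * n + (K if n >= L else 0.0))
               for n, m in enumerate(dist))

def best_limit(N=20):
    """Best control limit by direct evaluation of each policy."""
    return min(range(N + 1), key=avg_cost)
```

For these parameters the trade-off is visible by hand: L = 0 pays the predator even in the empty state, while large L lets the harmful population (and the holding cost) build up, so an interior control limit wins.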

