Differentially Private Actor and Its Eligibility Trace

Electronics ◽  
2020 ◽  
Vol 9 (9) ◽  
pp. 1486
Author(s):  
Kanghyeon Seo ◽  
Jihoon Yang

We present a differentially private actor and its eligibility trace in an actor-critic approach. The actor takes actions while interacting directly with the environment, whereas the critic only estimates state values obtained through bootstrapping. In other words, the actor's parameters reflect more detailed information about the sequence of actions taken than the critic's parameters do, and the corresponding eligibility traces have the same property. It is therefore necessary to preserve the privacy of the actor and its eligibility trace when training on private or sensitive data. In this paper, we confirm the applicability of differential privacy methods to actors updated using the policy gradient algorithm and discuss the advantages of such an approach with regard to differentially private critic learning. In addition, we measure the cosine similarity between the differentially private eligibility trace and the non-private eligibility trace to analyze whether anonymity is appropriately protected in the differentially private actor or the critic. We conducted experiments on two synthetic examples imitating real-world problems in the medical and autonomous navigation domains, and the results confirmed the feasibility of the proposed method.
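As a rough illustration of the comparison described in this abstract, the sketch below accumulates an actor eligibility trace, produces a clipped and noised copy of it, and measures the cosine similarity between the two. The abstract does not specify the paper's mechanism, so the L2 clipping, the Gaussian noise, and all names (`clip_l2`, `privatize`, `update_trace`, `noise_std`) are assumptions for illustration only; the noise scale is not calibrated to any particular privacy budget.

```python
import math
import random


def clip_l2(vec, max_norm):
    """Scale vec down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= max_norm or norm == 0.0:
        return list(vec)
    scale = max_norm / norm
    return [v * scale for v in vec]


def privatize(vec, max_norm, noise_std, rng):
    """Clip vec, then add independent Gaussian noise to each coordinate.

    Assumption: noise_std is a free parameter here, not derived from
    an (epsilon, delta) privacy budget.
    """
    clipped = clip_l2(vec, max_norm)
    return [v + rng.gauss(0.0, noise_std) for v in clipped]


def update_trace(trace, grad_log_pi, decay):
    """Accumulating trace update: e <- decay * e + grad log pi(a|s)."""
    return [decay * e + g for e, g in zip(trace, grad_log_pi)]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    rng = random.Random(0)
    trace = [0.0] * 8
    for _ in range(20):
        # Stand-in for the policy's score function at one time step.
        grad = [rng.uniform(-1.0, 1.0) for _ in trace]
        trace = update_trace(trace, grad, decay=0.9)
    private_trace = privatize(trace, max_norm=1.0, noise_std=0.5, rng=rng)
    print("cosine similarity:", cosine_similarity(trace, private_trace))
```

A similarity close to 1 would suggest the noised trace still reveals the direction of the original; lower values indicate the original trace is harder to recover from the private one.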

2021 ◽  
Vol 17 (12) ◽  
pp. 155014772110599
Author(s):  
Lin Wang ◽  
Xingang Xu ◽  
Xuhui Zhao ◽  
Baozhu Li ◽  
Ruijuan Zheng ◽  
...  

Policy gradient methods are an effective means of solving mobile multimedia data transmission problems in Content Centric Networks. However, current policy gradient algorithms incur a high computational cost when processing high-dimensional data, and they do not take privacy disclosure into account, even though privacy protection is important when training on data. We therefore propose a randomized block policy gradient algorithm with differential privacy. To reduce computational complexity on high-dimensional data, we randomly select one coordinate block whose gradient is updated in each round. To address privacy protection, we add a differential privacy mechanism to the algorithm and prove that it preserves the [Formula: see text]-privacy level. We conduct extensive simulations in four environments: CartPole, Walker, HalfCheetah, and Hopper. Compared with methods such as importance-sampling momentum-based policy gradient, Hessian-aided momentum-based policy gradient, and REINFORCE, our algorithm converges faster in the same environment.
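The block-selection idea in this abstract can be sketched as a single update step: pick one coordinate block at random, clip that block of the gradient, add noise, and update only those coordinates. This is a minimal sketch under stated assumptions, not the authors' algorithm: the function name `dp_block_step`, the uniform block choice, the L2 clipping, and the Gaussian noise with an uncalibrated `noise_std` are all hypothetical choices made for illustration.

```python
import math
import random


def dp_block_step(params, grad, block_size, lr, clip_norm, noise_std, rng):
    """One noisy gradient-ascent step on a single random coordinate block.

    Returns the updated parameter list and the (start, end) index range
    of the block that was touched. params itself is not modified.
    """
    n = len(params)
    num_blocks = math.ceil(n / block_size)
    block = rng.randrange(num_blocks)          # uniform block choice (assumption)
    start = block * block_size
    end = min(start + block_size, n)

    # Clip only the selected block of the gradient to L2 norm <= clip_norm.
    norm = math.sqrt(sum(g * g for g in grad[start:end]))
    scale = min(1.0, clip_norm / norm) if norm > 0.0 else 1.0

    new_params = list(params)
    for i in range(start, end):
        # Assumption: Gaussian noise with a hand-picked standard deviation.
        noisy_grad = grad[i] * scale + rng.gauss(0.0, noise_std)
        new_params[i] += lr * noisy_grad       # ascent on expected return
    return new_params, (start, end)


if __name__ == "__main__":
    rng = random.Random(42)
    theta = [0.0] * 10
    fake_grad = [rng.uniform(-1.0, 1.0) for _ in theta]  # stand-in gradient
    theta, touched = dp_block_step(theta, fake_grad, block_size=3, lr=0.1,
                                   clip_norm=1.0, noise_std=0.1, rng=rng)
    print("updated block:", touched)
```

Only `block_size` coordinates are clipped, noised, and written per round, which is where the per-round cost saving on high-dimensional parameters would come from.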


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Weisan Wu

In this paper, we give a modified gradient EM algorithm that protects the privacy of sensitive data by adding noise from the discrete Gaussian mechanism. Specifically, it makes high-dimensional data easier to process, mainly through scaling, truncation, noise multiplication, and smoothing steps applied to the data. Since the variance of the discrete Gaussian is smaller than that of the continuous Gaussian, adding discrete Gaussian noise can guarantee the differential privacy of the data more effectively. Finally, the standard gradient EM algorithm, the clipped algorithm, and our algorithm (DG-EM) are compared on the GMM model. The experiments show that our algorithm can effectively protect high-dimensional sensitive data.
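The variance claim can be checked numerically with a small sketch. The code below builds a discrete Gaussian on the integers, with probability proportional to exp(-k^2 / (2 sigma^2)), truncated to a finite window, and computes its variance for comparison with sigma^2. The truncation radius and the function names are assumptions for illustration; this is a simple approximate sampler, not an exact one and not the DG-EM procedure itself.

```python
import math
import random


def discrete_gaussian_pmf(sigma, radius):
    """Support and probabilities of a discrete Gaussian truncated to
    the integers in [-radius, radius]; P(k) is proportional to
    exp(-k^2 / (2 * sigma^2))."""
    support = list(range(-radius, radius + 1))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in support]
    total = sum(weights)
    return support, [w / total for w in weights]


def default_radius(sigma):
    # Assumption: 12 standard deviations leaves negligible mass outside.
    return max(1, math.ceil(12 * sigma))


def discrete_gaussian_variance(sigma, radius=None):
    """Variance of the truncated discrete Gaussian (its mean is 0)."""
    if radius is None:
        radius = default_radius(sigma)
    support, probs = discrete_gaussian_pmf(sigma, radius)
    return sum(p * k * k for k, p in zip(support, probs))


def sample_discrete_gaussian(sigma, rng, radius=None):
    """Draw one integer from the truncated discrete Gaussian."""
    if radius is None:
        radius = default_radius(sigma)
    support, probs = discrete_gaussian_pmf(sigma, radius)
    return rng.choices(support, weights=probs, k=1)[0]


if __name__ == "__main__":
    for sigma in (0.5, 1.0, 2.0):
        print(sigma, discrete_gaussian_variance(sigma), sigma * sigma)
    rng = random.Random(0)
    print([sample_discrete_gaussian(2.0, rng) for _ in range(10)])
```

The gap between the two variances is clearly visible for small sigma and becomes negligible as sigma grows, so any practical benefit depends on the noise scale in use.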


Author(s):  
Arambam James Singh ◽  
Duc Thien Nguyen ◽  
Akshat Kumar ◽  
Hoong Chuin Lau

We address the problem of maritime traffic management in busy waterways to increase the safety of navigation by reducing congestion. We model maritime traffic as a large multiagent system with individual vessels as agents and the VTS authority as the regulatory agent. We develop a maritime traffic simulator based on historical traffic data that incorporates realistic domain constraints such as uncertain and asynchronous movement of vessels. We also develop a traffic coordination approach that provides speed recommendations to vessels in different zones. We exploit the nature of collective interactions among agents to develop a policy gradient approach that scales to real-world problems. Empirical results on synthetic and real-world problems show that our approach can significantly reduce congestion while keeping traffic throughput high.


2021 ◽  
Vol 13 (10) ◽  
pp. 5491
Author(s):  
Melissa Robson-Williams ◽  
Bruce Small ◽  
Roger Robson-Williams ◽  
Nick Kirk

The socio-environmental challenges the world faces are ‘swamps’: situations that are messy, complex, and uncertain. The aim of this paper is to help disciplinary scientists navigate these swamps. To achieve this, the paper evaluates an integrative framework designed for researching complex real-world problems, the Integration and Implementation Science (i2S) framework. As a pilot study, we examine seven inter- and transdisciplinary agri-environmental case studies against the concepts presented in the i2S framework, and we hypothesise that considering concepts in the i2S framework during the planning and delivery of agri-environmental research will increase the usefulness of the research for next users. We found that for the types of complex, real-world research done in the case studies, increasing attention to the i2S dimensions correlated with increased usefulness for the end users. We conclude that using the i2S framework could provide handrails for researchers, helping them navigate the swamps when engaging with the complexity of socio-environmental problems.


Mathematics ◽  
2021 ◽  
Vol 9 (5) ◽  
pp. 534
Author(s):  
F. Thomas Bruss

This paper presents two-person games involving optimal stopping. As far as we are aware, the type of problem we study is new. We confine our interest to such games in discrete time. Two players are to choose, with randomised choice-priority, between two games G1 and G2. Each game consists of two parts with well-defined targets. Each part consists of a sequence of random variables which determines when the decisive part of the game will begin. In each game, the horizon is bounded, and if the two parts are not finished within the horizon, the game is lost by definition. Otherwise the decisive part begins, in which each player is entitled to apply his or her strategy to reach the second target. If only one player achieves the two targets, this player is the winner. If both win or both lose, the outcome is seen as “deuce”. We motivate the interest of such problems in the context of real-world problems. A few representative problems are solved in detail. The main objective of this article is to serve as a preliminary manual to guide through possible approaches and to discuss under which circumstances we can obtain solutions, or approximate solutions.


2021 ◽  
Vol 52 (1) ◽  
pp. 12-15
Author(s):  
S.V. Nagaraj

This book is on algorithms for network flows. Network flow problems are optimization problems where, given a flow network, the aim is to construct a flow that respects the capacity constraints of the edges of the network and in which incoming flow equals outgoing flow at every vertex except two designated vertices, the source and the sink. Network flow algorithms solve many real-world problems. This book is intended to serve graduate students and as a reference. The book is also available in eBook (ISBN 9781316952894/US$32.00) and hardback (ISBN 9781107185890/US$99.99) formats. The book has a companion web site www.networkflowalgs.com where a pre-publication version of the book can be downloaded gratis.
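To make the problem statement concrete, here is a minimal sketch of one standard maximum-flow method, the Edmonds-Karp algorithm (shortest augmenting paths found by breadth-first search). The review does not say which algorithms the book presents, so this is a textbook technique chosen for illustration; the nested-dict graph format and the name `max_flow` are assumptions.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Maximum flow value from source to sink via Edmonds-Karp.

    capacity is a nested dict: capacity[u][v] is the capacity of
    the directed edge u -> v (non-negative numbers).
    """
    if source == sink:
        return 0

    # Residual graph: forward edges start at full capacity,
    # reverse edges start at zero unless they exist in the input.
    residual = {source: {}, sink: {}}
    for u, neighbours in capacity.items():
        residual.setdefault(u, {})
        for v, cap in neighbours.items():
            residual[u][v] = residual[u].get(v, 0) + cap
            residual.setdefault(v, {}).setdefault(u, 0)

    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, room in residual[u].items():
                if room > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path left: flow is maximum

        # Bottleneck: smallest residual capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push flow along the path and open up the reverse edges.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


if __name__ == "__main__":
    network = {
        "s": {"a": 3, "b": 2},
        "a": {"b": 1, "t": 2},
        "b": {"t": 3},
    }
    print(max_flow(network, "s", "t"))
```

Every augmentation keeps edge flows within capacity and preserves flow conservation at intermediate vertices, which are exactly the two constraints named in the description above.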


AI Matters ◽  
2019 ◽  
Vol 5 (3) ◽  
pp. 12-14
Author(s):  
Tara Chklovski
