Learning to Tune a Class of Controllers with Deep Reinforcement Learning

Minerals ◽  
2021 ◽  
Vol 11 (9) ◽  
pp. 989
Author(s):  
William John Shipman

Control systems require maintenance in the form of tuning their parameters in order to maximize their performance in the face of process changes in minerals processing circuits. This work focuses on using deep reinforcement learning to train an agent to perform this maintenance continuously. A generic simulation of a first-order process with a time delay, controlled by a proportional-integral controller, was used as the training environment. Domain randomization in this environment was used to aid in generalizing the agent to unseen conditions on a physical circuit. Proximal policy optimization was used to train the agent, and hyper-parameter optimization was performed to select the optimal agent neural network size and training algorithm parameters. Two agents were tested, examining the impact of the observation space used by the agent and concluding that the best observation consists of the parameters of an auto-regressive with exogenous input model fitted to the measurements of the controlled variable. The best trained agent was deployed at an industrial comminution circuit where it was tested on two flow rate control loops. This agent improved the performance of one of these control loops but decreased the performance of the other control loop. While deep reinforcement learning does show promise in controller tuning, several challenges and directions for further study have been identified.
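The training environment described above, a first-order process with a time delay under proportional-integral control, can be sketched as follows. This is a minimal illustration and not the author's simulator: the Euler discretisation, the parameter ranges in `sample_process`, the controller gains, and the integral-absolute-error score are all assumptions chosen for the example.

```python
from collections import deque
import random


def simulate_pi_loop(gain, time_constant, delay_steps, kp, ki,
                     setpoint=1.0, dt=0.1, n_steps=600):
    """Simulate a PI controller on a first-order process with a time delay.

    Process (Euler-discretised): tau * dy/dt = -y + K * u(t - delay).
    Returns the controlled-variable value after each step.
    """
    y = 0.0
    integral = 0.0
    # A queue of past controller outputs models the dead time.
    pending = deque([0.0] * delay_steps)
    history = []
    for _ in range(n_steps):
        error = setpoint - y
        integral += error * dt
        u = kp * error + ki * integral      # PI control law
        pending.append(u)
        delayed_u = pending.popleft()       # output computed delay_steps ago
        y += dt * (-y + gain * delayed_u) / time_constant
        history.append(y)
    return history


def sample_process(rng):
    """Domain randomization: draw process parameters from assumed ranges."""
    return {
        "gain": rng.uniform(0.5, 2.0),
        "time_constant": rng.uniform(1.0, 5.0),
        "delay_steps": rng.randint(1, 10),
    }


rng = random.Random(0)
params = sample_process(rng)                 # one randomized process
response = simulate_pi_loop(kp=0.3, ki=0.1, **params)
# Integral of absolute error (dt = 0.1), one possible tuning score.
integral_abs_error = sum(abs(1.0 - y) for y in response) * 0.1
```

A tuning agent would adjust `kp` and `ki` between episodes, while each episode draws a fresh process from `sample_process` so that the agent does not overfit to a single set of dynamics.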

2021 ◽  
Vol 11 (4) ◽  
pp. 1514 ◽  
Author(s):  
Quang-Duy Tran ◽  
Sang-Hoon Bae

To reduce the impact of congestion, it is necessary to improve our overall understanding of the influence of autonomous vehicles. Recently, deep reinforcement learning has become an effective means of solving complex control tasks. Accordingly, we present an advanced deep reinforcement learning approach that investigates how leading autonomous vehicles affect the urban network in a mixed-traffic environment. We also suggest a set of hyperparameters for achieving better performance. Firstly, we feed a set of hyperparameters into our deep reinforcement learning agents. Secondly, we investigate the leading autonomous vehicle experiment in the urban network with different autonomous vehicle penetration rates. Thirdly, the advantage of leading autonomous vehicles is evaluated using entire manual vehicle and leading manual vehicle experiments. Finally, proximal policy optimization with a clipped objective is compared to proximal policy optimization with an adaptive Kullback–Leibler penalty to verify the superiority of the proposed hyperparameters. We demonstrate that fully automated traffic increased the average speed by a factor of 1.27 compared with the entire manual vehicle experiment. Our proposed method becomes significantly more effective at higher autonomous vehicle penetration rates. Furthermore, the leading autonomous vehicles could help to mitigate traffic congestion.
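The two proximal policy optimization variants compared above differ only in how they keep the updated policy close to the old one. The sketch below shows the standard per-sample forms of both surrogate objectives; it is not the authors' code, and the default `epsilon`, the factor 1.5, and the halving/doubling rule for `beta` follow the commonly published formulation rather than the hyperparameters proposed in the paper.

```python
def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """PPO clipped objective for one sample.

    ratio is pi_new(a|s) / pi_old(a|s); the clip removes any incentive
    to push the ratio outside [1 - epsilon, 1 + epsilon].
    """
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    return min(ratio * advantage, clipped_ratio * advantage)


def kl_penalized_surrogate(ratio, advantage, kl, beta):
    """PPO objective with a KL penalty weighted by beta."""
    return ratio * advantage - beta * kl


def adapt_beta(beta, kl, kl_target):
    """Adapt the penalty weight after a policy update.

    Relax the penalty when the policy moved too little, tighten it
    when the policy moved too far from the old one.
    """
    if kl < kl_target / 1.5:
        return beta / 2.0
    if kl > kl_target * 1.5:
        return beta * 2.0
    return beta
```

In training, either surrogate is averaged over a batch and maximised by gradient ascent; the clipped form needs no extra state, whereas the penalty form carries `beta` from one update to the next.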


2021 ◽  
Vol 8 ◽  
Author(s):  
Thomas Nakken Larsen ◽  
Halvor Ødegård Teigen ◽  
Torkel Laache ◽  
Damiano Varagnolo ◽  
Adil Rasheed

Reinforcement Learning (RL) controllers have proved to effectively tackle the dual objectives of path following and collision avoidance. However, finding which RL algorithm setup optimally trades off these two tasks is not necessarily easy. This work proposes a methodology to explore this that leverages analyzing the performance and task-specific behavioral characteristics for a range of RL algorithms applied to path-following and collision-avoidance for underactuated surface vehicles in environments of increasing complexity. Compared to the introduced RL algorithms, the results show that the Proximal Policy Optimization (PPO) algorithm exhibits superior robustness to changes in the environment complexity, the reward function, and when generalized to environments with a considerable domain gap from the training environment. Whereas the proposed reward function significantly improves the competing algorithms’ ability to solve the training environment, an unexpected consequence of the dimensionality reduction in the sensor suite, combined with the domain gap, is identified as the source of their impaired generalization performance.


2022 ◽  
Vol 70 (1) ◽  
pp. 53-66
Author(s):  
Julian Grothoff ◽  
Nicolas Camargo Torres ◽  
Tobias Kleinert

Machine learning and particularly reinforcement learning methods may be applied to control tasks ranging from single control loops to the operation of whole production plants. However, their utilization in industrial contexts lacks understandability and requires suitable levels of operability and maintainability. In order to assess different application scenarios, a simple measure for their complexity is proposed and evaluated on four examples in a simulated palette transport system of a cold rolling mill. The measure is based on the size of the controller input and output space determined by different granularity levels in a hierarchical process control model. The impact of these decomposition strategies on system characteristics, especially operability and maintainability, is discussed, assuming that solvability and a suitable quality of the reinforcement learning solution are provided.


2020 ◽  
Vol 23 (7) ◽  
pp. 777-799
Author(s):  
O.I. Shvyreva ◽  
Z.I. Kruglyak ◽  
A.V. Petukh

Subject. This article discusses the issues related to the practice of financial reporting in the face of uncertainties caused by the coronavirus contagion, as well as the specifics of the audit strategy and formation of an audit opinion on this reporting. Objectives. The article aims to identify the quality characteristics of financial reporting prepared in the context of the COVID-19 pandemic and justify the key aspects of assurance engagement completion in an extremely uncertain epidemiological and economic situation. Methods. For the study, we used an abstract-logical method, content analysis techniques, systematization, and classification. Results. Analyzing the impact of the extremely uncertain epidemiological and economic situation on financial statements, the article clarifies aspects of disclosure of events after the reporting date and threats to business continuity in the annual reporting of economic entities. The article identifies possible alternative procedures and algorithms to obtain proper evidence when it is insufficient in the face of the inability to meet certain audit standards requirements in a remote audit environment. The article defines the impact of COVID-19 risk disclosure on the structure of the audit report and opinion. Relevance. The results of the study can be used in the practical activities of economic entities that prepare financial statements in the face of significant uncertainty, as well as auditors and audit organizations.


2020 ◽  
Vol 16 (6) ◽  
pp. 1182-1198
Author(s):  
I.V. Vyakina

Subject. This article deals with the issues related to the national economic security of the State in today's conditions. Objectives. The article aims to develop a set of special measures for additional business support to reduce the impact of restrictions imposed amid quarantine and the spread of the pandemic, and to help prevent the collapse of business entities. Methods. For the study, I used the methods of theoretical, systems, logical, and comparative analyses, and tabular and graphical visualization techniques. Results. The article proposes possible measures to support business aimed at reducing the costs of business entities due to the restrictions caused by the pandemic. These measures complement and explain the activities proposed by the President and Government of the Russian Federation, taking into account the regional and municipal levels. Conclusions. The uncertain current situation requires constant adjustment and adaptation of public policy in accordance with specific circumstances. Ensuring the country's economic security and sustainability is associated with the creation of a business organization system that connects public administration tools with business support and development opportunities in the changed environment.


2015 ◽  
Vol 3 (2) ◽  
pp. 69-84
Author(s):  
Wadhah Amer Hatem ◽  
Samiaah M. Hassen Al-Tmeemy

Suicide attacks, bombings, and explosions have become part of daily life in Iraq. Consequently, the threat of terrorism confronts the Iraqi construction sector with unique and unusual challenges not seen in other countries. These challenges can have an extensive impact on construction projects. This paper seeks to examine the impact of terrorist attacks on the construction industry and to determine the extent of the impact of terrorism on construction projects in terms of cost, schedule, and quality. The study adopted quantitative and qualitative approaches, collecting data through a questionnaire survey and interviews, as well as historical data. The study focused on projects that have been the target of terrorist strikes in Diyala governorate. A variety of statistical procedures were employed in data analysis. The results revealed the extent to which terrorist attacks impact construction projects in terms of cost, time, and quality. The results of this study will enhance the awareness of all construction parties of the impact of terrorist attacks on construction projects. Eventually, this can support risk management assessment and assist contractors in properly protecting projects and buildings to minimize injuries and fatalities in the event of terrorism.


2019 ◽  
Author(s):  
Jennifer R Sadler ◽  
Grace Elisabeth Shearrer ◽  
Nichollette Acosta ◽  
Kyle Stanley Burger

BACKGROUND: Dietary restraint represents an individual’s intent to limit their food intake and has been associated with impaired passive food reinforcement learning. However, the impact of dietary restraint on active, response-dependent learning is poorly understood. In this study, we tested the relationship between dietary restraint and food reinforcement learning using an active, instrumental conditioning task. METHODS: A sample of ninety adults completed a response-dependent instrumental conditioning task with reward and punishment using sweet and bitter tastes. Brain response via functional MRI was measured during the task. Participants also completed anthropometric measures, reward/motivation-related questionnaires, and a working memory task. Dietary restraint was assessed via the Dutch Restrained Eating Scale. RESULTS: Two groups were selected from the sample: high restraint (n=29, score >2.5) and low restraint (n=30, score <1.85). High restraint was associated with significantly higher BMI (p=0.003) and lower N-back accuracy (p=0.045). The high restraint group was also marginally better at the instrumental conditioning task (p=0.066, r=0.37). High restraint was also associated with significantly greater brain response in the intracalcarine cortex (MNI: 15, -69, 12; k=35, pFWE<0.05) to bitter taste, compared to neutral taste. CONCLUSIONS: High restraint was associated with improved performance on an instrumental task testing how individuals learn from reward and punishment. This may be mediated by greater brain response in the primary visual cortex, which has been associated with mental representation. Results suggest that dietary restraint does not impair response-dependent reinforcement learning.


Author(s):  
Kumari Anshu ◽  
Loveleen Gaur ◽  
Arun Solanki

Chatbots have emerged as a significant response to the swiftly growing customer care demands of recent times, and as one of the biggest technological disruptions. Simply put, a chatbot is a software agent that facilitates interaction between computers and humans in natural language: a simulated, intelligent dialogue agent that functions in a range of consumer engagement settings. It is among the simplest means of enabling interaction between retailers and customers.
Purpose: Most of the research in this field is concerned with technical aspects. Recent research on chatbots pays little attention to their impact on users’ experience. Through this work, the author seeks to understand the customer-oriented impact that chatbots have on shoppers. The purpose of this study is to develop and empirically test a framework that identifies the customer-oriented attributes of chatbots and the impact of these attributes on customers.
Objectives: The study intends to bridge the gap between concepts and actual attributes and applications of chatbots. The following research objectives address the aspects of chatbots that affect consumers’ shopping behavior: a) identify the attributes of chatbots that have an impact on consumer shopping behavior; b) evaluate the impact of chatbots on consumer shopping behavior that leads to chatbot usage and adoption among customers.
Design/Methodology/Approach: For the analysis, the author applied factor analysis and multiple regression using SPSS version 23 to identify the attributes of chatbots and their impact on shoppers. A self-administered questionnaire was developed from a review of the literature and evaluated by industry experts in retailing and by academics. Primary information from the respondents was gathered using this questionnaire, which uses a Likert scale from 1 to 5, where 1 stands for strongly disagree and 5 stands for strongly agree. Data were collected from 126 respondents, of which 111 were finally considered for study and analysis.
Findings: The empirical results identify several attributes of chatbots, such as Trust, Usefulness, Satisfaction, Readiness to Use, and Accessibility. Chatbots are also found to influence customers’ shopping experience, which can help businesses increase sales and create repurchase intention among customers.
Originality/value: Recent research on chatbots pays little attention to their impact on the customers who interact with them on a regular basis. This paper extends the understanding of the customer-oriented attributes of artificially intelligent chatbots. In this regard, the author has developed a model framework, proposed the attributes identified, and empirically tested the impact of these attributes on shoppers.


Author(s):  
Lisa Herzog

This chapter asks whether we can hold on to the picture of the morally responsible subject as we knew it in the face of evidence from social psychology about the impact of contexts on human behaviour. Some theorists have taken this to present a major challenge to moral theorizing. However, the chapter argues that, while we should acknowledge the malleability of human behaviour, we should not give up the notion of responsible agency. Rather, we need to broaden our theoretical horizon in order to include individuals’ co-responsibility for the contexts in which they act. This argument is a general one, but it is of particular relevance for organizations: it is our shared responsibility to turn them into contexts in which moral agency is supported rather than undermined.


Biomimetics ◽  
2021 ◽  
Vol 6 (1) ◽  
pp. 13
Author(s):  
Adam Bignold ◽  
Francisco Cruz ◽  
Richard Dazeley ◽  
Peter Vamplew ◽  
Cameron Foale

Interactive reinforcement learning methods utilise an external information source to evaluate decisions and accelerate learning. Previous work has shown that human advice could significantly improve learning agents’ performance. When evaluating reinforcement learning algorithms, it is common to repeat experiments as parameters are altered or to gain a sufficient sample size. In this regard, to require human interaction every time an experiment is restarted is undesirable, particularly when the expense in doing so can be considerable. Additionally, reusing the same people for the experiment introduces bias, as they will learn the behaviour of the agent and the dynamics of the environment. This paper presents a methodology for evaluating interactive reinforcement learning agents by employing simulated users. Simulated users allow human knowledge, bias, and interaction to be simulated. The use of simulated users allows the development and testing of reinforcement learning agents, and can provide indicative results of agent performance under defined human constraints. While simulated users are no replacement for actual humans, they do offer an affordable and fast alternative for evaluative assisted agents. We introduce a method for performing a preliminary evaluation utilising simulated users to show how performance changes depending on the type of user assisting the agent. Moreover, we describe how human interaction may be simulated, and present an experiment illustrating the applicability of simulating users in evaluating agent performance when assisted by different types of trainers. Experimental results show that the use of this methodology allows for greater insight into the performance of interactive reinforcement learning agents when advised by different users. The use of simulated users with varying characteristics allows for evaluation of the impact of those characteristics on the behaviour of the learning agent.
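A simulated user of the kind described above can be as simple as an advisor that consults a known good policy and then degrades its answers. The sketch below is an illustration only: the `SimulatedUser` class, its `accuracy` and `availability` parameters, and the dictionary policy are hypothetical names chosen for this example and are not taken from the paper.

```python
import random


class SimulatedUser:
    """Hypothetical simulated advisor for an interactive RL agent.

    accuracy: probability that given advice matches the reference action.
    availability: probability that the user responds at all.
    """

    def __init__(self, reference_policy, n_actions,
                 accuracy=0.9, availability=0.5, seed=0):
        if n_actions < 2:
            raise ValueError("need at least two actions to give wrong advice")
        self.reference_policy = reference_policy  # maps state -> best action
        self.n_actions = n_actions
        self.accuracy = accuracy
        self.availability = availability
        self._rng = random.Random(seed)

    def advise(self, state):
        """Return an advised action, or None when the user stays silent."""
        if self._rng.random() >= self.availability:
            return None
        best = self.reference_policy[state]
        if self._rng.random() < self.accuracy:
            return best
        # Inaccurate advice: pick uniformly among the other actions.
        wrong = [a for a in range(self.n_actions) if a != best]
        return self._rng.choice(wrong)


policy = {"s0": 1, "s1": 0}
expert = SimulatedUser(policy, n_actions=3, accuracy=1.0, availability=1.0)
novice = SimulatedUser(policy, n_actions=3, accuracy=0.6, availability=0.3)
advice = expert.advise("s0")  # a perfectly accurate, always-available user
```

Running the same agent against users such as `expert` and `novice` gives repeatable comparisons across user types without recruiting people for every experimental run.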

