A First Step towards the Runtime Analysis of Evolutionary Algorithm Adjusted with Reinforcement Learning

For stochastic multi-objective combinatorial optimization (SMOCO) problems, the adaptive Pareto sampling (APS) framework has been proposed, which is based on sampling and on the solution of deterministic multi-objective subproblems. We show that when plugging in the well-known simple evolutionary multi-objective optimizer (SEMO) as a subprocedure into APS, ε-dominance has to be used to achieve fast convergence to the Pareto front. Two general theorems are presented indicating how runtime complexity results for APS can be derived from corresponding results for SEMO. This may be a starting point for the runtime analysis of evolutionary SMOCO algorithms.
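As a rough illustration of the ε-dominance idea mentioned in the abstract, the sketch below uses the common additive form for minimization: a point ε-dominates another if it is no worse than it by more than ε in every objective. This is an assumption made for illustration, not the definition or archive rule used in the paper; the names `epsilon_dominates` and `update_archive` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def epsilon_dominates(a: Sequence[float], b: Sequence[float], eps: float) -> bool:
    """Additive epsilon-dominance for minimization (assumed form, not the paper's).

    `a` epsilon-dominates `b` if a_i - eps <= b_i holds in every objective.
    """
    return all(ai - eps <= bi for ai, bi in zip(a, b))


def update_archive(archive: List[Point], candidate: Point, eps: float) -> List[Point]:
    """Hypothetical archive update: reject a candidate that an archived point
    already epsilon-dominates; otherwise insert it and drop the archived
    points that the candidate epsilon-dominates."""
    if any(epsilon_dominates(kept, candidate, eps) for kept in archive):
        return archive
    survivors = [kept for kept in archive if not epsilon_dominates(candidate, kept, eps)]
    survivors.append(candidate)
    return survivors


archive: List[Point] = []
for point in [(1.0, 2.0), (1.05, 1.95), (0.5, 3.0)]:
    archive = update_archive(archive, point, eps=0.1)
```

With ε = 0.1, the second point lies within ε of the first in both objectives and is rejected, so the archive stays coarse; the third point is a genuine trade-off and is kept.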

Improved Evolutionary Algorithm Design for the Project Scheduling Problem Based on Runtime Analysis

IEEE Transactions on Software Engineering ◽ 10.1109/tse.2013.52 ◽ 2014 ◽ Vol 40 (1) ◽ pp. 83-102 ◽ Cited By ~ 24

Author(s): Leandro L. Minku ◽ Dirk Sudholt ◽ Xin Yao

Keyword(s): Evolutionary Algorithm ◽ Project Scheduling ◽ Algorithm Design ◽ Runtime Analysis ◽ Scheduling Problem ◽ Project Scheduling Problem

A Graph-Based Evolutionary Algorithm: Genetic Network Programming (GNP) and Its Extension Using Reinforcement Learning

Evolutionary Computation ◽ 10.1162/evco.2007.15.3.369 ◽ 2007 ◽ Vol 15 (3) ◽ pp. 369-398 ◽ Cited By ~ 201

Author(s): Shingo Mabu ◽ Kotaro Hirasawa ◽ Jinglu Hu

Keyword(s): Reinforcement Learning ◽ Evolutionary Algorithm ◽ Past History ◽ Structural Characteristics ◽ Memory Function ◽ Genetic Network ◽ Dynamic Environments ◽ Network Programming ◽ History Of ◽ Genetic Network Programming

This paper proposes a graph-based evolutionary algorithm called Genetic Network Programming (GNP). Our goal is to develop GNP so that it can deal with dynamic environments efficiently and effectively, based on the distinctive expressive ability of the graph (network) structure. The characteristics of GNP are as follows. 1) GNP programs are composed of a number of nodes that execute simple judgment/processing, and these nodes are connected to each other by directed links. 2) The graph structure enables GNP to re-use nodes, so the structure can be very compact. 3) The node transition of GNP is executed according to its node connections without any terminal nodes, so the past history of the node transition affects the current node to be used, and this characteristic works as an implicit memory function. These structural characteristics are useful for dealing with dynamic environments. Furthermore, we propose an extended algorithm, "GNP with Reinforcement Learning (GNPRL)", which combines evolution and reinforcement learning in order to create effective graph structures and obtain better results in dynamic environments. In this paper, we applied GNP to the problem of determining agents' behavior to evaluate its effectiveness. Tileworld was used as the simulation environment. The results show some advantages for GNP over conventional methods.
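The node-transition idea described in the abstract can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the `Node` class, the `run` function, the three-node program, and the `obstacle_ahead` environment flag are all hypothetical, and neither evolution nor the reinforcement learning extension is shown.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Node:
    kind: str                    # "judgment" or "processing"
    func: Callable[[Dict], int]  # judgment: returns a branch index; processing: acts on env
    next_nodes: List[int]        # directed links to other nodes (no terminal nodes)


def run(nodes: List[Node], env: Dict, start: int, steps: int) -> List[int]:
    """Follow directed links for a fixed number of steps and return the visited nodes.

    Execution always resumes from the current node, so the path taken so far
    decides which node runs next (the implicit memory the abstract mentions).
    """
    current = start
    trace = []
    for _ in range(steps):
        node = nodes[current]
        trace.append(current)
        if node.kind == "judgment":
            current = node.next_nodes[node.func(env)]  # branch on the judgment result
        else:
            node.func(env)                             # perform the action
            current = node.next_nodes[0]               # a processing node has one link
    return trace


def judge_obstacle(env: Dict) -> int:
    return 1 if env["obstacle_ahead"] else 0


def move_forward(env: Dict) -> int:
    env["log"].append("forward")
    return 0


def turn(env: Dict) -> int:
    env["log"].append("turn")
    env["obstacle_ahead"] = False  # toy assumption: turning clears the obstacle
    return 0


program = [
    Node("judgment", judge_obstacle, [1, 2]),  # node 0: no obstacle -> 1, obstacle -> 2
    Node("processing", move_forward, [0]),     # node 1
    Node("processing", turn, [0]),             # node 2
]
env = {"obstacle_ahead": True, "log": []}
trace = run(program, env, start=0, steps=4)
```

Node 0 is visited twice in this run, which illustrates the node re-use that keeps the graph compact.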
