Bias-variance decomposition in Genetic Programming

2016 ◽  
Vol 14 (1) ◽  
pp. 62-80 ◽  
Author(s):  
Taras Kowaliw ◽  
René Doursat

Abstract We study properties of Linear Genetic Programming (LGP) through several regression and classification benchmarks. In each problem, we decompose the results into bias and variance components, and explore the effect of varying certain key parameters on the overall error and its decomposed contributions. These parameters are the maximum program size, the initial population, and the function set used. We confirm and quantify several insights into the practical usage of GP, most notably that (a) the variance between runs is primarily due to initialization rather than the selection of training samples, (b) parameters can be reasonably optimized to obtain gains in efficacy, and (c) functions detrimental to evolvability are easily eliminated, while functions well-suited to the problem can greatly improve performance; therefore, larger and more diverse function sets are always preferable.
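
To make the decomposition mentioned in the abstract concrete, the following is a minimal sketch for the squared-error regression case only. It assumes noise-free targets and that each independent run yields one prediction per test point; the function name `bias_variance` and the data layout are illustrative assumptions, not the authors' procedure, and the classification case requires a different decomposition that is not shown here.

```python
from statistics import mean


def bias_variance(predictions, targets):
    """Decompose squared error across independent runs (hypothetical helper).

    predictions[r][i] is the prediction of run r on test point i;
    targets[i] is the true value (assumed noise-free).
    Returns (mse, squared_bias, variance), each averaged over test points.
    """
    n_runs = len(predictions)
    n_points = len(targets)
    mse = bias_sq = variance = 0.0
    for i in range(n_points):
        column = [predictions[r][i] for r in range(n_runs)]
        avg = mean(column)  # average prediction over runs
        bias_sq += (avg - targets[i]) ** 2
        # Population variance (divide by n_runs) so that mse == bias_sq + variance.
        variance += mean((p - avg) ** 2 for p in column)
        mse += mean((p - targets[i]) ** 2 for p in column)
    return mse / n_points, bias_sq / n_points, variance / n_points


if __name__ == "__main__":
    runs = [[1.0, 2.0], [3.0, 2.0]]  # two runs, two test points (made-up numbers)
    truth = [2.0, 3.0]
    print(bias_variance(runs, truth))
```

With the population variance used here, the mean squared error equals the sum of the squared bias and the variance, so comparing the two terms shows how much of the error comes from run-to-run variation.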

2021 ◽  
pp. 1-23
Author(s):  
Léo Françoso Dal Piccol Sotto ◽  
Franz Rothlauf ◽  
Vinícius Veloso de Melo ◽  
Márcio P. Basgalupp

Abstract Linear Genetic Programming (LGP) represents programs as sequences of instructions and has a Directed Acyclic Graph (DAG) dataflow. The results of instructions are stored in registers that can be used as arguments by other instructions. Instructions that are disconnected from the main part of the program are called non-effective instructions, or structural introns. They also appear in other DAG-based GP approaches like Cartesian Genetic Programming (CGP). This paper studies four hypotheses on the role of structural introns: non-effective instructions (1) serve as evolutionary memory, where evolved information is stored and later used in search, (2) preserve population diversity, (3) allow neutral search, where structural introns increase the number of neutral mutations and improve performance, and (4) serve as genetic material to enable program growth. We study different variants of LGP controlling the influence of introns for symbolic regression, classification, and digital circuit problems. We find that there is (1) evolved information in the non-effective instructions that can be reactivated and that (2) structural introns can promote programs with higher effective diversity. However, both effects have no influence on LGP search performance. On the other hand, allowing mutations to be applied not only to effective but also to non-effective instructions (3) increases the rate of neutral mutations and (4) contributes to program growth by making use of the genetic material available as structural introns. This is accompanied by a significant increase in LGP performance, which makes structural introns important for LGP.
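
As an illustration of what "disconnected from the main part of the program" means, the sketch below marks effective instructions by walking a register program backwards from its output registers. The tuple layout `(dest, op, src1, src2)`, the register names, and the function name `effective_instructions` are assumptions made for this example; the abstract does not specify the paper's representation.

```python
def effective_instructions(program, output_registers):
    """Return indices of instructions that influence the output registers.

    program is a list of (dest, op, src1, src2) tuples (assumed layout).
    Instructions not returned are structural introns.
    """
    needed = set(output_registers)  # registers whose current value still matters
    effective = []
    for index in range(len(program) - 1, -1, -1):
        dest, _op, src1, src2 = program[index]
        if dest in needed:
            effective.append(index)
            # The instruction defines dest, so earlier writes to dest are
            # irrelevant unless dest is also read as a source below.
            needed.discard(dest)
            needed.update((src1, src2))
    effective.reverse()
    return effective


if __name__ == "__main__":
    program = [
        ("r1", "add", "x0", "x1"),  # feeds r0 below
        ("r2", "mul", "r1", "r1"),  # r2 never reaches the output
        ("r0", "sub", "r1", "x0"),  # writes the output register
        ("r2", "add", "r0", "x1"),  # r2 never reaches the output
    ]
    print(effective_instructions(program, ["r0"]))
```

In this made-up program, instructions 1 and 3 write a register that never reaches `r0`, so they are structural introns, and a mutation that touches only them leaves the program's output unchanged.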


Author(s):  
Luis Moya ◽  
Christian Geiss ◽  
Masakazu Hashimoto ◽  
Erick Mas ◽  
Shunichi Koshimura ◽  
...  

2020 ◽  
Vol 37 (7) ◽  
pp. 2517-2537
Author(s):  
Mostafa Rezvani Sharif ◽  
Seyed Mohammad Reza Sadri Tabaei Zavareh

Purpose The shear strength of reinforced concrete (RC) columns under cyclic lateral loading is a crucial concern, particularly in the seismic design of RC structures. Considering the cost of testing methods for measuring the real value of the shear strength factor and the number of parameters affecting the system behavior, numerical modeling techniques are highly valued by engineers and researchers. This study aims to propose a new model for estimating the shear strength of cyclically loaded circular RC columns through a robust computational intelligence approach, namely, linear genetic programming (LGP).
Design/methodology/approach LGP is a data-driven self-adaptive algorithm recently used for classification, pattern recognition and numerical modeling of engineering problems. A reliable database of 64 experimental data points is collected here for the development of LGP shear strength models. The obtained models are evaluated from both engineering and accuracy perspectives by means of several indicators and supplementary studies, and the optimal model is presented for further use. Additionally, the capability of LGP as an alternative approach for the numerical analysis of engineering problems is examined.
Findings A new predictive model is proposed for the estimation of the shear strength of cyclically loaded circular RC columns using the LGP approach. To demonstrate the capability of the proposed model, the analysis results are compared to those obtained by some well-known models recommended in the existing literature. The results confirm the potential of the LGP approach for numerical analysis of engineering problems and show that the obtained LGP model outperforms existing models in estimation and predictability.
Originality/value This paper mainly demonstrates the capability of the LGP approach as a robust alternative to existing analytical and numerical methods for modeling and analysis of relevant engineering approximation and estimation problems. The authors are confident that the proposed shear strength model can be used for design and pre-design purposes. The authors also declare that they have no conflict of interest.


2003 ◽  
Vol 11 (2) ◽  
pp. 169-206 ◽  
Author(s):  
Riccardo Poli ◽  
Nicholas Freitag McPhee

This paper is the second part of a two-part paper which introduces a general schema theory for genetic programming (GP) with subtree-swapping crossover (Part I (Poli and McPhee, 2003)). Like other recent GP schema theory results, the theory gives an exact formulation (rather than a lower bound) for the expected number of instances of a schema at the next generation. The theory is based on a Cartesian node reference system, introduced in Part I, and on the notion of a variable-arity hyperschema, introduced here, which generalises previous definitions of a schema. The theory includes two main theorems describing the propagation of GP schemata: a microscopic and a macroscopic schema theorem. The microscopic version is applicable to crossover operators which replace a subtree in one parent with a subtree from the other parent to produce the offspring. Therefore, this theorem is applicable to Koza's GP crossover with and without uniform selection of the crossover points, as well as one-point crossover, size-fair crossover, strongly-typed GP crossover, context-preserving crossover and many others. The macroscopic version is applicable to crossover operators in which the probability of selecting any two crossover points in the parents depends only on the parents' size and shape. In the paper we provide examples, we show how the theory can be specialised to specific crossover operators and we illustrate how it can be used to derive other general results. These include an exact definition of effective fitness and a size-evolution equation for GP with subtree-swapping crossover.


2009 ◽  
Vol 18 (05) ◽  
pp. 757-781 ◽  
Author(s):  
CÉSAR L. ALONSO ◽  
JOSÉ LUIS MONTAÑA ◽  
JORGE PUENTE ◽  
CRUZ ENRIQUE BORGES

Tree encodings of programs are well known for their representative power and are used very often in Genetic Programming. In this paper we experiment with a new data structure, named straight line program (slp), to represent computer programs. The main features of this structure are described, new recombination operators for GP related to slp's are introduced and a study of the Vapnik-Chervonenkis dimension of families of slp's is done. Experiments have been performed on symbolic regression problems. Results are encouraging and suggest that the GP approach based on slp's consistently outperforms conventional GP based on tree structured representations.
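
To give a rough picture of the data structure, the sketch below evaluates a straight line program in which each instruction applies an operator to inputs or to results of earlier instructions. The operand encoding (`("x", k)` for input k, `("u", j)` for the result of instruction j), the operator table, and the choice of the last instruction as the output are assumptions made for this example, not details taken from the paper.

```python
import operator

# Hypothetical operator table; the paper's function set is not given in the abstract.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate_slp(instructions, inputs):
    """Evaluate a straight line program and return the last instruction's value.

    instructions is a list of (op, a, b); each operand is ("x", k) for
    input k or ("u", j) for the result of an earlier instruction j.
    """
    values = []
    for index, (op, a, b) in enumerate(instructions):
        args = []
        for kind, ref in (a, b):
            if kind == "x":
                args.append(inputs[ref])
            elif kind == "u" and 0 <= ref < index:
                args.append(values[ref])  # reuse an earlier result
            else:
                raise ValueError(f"invalid operand {(kind, ref)} at instruction {index}")
        values.append(OPS[op](*args))
    return values[-1]


if __name__ == "__main__":
    slp = [
        ("*", ("x", 0), ("x", 0)),  # u0 = x0 * x0
        ("+", ("u", 0), ("x", 0)),  # u1 = u0 + x0
        ("*", ("u", 1), ("u", 0)),  # u2 = u1 * u0, reusing u0
    ]
    print(evaluate_slp(slp, [2]))
```

Here `u0` is computed once and used twice, whereas a tree encoding of the same expression would have to repeat the corresponding subtree.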


2017 ◽  
Vol 58 (8) ◽  
Author(s):  
Ruiying Li ◽  
Bernd R. Noack ◽  
Laurent Cordier ◽  
Jacques Borée ◽  
Fabien Harambat
