Genetic programming as a model induction engine

2000 ◽  
Vol 2 (1) ◽  
pp. 35-60 ◽  
Author(s):  
Vladan Babovic ◽  
Maarten Keijzer

Present-day instrumentation networks already provide immense quantities of data, very little of which provides any insight into the basic physical processes occurring in the measured medium. That is, the data by itself contributes little to the knowledge of such processes. Data mining and knowledge discovery aim to change this situation by providing technologies that will greatly facilitate the mining of data for knowledge. In this new setting the role of a human expert is to provide domain knowledge, interpret models suggested by the computer and devise further experiments that will provide even better data coverage. Clearly, there is an enormous amount of knowledge and understanding of physical processes that should not simply be thrown away. Consequently, we strongly believe that the most appropriate way forward is to combine the best of the two approaches: the theory-driven, understanding-rich approach and the data-driven discovery process. This paper describes a particular knowledge discovery algorithm, Genetic Programming (GP). Additionally, an augmented version of GP, dimensionally aware GP, which is arguably more useful in the process of scientific discovery, is described in great detail. Finally, the paper concludes with an application of dimensionally aware GP to the problem of inducing an empirical relationship describing the additional resistance to flow induced by flexible vegetation.

2004 ◽  
Vol 6 (3) ◽  
pp. 157-173 ◽  
Author(s):  
Orazio Giustolisi

Genetic Programming has been used to determine the Chèzy resistance coefficient for full circular corrugated channels. Three corrugated plastic pipes have been experimentally studied in order to generate data. The tests aim at measuring hydraulic parameters of the open-channel flow for slopes from 3.49% to 17.37% (2–10°), in order to discover the dependence of the channel resistance coefficient when wake-interference flow occurs. The monomial formula for the Chèzy resistance coefficient performs well on experimental data, both from measurement errors and from a technical point of view. In this paper, we present some very parsimonious formulae that have been created by Genetic Programming with few constants and which fit the data better than the monomial formula. Moreover, two of the Genetic Programming formulae, after ‘physical post-refinement’, seem to explain the role of the roughness in the Chèzy resistance coefficient for corrugated channels better than its traditional expression for rough channels does. This fact suggests that at least the structure of those formulae can be extrapolated to other types of corrugated channels. Finally, the work stresses the fact that the Genetic Programming hypothesis can be easily manipulated by means of ‘human’ physical insight. Therefore, Genetic Programming should be considered more than a simple data-driven technique, especially when it is used to perform scientific discovery.


2019 ◽  
Vol 32 (14) ◽  
pp. 4215-4234 ◽  
Author(s):  
Qin Su ◽  
Buwen Dong

Observational analysis indicates significant decadal changes in daytime, nighttime, and compound (both daytime and nighttime) heat waves (HWs) over China across the mid-1990s, featuring a rapid increase in frequency, intensity, and spatial extent. These observed decadal changes are assessed by comparing the present day (PD) of 1994–2011 with the early period (EP) of 1964–81. The compound HWs change most remarkably in all three aspects, with frequency averaged over China in the PD tripling that in the EP and intensity and spatial extent nearly doubling. The daytime and nighttime HWs also change significantly in all three aspects. A set of numerical experiments is used to investigate the drivers and physical processes responsible for the decadal changes of the HWs. Results indicate the predominant role of anthropogenic forcing, including changes in greenhouse gas (GHG) concentrations and anthropogenic aerosol (AA) emissions, in the HW decadal changes. The GHG changes have dominant impacts on the three types of HWs, while the AA changes significantly influence daytime HWs. The GHG changes increase the frequency, intensity, and spatial extent of the three types of HWs over China both directly via the strengthened greenhouse effect and indirectly via land–atmosphere and circulation feedbacks in which GHG-change-induced warming in sea surface temperature plays an important role. The AA changes decrease the frequency and intensity of daytime HWs over Southeastern China mainly through aerosol–radiation interaction, but increase the frequency and intensity of daytime HWs over Northeastern China through AA-change-induced surface–atmosphere feedbacks and dynamical changes related to a weakened East Asian summer monsoon.


2013 ◽  
Vol 26 (21) ◽  
pp. 8513-8528 ◽  
Author(s):  
Megan S. Mallard ◽  
Gary M. Lackmann ◽  
Anantha Aiyyer

A method of downscaling that isolates the effect of temperature and moisture changes on tropical cyclone (TC) activity was presented in Part I of this study. By applying thermodynamic modifications to analyzed initial and boundary conditions from past TC seasons, initial disturbances and the strength of synoptic-scale vertical wind shear are preserved in future simulations. This experimental design allows comparison of TC genesis events in the same synoptic setting, but in current and future thermodynamic environments. Simulations of both an active (September 2005) and inactive (September 2009) portion of past hurricane seasons are presented. An ensemble of high-resolution simulations projects reductions in ensemble-average TC counts between 18% and 24%, consistent with previous studies. Robust decreases in TC and hurricane counts are simulated with 18- and 6-km grid lengths, for both active and inactive periods. Physical processes responsible for reduced activity are examined through comparison of monthly and spatially averaged genesis-relevant parameters, as well as case studies of development of corresponding initial disturbances in current and future thermodynamic conditions. These case studies show that reductions in TC counts are due to the presence of incipient disturbances in marginal moisture environments, where increases in the moist entropy saturation deficits in future conditions preclude genesis for some disturbances. Increased convective inhibition and reduced vertical velocity are also found in the future environment. It is concluded that a robust decrease in TC frequency can result from thermodynamic changes alone, without modification of vertical wind shear or the number of incipient disturbances.


2016 ◽  
Vol 24 (4) ◽  
pp. 667-694 ◽  
Author(s):  
Stjepan Picek ◽  
Claude Carlet ◽  
Sylvain Guilley ◽  
Julian F. Miller ◽  
Domagoj Jakobovic

The role of Boolean functions is prominent in several areas including cryptography, sequences, and coding theory. Therefore, various methods for the construction of Boolean functions with desired properties are of direct interest. New motivations for the role of Boolean functions in cryptography, with attendant new properties, have emerged over the years. There are still many combinations of design criteria left unexplored, and in this matter evolutionary computation can play a distinct role. This article concentrates on two scenarios for the use of Boolean functions in cryptography. The first uses Boolean functions as the source of the nonlinearity in filter and combiner generators. Although relatively well explored using evolutionary algorithms, it still presents an interesting goal in terms of the practical sizes of Boolean functions. The second scenario appeared rather recently, where the objective is to find Boolean functions that have various orders of correlation immunity and minimal Hamming weight. In both scenarios we see that evolutionary algorithms are able to find high-quality solutions, with genetic programming performing best.


Author(s):  
J. Nichols ◽  
Albert Cohen ◽  
Peter Binev ◽  
Olga Mula

Parametric PDEs of the general form $$ \mathcal{P}(u,a)=0 $$ are commonly used to describe many physical processes, where $\mathcal{P}$ is a differential operator, $a$ is a high-dimensional vector of parameters and $u$ is the unknown solution belonging to some Hilbert space $V$. Typically one observes $m$ linear measurements of $u(a)$ of the form $\ell_i(u)=\langle w_i,u \rangle$, $i=1,\dots,m$, where $\ell_i\in V'$ and $w_i$ are the Riesz representers, and we write $W_m = \text{span}\{w_1,\ldots,w_m\}$. The goal is to recover an approximation $u^*$ of $u$ from the measurements. The solutions $u(a)$ lie in a manifold within $V$ which we can approximate by a linear space $V_n$ of moderate dimension $n$. The structure of the PDE ensures that for any $a$ the solution is never too far away from $V_n$, that is, $\text{dist}(u(a),V_n)\le \varepsilon$. In this setting, the observed measurements and $V_n$ can be combined to produce an approximation $u^*$ of $u$ up to accuracy $$ \Vert u -u^*\Vert \leq \beta^{-1}(V_n,W_m) \, \varepsilon $$ where $$ \beta(V_n,W_m) := \inf_{v\in V_n} \frac{\Vert P_{W_m}v\Vert}{\Vert v \Vert} $$ plays the role of a stability constant. For a given $V_n$, one relevant objective is to guarantee that $\beta(V_n,W_m)\geq \gamma >0$ with a number of measurements $m\geq n$ as small as possible. We present results in this direction when the measurement functionals $\ell_i$ belong to a complete dictionary.
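
To make the stability constant concrete, the following is a minimal sketch of how $\beta(V_n,W_m)$ might be evaluated numerically. It is not taken from the paper: it assumes $V$ is replaced by a finite-dimensional Euclidean space $\mathbb{R}^N$ with the standard inner product, and that $V_n$ and $W_m$ are given as the column spans of two matrices with linearly independent columns. The function name `stability_constant` is hypothetical. Under these assumptions, $\beta$ equals the smallest singular value of the cross-Gramian between orthonormal bases of $W_m$ and $V_n$.

```python
import numpy as np


def stability_constant(V, W):
    """Sketch: beta(V_n, W_m) = inf over v in V_n of ||P_W v|| / ||v||.

    Assumptions (not from the source): the ambient space is R^N with the
    Euclidean inner product, V is an (N, n) array and W an (N, m) array,
    each with linearly independent columns spanning V_n and W_m.
    """
    V = np.asarray(V, dtype=float)
    W = np.asarray(W, dtype=float)
    n, m = V.shape[1], W.shape[1]

    # With fewer measurements than reduced dimensions, some nonzero v in V_n
    # is orthogonal to W_m, so the infimum is zero.
    if m < n:
        return 0.0

    # Orthonormal bases of the two subspaces (reduced QR factorisation).
    Qv, _ = np.linalg.qr(V)
    Qw, _ = np.linalg.qr(W)

    # For v = Qv @ c we have ||v|| = ||c|| and ||P_W v|| = ||Qw.T @ Qv @ c||,
    # so beta is the smallest singular value of the (m, n) cross-Gramian.
    G = Qw.T @ Qv
    singular_values = np.linalg.svd(G, compute_uv=False)  # sorted descending
    return float(singular_values[-1])


# Example: V_n = span{e1}, W_m = span{e1 + e2} in R^3.
V_example = np.array([[1.0], [0.0], [0.0]])
W_example = np.array([[1.0], [1.0], [0.0]])
beta_example = stability_constant(V_example, W_example)
```

In a discretised setting with a non-Euclidean inner product, the bases would first have to be orthonormalised with respect to that inner product; the sketch above does not handle that case.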


2020 ◽  
Author(s):  
Harith Al-Sahaf ◽  
A Song ◽  
K Neshatian ◽  
Mengjie Zhang

Image classification is a complex but important task, especially in areas of machine vision and image analysis such as remote sensing and face recognition. One of the challenges in image classification is finding an optimal set of features for a particular task, because the choice of features has a direct impact on classification performance. However, the goodness of a feature is highly problem dependent, and domain knowledge is often required. To address these issues we introduce a Genetic Programming (GP) based image classification method, Two-Tier GP, which operates directly on raw pixels rather than features. The first tier in a classifier automatically defines features based on raw image input, while the second tier makes the decision. Compared to conventional feature-based image classification methods, Two-Tier GP achieved better accuracies on a range of different tasks. Furthermore, by using the features defined by the first tier of these Two-Tier GP classifiers, conventional classification methods obtained higher accuracies than when classifying on manually designed features. Analysis of evolved Two-Tier image classifiers shows that there are genuine features captured in the programs and that the mechanism of achieving high accuracy can be revealed. The Two-Tier GP method has clear advantages in image classification, such as high accuracy, good interpretability and the removal of the explicit feature extraction process. © 2012 IEEE.

