HyperNOMAD

2021 ◽  
Vol 47 (3) ◽  
pp. 1-27
Author(s):  
Dounia Lakhmiri ◽  
Sébastien Le Digabel ◽  
Christophe Tribes

The performance of deep neural networks is highly sensitive to the choice of the hyperparameters that define the structure of the network and the learning process. When facing a new application, tuning a deep neural network is a tedious and time-consuming process that is often described as a “dark art.” This explains the necessity of automating the calibration of these hyperparameters. Derivative-free optimization is a field that develops methods designed to optimize time-consuming functions without relying on derivatives. This work introduces the HyperNOMAD package, an extension of the NOMAD software that applies the MADS algorithm [7] to simultaneously tune the hyperparameters responsible for both the architecture and the learning process of a deep neural network (DNN). This generic approach allows for considerable flexibility in the exploration of the search space by taking advantage of categorical variables. HyperNOMAD is tested on the MNIST, Fashion-MNIST, and CIFAR-10 datasets and achieves results comparable to the current state of the art.
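As a rough illustration of the blackbox view taken by DFO-based hyperparameter tuners such as HyperNOMAD, the sketch below wraps training as a function that maps a mixed (categorical, integer, and continuous) hyperparameter configuration to a validation error. The toy network and data are stand-ins and this is not HyperNOMAD's actual interface.

```python
# Minimal sketch, assuming a toy classifier and synthetic data: a DFO solver only
# ever sees calls to `blackbox`, never gradients of the training process.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.25, random_state=0)

def blackbox(num_layers, width, activation, learning_rate):
    """Return the validation error of one hyperparameter configuration."""
    model = MLPClassifier(hidden_layer_sizes=(width,) * num_layers,
                          activation=activation,           # categorical variable
                          learning_rate_init=learning_rate,
                          max_iter=200, random_state=0)
    model.fit(X_tr, y_tr)
    return 1.0 - model.score(X_va, y_va)    # value the DFO solver minimizes

# One evaluation of a single point in the mixed search space.
print(blackbox(num_layers=2, width=64, activation="relu", learning_rate=1e-3))
```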

Author(s):  
Guiying Li ◽  
Chao Qian ◽  
Chunhui Jiang ◽  
Xiaofen Lu ◽  
Ke Tang

Layer-wise magnitude-based pruning (LMP) is a very popular method for deep neural network (DNN) compression. However, tuning the layer-specific thresholds is a difficult task, since the space of threshold candidates is exponentially large and the evaluation is very expensive. Previous methods tune them mainly by hand and require expertise. In this paper, we propose an automatic tuning approach based on optimization, named OLMP. The idea is to transform the threshold tuning problem into a constrained optimization problem (i.e., minimizing the size of the pruned model subject to a constraint on the accuracy loss), and then use powerful derivative-free optimization algorithms to solve it. To compress a trained DNN, OLMP is conducted within a new iterative pruning and adjusting pipeline. Empirical results show that OLMP achieves the best pruning ratio on LeNet-style models (i.e., 114 times for LeNet-300-100 and 298 times for LeNet-5) compared with some state-of-the-art DNN pruning methods, and can reduce the size of an AlexNet-style network by up to 82 times without accuracy loss.
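The sketch below illustrates the constrained formulation described above, not the paper's exact OLMP pipeline: per-layer thresholds zero out small-magnitude weights, and the accuracy constraint is folded into a penalty so that a derivative-free optimizer can tune the thresholds. The weights and the `evaluate_accuracy` function are hypothetical stand-ins.

```python
# Illustrative sketch, assuming toy layer weights and a fake accuracy oracle.
import numpy as np

rng = np.random.default_rng(0)
layers = [rng.normal(size=(300, 100)), rng.normal(size=(100, 10))]  # toy weights

def prune(weights, thresholds):
    """Zero out weights whose magnitude falls below the layer's threshold."""
    return [np.where(np.abs(w) < t, 0.0, w) for w, t in zip(weights, thresholds)]

def evaluate_accuracy(weights):
    # Hypothetical placeholder: in practice this runs the pruned DNN on a
    # validation set; here accuracy just degrades mildly with sparsity.
    kept = sum(np.count_nonzero(w) for w in weights) / sum(w.size for w in weights)
    return 0.9 * min(1.0, 0.5 + kept)

def objective(thresholds, base_accuracy=0.9, max_loss=0.01, penalty=1e3):
    pruned = prune(layers, thresholds)
    size = sum(np.count_nonzero(w) for w in pruned)           # minimize model size
    excess = max(0.0, base_accuracy - evaluate_accuracy(pruned) - max_loss)
    return size + penalty * excess                            # constraint as penalty

print(objective([0.5, 0.3]))   # one evaluation a DFO solver would request
```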


2021 ◽  
Author(s):  
Giacomo Bertoldi ◽  
Stefano Campanella ◽  
Emanuele Cordano ◽  
Alberto Sartori

Proper characterization of uncertainty remains a major research and operational challenge in Earth and Environmental Systems Models (EESMs). In fact, model calibration is often more an art than a science: one must make several discretionary choices, guided more by one's own experience and intuition than by the scientific method. In practice, this means that the result of calibration (CA) could be suboptimal. One of the challenges of CA is the large number of parameters involved in EESMs, which are therefore usually selected with the help of a preliminary sensitivity analysis (SA). Finally, the computational burden of EESMs and the large volume of the search space make SA and CA very time-consuming processes.

This work applies a modern HPC approach to optimize a complex, over-parameterized hydrological model, improving the computational efficiency of SA/CA. We apply the derivative-free optimization algorithms implemented in the Facebook Nevergrad Python library (Rapin and Teytaud, 2018) on an HPC cluster, thanks to the Dask framework (Dask Development Team, 2016).

The approach has been applied to the GEOtop hydrological model (Rigon et al., 2006; Endrizzi et al., 2014) to predict the time evolution of variables such as soil water content and evapotranspiration for several mountain agricultural sites in South Tyrol with different elevations, land cover (pasture, meadow, orchard), and soil types.

We performed simulations on one-dimensional domains, where the model solves the energy and water budget equations in a column of soil and neglects the lateral water fluxes. Even neglecting the distribution of parameters across soil layers and considering a homogeneous column, one has tens of parameters controlling soil and vegetation properties, of which only a few are experimentally available.

Because the interpretation of global SA could be difficult or misleading and the number of model evaluations needed by SA is comparable with that of CA, we employed the following strategy. We performed CA using a full set of continuous parameters and SA after CA, using the samples collected during CA, to interpret the results. However, given the above-mentioned computational challenges, this strategy is possible only using HPC resources. For this reason, we focused on the computational aspects of calibration from an HPC perspective and examined the scaling of these algorithms and their implementation up to 1024 cores on a cluster. Other issues that we had to address were the complex shape of the search space and the robustness of CA and SA against model convergence failures.

HPC techniques make it possible to calibrate models with a large number of parameters within a reasonable computing time while exploring the parameter space properly. This is particularly important with noisy, multimodal objective functions. In our case, HPC was essential to determine the parameters controlling the water retention curve, which is highly nonlinear. The developed framework, which is published and freely available on GitHub, also shows how libraries and tools used within the machine learning community can be useful and easily adapted to EESM CA.
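The abstract describes dispatching Nevergrad's derivative-free optimizers across a cluster via Dask. The sketch below shows a minimal version of that pattern; the `run_geotop` objective, the ten-parameter bounds, and the budget are assumptions standing in for an actual GEOtop calibration run.

```python
# Minimal sketch, assuming a local Dask cluster and a placeholder objective.
import nevergrad as ng
from dask.distributed import Client

def run_geotop(x):
    # Placeholder for launching the hydrological model with parameter vector x
    # and returning a misfit against observed soil water content.
    return float(((x - 0.3) ** 2).sum())

if __name__ == "__main__":
    client = Client()                                   # connect to (or start) a Dask cluster
    param = ng.p.Array(shape=(10,)).set_bounds(0.0, 1.0)   # 10 soil/vegetation parameters
    optimizer = ng.optimizers.NGOpt(parametrization=param, budget=512, num_workers=32)
    # Evaluations are submitted to the cluster through the Dask client's executor API.
    recommendation = optimizer.minimize(run_geotop, executor=client, batch_mode=False)
    print(recommendation.value)
```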


10.2196/17125 ◽  
2020 ◽  
Vol 8 (4) ◽  
pp. e17125 ◽  
Author(s):  
Louis Falissard ◽  
Claire Morgand ◽  
Sylvie Roussel ◽  
Claire Imbaud ◽  
Walid Ghosn ◽  
...  

Background: Coding of underlying causes of death from death certificates is a process that is nowadays undertaken mostly by humans, with potential assistance from expert systems such as the Iris software. It is, consequently, an expensive process that can, in addition, suffer from geospatial discrepancies, thus severely impairing the comparability of death statistics at the international level. Recent advances in artificial intelligence, specifically the rise of deep learning methods, have enabled computers to make efficient decisions on a number of complex problems that were typically considered out of reach without human assistance; these methods require a considerable amount of data to learn from, which is typically their main limiting factor. However, the CépiDc (Centre d’épidémiologie sur les causes médicales de Décès) stores an exhaustive database of death certificates at the French national scale, amounting to several million training examples available to the machine learning practitioner.

Objective: This article investigates the application of deep neural network methods to the coding of underlying causes of death.

Methods: The investigated dataset was based on data from every French death certificate from 2000 to 2015, containing information such as the subject’s age and gender, as well as the chain of events leading to his or her death, for a total of around 8 million observations. The task of automatically coding the subject’s underlying cause of death was then formulated as a predictive modelling problem. A deep neural network−based model was then designed and fit to the dataset. Its error rate was then assessed on an external test dataset and compared to the current state of the art (ie, the Iris software). Statistical significance of the proposed approach’s superiority was assessed via bootstrap.

Results: The proposed approach resulted in a test accuracy of 97.8% (95% CI 97.7-97.9), which constitutes a significant improvement over the current state of the art and its accuracy of 74.5% (95% CI 74.0-75.0) assessed on the same test data. Such an improvement opens up a whole field of new applications, from nosologist-level batch-automated coding to international and temporal harmonization of cause-of-death statistics. A typical example of such an application is demonstrated by recoding French overdose-related deaths from 2000 to 2010.

Conclusions: This article shows that deep artificial neural networks are perfectly suited to the analysis of electronic health records and can learn a complex set of medical rules directly from voluminous datasets, without any explicit prior knowledge. Although not entirely free from mistakes, the derived algorithm constitutes a powerful decision-making tool that is able to handle structured medical data with unprecedented performance. We strongly believe that the methods developed in this article are highly reusable in a variety of settings related to epidemiology, biostatistics, and the medical sciences in general.
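To make the predictive-modelling framing concrete, the sketch below shows one hypothetical way such a classifier could be structured: the chain of reported causes is treated as a sequence of ICD codes that, together with age and sex, feeds a network predicting the underlying cause. The vocabulary size, number of classes, and architecture are assumptions and not the article's actual model.

```python
# Illustrative architecture sketch, assuming padded sequences of ICD code indices.
import tensorflow as tf

VOCAB = 20000        # number of distinct ICD codes (assumed)
N_CLASSES = 4000     # number of possible underlying causes (assumed)

codes = tf.keras.Input(shape=(20,), dtype="int32", name="reported_codes")
demo = tf.keras.Input(shape=(2,), name="age_and_sex")

x = tf.keras.layers.Embedding(VOCAB, 64, mask_zero=True)(codes)
x = tf.keras.layers.Bidirectional(tf.keras.layers.GRU(128))(x)
x = tf.keras.layers.concatenate([x, demo])
x = tf.keras.layers.Dense(256, activation="relu")(x)
out = tf.keras.layers.Dense(N_CLASSES, activation="softmax")(x)

model = tf.keras.Model(inputs=[codes, demo], outputs=out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```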


2021 ◽  
Vol 7 ◽  
pp. e693
Author(s):  
Runze Yang ◽  
Teng Long

In recent years, graph convolutional networks (GCNs) have emerged rapidly due to their excellent performance in graph data processing. However, recent research shows that GCNs are vulnerable to adversarial attacks. An attacker can maliciously modify edges or nodes of the graph to mislead the model’s classification of the target nodes, or even cause a degradation of the model’s overall classification performance. In this paper, we first propose a black-box adversarial attack framework based on derivative-free optimization (DFO) that generates graph adversarial examples without using gradients and can conveniently apply advanced DFO algorithms. Second, we implement a direct attack algorithm (DFDA) on top of the framework using the Nevergrad library. Additionally, we overcome the problem of the large search space by redesigning the perturbation vector with a constrained size. Finally, we conduct a series of experiments on different datasets and parameters. The results show that DFDA outperforms Nettack in most cases, and that it can achieve an average attack success rate of more than 95% on the Cora dataset when perturbing at most eight edges. This demonstrates that our framework can fully exploit the potential of DFO methods in node classification adversarial attacks.
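One way to picture the black-box formulation with a constrained-size perturbation vector is sketched below: the solver searches over which of a small, fixed list of candidate edges to flip, and only observes the loss returned by the model. The candidate edges and the `attack_loss` placeholder are assumptions; this is not the paper's DFDA implementation.

```python
# Hypothetical sketch, assuming a short list of candidate edges and a fake loss.
import nevergrad as ng

candidate_edges = [(0, 5), (0, 9), (3, 7), (2, 11), (4, 8), (1, 6), (5, 10), (9, 12)]

def attack_loss(flip_mask):
    # Placeholder: in practice, flip the selected edges in the adjacency matrix,
    # run the GCN on the perturbed graph, and return the classification margin
    # of the target node (lower means the attack is closer to succeeding).
    flipped = [e for e, f in zip(candidate_edges, flip_mask) if f]
    return 1.0 - 0.1 * len(flipped)          # toy stand-in for the real margin

param = ng.p.Choice([0, 1], repetitions=len(candidate_edges))  # one on/off flag per edge
opt = ng.optimizers.NGOpt(parametrization=param, budget=200)
best = opt.minimize(attack_loss)
print(best.value)   # tuple of 0/1 flags selecting which of the eight edges to flip
```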


2021 ◽  
Author(s):  
Fernando Buzzulini Prioste

This paper presents a genetic algorithm (GA) to solve Optimal Power Flow (OPF) problems, optimizing electricity generation fuel cost. The GA-based OPF is a derivative-free optimization technique that relies solely on evaluations of the objective function at several points in the parameter search space. A 3-bus system and the IEEE 30-bus test system are used to validate the developed GA-based OPF through comparisons with an interior-point-based optimal power flow.
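The toy sketch below shows the general flavour of such a GA (not the paper's implementation): three generators with assumed quadratic fuel-cost coefficients are dispatched to meet an assumed 300 MW demand, with the power balance enforced through a penalty.

```python
# Toy GA sketch, assuming quadratic cost coefficients and a single demand constraint.
import numpy as np

rng = np.random.default_rng(1)
a = np.array([0.010, 0.012, 0.008])     # assumed quadratic cost coefficients
b = np.array([2.0, 1.8, 2.2])
c = np.array([10.0, 12.0, 8.0])
p_min, p_max, demand = 10.0, 150.0, 300.0

def fitness(p):
    cost = np.sum(a * p**2 + b * p + c)
    return cost + 1e3 * abs(np.sum(p) - demand)        # penalize power imbalance

pop = rng.uniform(p_min, p_max, size=(50, 3))          # initial population
for _ in range(200):                                   # generations
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[:25]]             # selection: keep the best half
    children = (parents[rng.integers(25, size=25)] +
                parents[rng.integers(25, size=25)]) / 2    # crossover: averaging
    children += rng.normal(0.0, 2.0, children.shape)       # mutation
    pop = np.clip(np.vstack([parents, children]), p_min, p_max)

best = pop[np.argmin([fitness(ind) for ind in pop])]
print(best, fitness(best))
```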


Author(s):  
Giuseppe Ughi ◽  
Vinayak Abrol ◽  
Jared Tanner

We perform a comprehensive study on the performance of derivative-free optimization (DFO) algorithms for the generation of targeted black-box adversarial attacks on Deep Neural Network (DNN) classifiers, assuming the perturbation energy is bounded by an $\ell_\infty$ constraint and the number of queries to the network is limited. This paper considers four pre-existing state-of-the-art DFO-based algorithms along with a further developed algorithm built on BOBYQA, a model-based DFO method. We compare these algorithms in a variety of settings according to the fraction of images that they successfully misclassify given a maximum number of queries to the DNN. The experiments disclose how the likelihood of finding an adversarial example depends on both the algorithm used and the setting of the attack; algorithms that limit the search for adversarial examples to the vertices of the $\ell_\infty$ constraint work particularly well without structural defenses, while the presented BOBYQA-based algorithm works better for especially small perturbation energies. This variance in performance highlights the importance of comparing new algorithms to the state of the art in a variety of settings, and of testing the effectiveness of adversarial defenses using as wide a range of algorithms as possible.
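As a rough illustration of the setup, the sketch below frames a targeted attack with an ℓ∞-bounded perturbation as a box-constrained derivative-free problem and hands it to Py-BOBYQA, one available implementation of BOBYQA. The `query_network` placeholder, the toy image, and the loss are assumptions; this is not the algorithm developed in the paper.

```python
# Hedged sketch, assuming a 32-dimensional toy "image" and a fake logit oracle.
import numpy as np
import pybobyqa

eps = 0.1                                          # ℓ∞ bound on the perturbation
x0_image = np.random.default_rng(0).random(32)     # toy flattened image in [0, 1]

def query_network(image):
    # Placeholder: in practice this returns the black-box DNN's logits for `image`.
    return np.tanh(image * np.arange(1, image.size + 1) / image.size)

def attack_loss(delta):
    logits = query_network(np.clip(x0_image + delta, 0.0, 1.0))
    # Targeted attack on class 3: drive the target logit above all others.
    return float(np.max(np.delete(logits, 3)) - logits[3])

soln = pybobyqa.solve(attack_loss, np.zeros_like(x0_image),
                      bounds=(-eps * np.ones(32), eps * np.ones(32)),
                      maxfun=500)
print(soln.f)   # negative value means the targeted misclassification succeeded
```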


Author(s):  
Kiyohiko Uehara ◽  
Kaoru Hirota

A method is proposed for evaluating fuzzy rules independently of each other in fuzzy rules learning. The proposed method is named α-FUZZI-ES (α-weight-based fuzzy-rule independent evaluations) in this paper. In α-FUZZI-ES, the evaluation value of a fuzzy system is divided out among the fuzzy rules by using the compatibility degrees of the learning data. By the effective use of α-FUZZI-ES, a method for fast fuzzy rules learning is proposed. This is named α-FUZZI-ES learning (α-FUZZI-ES-based fuzzy rules learning) in this paper. α-FUZZI-ES learning is especially effective when evaluation functions are not differentiable and derivative-based optimization methods cannot be applied to fuzzy rules learning. α-FUZZI-ES learning makes it possible to optimize fuzzy rules independently of each other. This property reduces the dimensionality of the search space in finding the optimum fuzzy rules. Thereby, α-FUZZI-ES learning can attain fast convergence in fuzzy rules optimization. Moreover, α-FUZZI-ES learning can be efficiently performed with hardware in parallel to optimize fuzzy rules independently of each other. Numerical results show that α-FUZZI-ES learning is superior to the exemplary conventional scheme in terms of accuracy and convergence speed when the evaluation function is non-differentiable.
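The sketch below is a rough, hypothetical reading of the rule-wise evaluation described above: each learning sample's evaluation value is apportioned to the fuzzy rules in proportion to their normalized compatibility degrees, so every rule receives its own evaluation value and can be tuned independently. The numbers are toy data and this is not the paper's code.

```python
# Illustrative sketch, assuming toy compatibility degrees and per-sample errors.
import numpy as np

compat = np.array([[0.8, 0.2, 0.0],     # compatibility degree of each rule (columns)
                   [0.1, 0.6, 0.3],     # for each learning sample (rows)
                   [0.0, 0.4, 0.6]])
sample_error = np.array([0.2, 0.5, 0.1])   # toy per-sample evaluation values

weights = compat / compat.sum(axis=1, keepdims=True)   # normalized weights per sample
rule_eval = weights.T @ sample_error                   # evaluation divided among rules
print(rule_eval)    # one evaluation value per rule, usable for independent tuning
```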


Author(s):  
S. Vijaya Rani ◽  
G. N. K. Suresh Babu

Illegal hackers penetrate the servers and networks of corporate and financial institutions to gain money and extract vital information. Attacks vary from a single computing system to many systems. Hackers gain access by sending malicious packets into the network through viruses, worms, Trojan horses, etc. They scan a network with various tools and collect information about the network and its hosts. Hence, it is essential to detect attacks as they enter a network. The methods available for intrusion detection include Naive Bayes, decision trees, support vector machines, k-nearest neighbors, and artificial neural networks. A neural network consists of processing units connected in a complex manner and is able to store information and make it functional for use. It acts like the human brain and acquires knowledge from the environment through a training and learning process. Many algorithms are available for this learning process. This work carries out research on the analysis of malicious packets and on predicting the error rate in the detection of injured packets through artificial neural network algorithms.


2019 ◽  
Vol 64 (2) ◽  
pp. 53-71
Author(s):  
Botond Benedek ◽  
Ede László

Customer segmentation represents a true challenge in the automobile insurance industry, as datasets are large, multidimensional, and unbalanced, and segmentation also requires a unique price determination based on the risk profile of the customer. Furthermore, the price of an insurance policy or the validity of a compensation claim must, in most cases, be determined instantly. Therefore, the purpose of this research is to identify an easily usable data mining tool that is capable of identifying key automobile insurance fraud indicators, facilitating the segmentation. In addition, the methods used by the tool should be based primarily on numerical and categorical variables, as there is no well-functioning text mining tool for Central and Eastern European languages. Hence, we decided on the SQL Server Analysis Services (SSAS) tool and compared the performance of the decision tree, neural network, and Naïve Bayes methods. The results suggest that the decision tree and the neural network are more suitable than Naïve Bayes; however, the best conclusions can be drawn when the decision tree and the neural network are used together.

