A Comparison of Regularization Techniques in Deep Neural Networks

Ismoilov Nusrat; Sung-Bong Jang

doi:10.3390/sym10110648

Symmetry ◽

10.3390/sym10110648 ◽

2018 ◽

Vol 10 (11) ◽

pp. 648 ◽

Cited By ~ 6

Author(s):

Ismoilov Nusrat ◽

Sung-Bong Jang

Keyword(s):

Neural Network ◽

Neural Networks ◽

Network Model ◽

Neural Network Model ◽

Data Augmentation ◽

Training Data ◽

Regularization Techniques ◽

Performance Results ◽

Normalization Scheme ◽

Significant Attention

Artificial neural networks (ANNs) have attracted significant attention from researchers because many complex problems can be solved by training them. If enough data are provided during the training process, ANNs are capable of achieving good performance results. However, if the training data are insufficient, the predefined neural network model suffers from overfitting and underfitting problems. To solve these problems, several regularization techniques have been devised and widely applied to applications and data analysis. However, it is difficult for developers to choose the most suitable scheme for an application under development because there is no information regarding the performance of each scheme. This paper describes comparative research on regularization techniques by evaluating the training and validation errors in a deep neural network model, using a weather dataset. For comparison, each algorithm was implemented using TensorFlow, a recent neural network library. The experimental results showed that the autoencoder had the worst performance among the schemes. When the prediction accuracy was compared, data augmentation and batch normalization showed better performance than the others.
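
The kind of comparison described above (training error versus validation error, with and without a regularizer) can be illustrated with a minimal sketch. It is not the paper's TensorFlow experiment: it assumes a one-weight linear model with an L2 weight penalty, and the data values, the function names `fit_ridge_1d` and `mse`, and the penalty strength are all made up for illustration.

```python
def fit_ridge_1d(xs, ys, lam):
    """Closed-form weight for y ~ w * x minimizing squared error + lam * w**2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def mse(xs, ys, w):
    """Mean squared error of the prediction w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up toy data (assumption; not the weather dataset used in the paper).
train_x, train_y = [1.0, 2.0, 3.0], [1.2, 1.9, 3.4]
val_x, val_y = [1.5, 2.5], [1.4, 2.6]

results = {}
for lam in (0.0, 1.0):  # 0.0 = no regularization, 1.0 = hypothetical penalty
    w = fit_ridge_1d(train_x, train_y, lam)
    results[lam] = (w, mse(train_x, train_y, w), mse(val_x, val_y, w))
    print(f"lam={lam}: w={w:.4f} train_mse={results[lam][1]:.4f} "
          f"val_mse={results[lam][2]:.4f}")
```

The penalty always shrinks the weight and can never lower the training error; whether it lowers the validation error depends on the data, which is why the two errors are reported side by side.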

Dynamic versus static neural network model for rainfall forecasting at Klang River Basin, Malaysia

Hydrology and Earth System Sciences ◽

10.5194/hess-16-1151-2012 ◽

2012 ◽

Vol 16 (4) ◽

pp. 1151-1169 ◽

Cited By ~ 34

Author(s):

A. El-Shafie ◽

A. Noureldin ◽

M. Taha ◽

A. Hussain ◽

M. Mukhlisin

Keyword(s):

Neural Network ◽

Neural Networks ◽

Network Model ◽

Neural Network Model ◽

Network Architecture ◽

Rainfall Time Series ◽

Hydrological Process ◽

Multi Layer Perceptron ◽

Rainfall Forecasting ◽

Unseen Data

Abstract. Rainfall is considered one of the major components of the hydrological process; it plays a significant part in evaluating drought and flooding events. Therefore, it is important to have an accurate model for rainfall forecasting. Recently, several data-driven modeling approaches, such as multi-layer perceptron neural networks (MLP-NN), have been investigated to perform such forecasting tasks. Rainfall time series modeling, however, involves an important temporal dimension, whereas the classical MLP-NN is a static, memoryless network architecture that is effective for complex nonlinear static mapping. This research focuses on investigating the potential of introducing a neural network that could address the temporal relationships of the rainfall series. Two static neural networks and one dynamic neural network, namely the multi-layer perceptron neural network (MLP-NN), the radial basis function neural network (RBFNN) and the input delay neural network (IDNN), respectively, were examined in this study. These models were developed for two time horizons, monthly and weekly rainfall forecasting, at Klang River, Malaysia. Data collected over 12 yr (1997–2008) on a weekly basis and 22 yr (1987–2008) on a monthly basis were used to develop and examine the performance of the proposed models. Comprehensive comparison analyses were carried out to evaluate the performance of the proposed static and dynamic neural networks. Results showed that the MLP-NN model is able to follow trends of the actual rainfall, though not very accurately. The RBFNN model achieved better accuracy than the MLP-NN model. Moreover, the forecasting accuracy of the IDNN model was better than that of the static networks during both training and testing stages, which demonstrates a consistent level of accuracy with seen and unseen data.
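
An input delay network gives a static network access to temporal context by feeding it a window of past values (a tapped delay line). The sketch below shows only that windowing step, under assumptions: the abstract does not state how many delays were used, so `n_delays=2` is hypothetical, as are the function name `make_delay_windows` and the rainfall values.

```python
def make_delay_windows(series, n_delays):
    """Build (inputs, targets) where each input is the n_delays values before its target."""
    if n_delays < 1 or n_delays >= len(series):
        raise ValueError("n_delays must be between 1 and len(series) - 1")
    inputs = [series[i - n_delays:i] for i in range(n_delays, len(series))]
    targets = [series[i] for i in range(n_delays, len(series))]
    return inputs, targets


# Made-up rainfall values (assumption; not the Klang River data).
rainfall = [12.0, 30.5, 8.2, 0.0, 19.4]
X, y = make_delay_windows(rainfall, n_delays=2)
for window, target in zip(X, y):
    print(window, "->", target)
```

Each `(window, target)` pair can then be used to train an ordinary feedforward network, which is what distinguishes this setup from a purely static mapping of one input to one output.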

Prospects for recurrent neural network models to learn RNA biophysics from high-throughput data

10.1101/227611 ◽

2017 ◽

Author(s):

Michelle J Wu ◽

Johan OL Andreasson ◽

Wipapat Kladwang ◽

William J Greenleaf ◽

Rhiju Das ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Free Energy ◽

Recurrent Neural Network ◽

Neural Network Model ◽

Data Augmentation ◽

Rna Folding ◽

Large Datasets ◽

Turner Model ◽

Rna Complexes

Abstract. RNA is a functionally versatile molecule that plays key roles in genetic regulation and in emerging technologies to control biological processes. Computational models of RNA secondary structure are well-developed but often fall short in making quantitative predictions of the behavior of multi-RNA complexes. Recently, large datasets characterizing hundreds of thousands of individual RNA complexes have emerged as rich sources of information about RNA energetics. Meanwhile, advances in machine learning have enabled the training of complex neural networks from large datasets. Here, we assess whether a recurrent neural network model, Ribonet, can learn from high-throughput binding data, using simulation and experimental studies to test model accuracy and to determine whether the model learned meaningful information about the biophysics of RNA folding. We began by evaluating the model on energetic values predicted by the Turner model to assess whether the neural network could learn a representation that recovered known biophysical principles. First, we trained Ribonet to predict the simulated free energy of an RNA in complex with multiple input RNAs. Our model accurately predicts free energies of new sequences and also shows evidence of having learned base pairing information, as assessed by in silico double mutant analysis. Next, we extended this model to predict the simulated affinity between an arbitrary RNA sequence and a reporter RNA. While these more indirect measurements precluded the learning of basic principles of RNA biophysics, the resulting model achieved sub-kcal/mol accuracy and enabled design of simple RNA input responsive riboswitches with high activation ratios predicted by the Turner model from which the training data were generated. Finally, we compiled and trained on an experimental dataset comprising over 600,000 experimental affinity measurements published on the Eterna open laboratory. Though our tests revealed that the model likely did not learn a physically realistic representation of RNA interactions, it nevertheless achieved good performance of 0.76 kcal/mol on test sets with the application of transfer learning and novel sequence-specific data augmentation strategies. These results suggest that recurrent neural network architectures, despite being naïve to the physics of RNA folding, have the potential to capture complex biophysical information. However, more diverse datasets, ideally involving more direct free energy measurements, may be necessary to train de novo predictive models that are consistent with the fundamentals of RNA biophysics.

Author Summary. The precise design of RNA interactions is essential to gaining greater control over RNA-based biotechnology tools, including designer riboswitches and CRISPR-Cas9 gene editing. However, the classic model for energetics governing these interactions fails to quantitatively predict the behavior of RNA molecules. We developed a recurrent neural network model, Ribonet, to quantitatively predict these values from sequence alone. Using simulated data, we show that this model is able to learn simple base pairing rules, despite having no a priori knowledge about RNA folding encoded in the network architecture. This model also enables design of new switching RNAs that are predicted to be effective by the “ground truth” simulated model. We applied transfer learning to retrain Ribonet using hundreds of thousands of RNA-RNA affinity measurements and demonstrate simple data augmentation techniques that improve model performance. At the same time, the diversity of the data currently available sets limits on Ribonet’s accuracy. Recurrent neural networks are a promising tool for modeling nucleic acid biophysics and may enable design of complex RNAs for novel applications.

Credit Scoring Using Supervised and Unsupervised Neural Networks

Neural Networks in Business ◽

10.4018/978-1-930708-31-0.ch010 ◽

2002 ◽

pp. 154-166 ◽

Cited By ~ 2

Author(s):

David West ◽

Cornelius Muchineuta

Keyword(s):

Neural Network ◽

Neural Networks ◽

Decision Support ◽

Network Model ◽

Neural Network Model ◽

Credit Scoring ◽

Self Organizing Map ◽

Subprime Lending ◽

The Neural Network ◽

Combined Use

Some of the concerns that plague developers of neural network decision support systems include: (a) How do I understand the underlying structure of the problem domain; (b) How can I discover unknown imperfections in the data which might detract from the generalization accuracy of the neural network model; and (c) What variables should I include to obtain the best generalization properties in the neural network model? In this paper we explore the combined use of unsupervised and supervised neural networks to address these concerns. We develop and test a credit-scoring application using a self-organizing map and a multilayered feedforward neural network. The final product is a neural network decision support system that facilitates subprime lending and is flexible and adaptive to the needs of e-commerce applications.

Estimation of Polypropylene Melt Index through the Application of Stacked Neural Networks

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.187.411 ◽

2011 ◽

Vol 187 ◽

pp. 411-415

Author(s):

Lu Yue Xia ◽

Hai Tian Pan ◽

Meng Fei Zhou ◽

Yi Jun Cai ◽

Xiao Fang Sun

Keyword(s):

Neural Network ◽

Neural Networks ◽

Network Model ◽

Neural Network Model ◽

Model Performance ◽

Measurement Interval ◽

Melt Index ◽

Polypropylene Melt ◽

Single Neural Network

Melt index is the most important parameter in determining the polypropylene grade. Owing to the lack of proper on-line instruments, its measurement interval and delay are both very long, which makes quality control quite difficult. A modeling approach based on stacked neural networks is proposed to estimate the polypropylene melt index. The generalization capability of a single neural network model can be significantly improved by using a stacked neural networks model. Proper determination of the stacking weights is essential for good performance of the stacked model, so a method is proposed for determining appropriate weights for combining the individual networks, using the criterion of minimizing the sum of absolute prediction errors. Application to real industrial data demonstrates that the polypropylene melt index can be successfully estimated using stacked neural networks. The results demonstrate significant improvements in model accuracy from using the stacked neural networks model compared with a single neural network model.
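
The stacking criterion described above, choosing combination weights that minimize the sum of absolute prediction errors, can be sketched for the simplest case. The sketch assumes two already-trained networks whose outputs are combined as `w * a + (1 - w) * b`, and it finds `w` by a plain grid search; the abstract does not say how many networks are stacked or which optimizer is used, so both choices, along with the prediction values, are assumptions.

```python
def sum_abs_error(preds, targets):
    """Sum of absolute prediction errors."""
    return sum(abs(p - t) for p, t in zip(preds, targets))


def best_stacking_weight(preds_a, preds_b, targets, steps=100):
    """Grid-search w in [0, 1] so that w*a + (1-w)*b has the smallest absolute error sum."""
    best_w, best_err = 0.0, float("inf")
    for k in range(steps + 1):
        w = k / steps
        combined = [w * a + (1 - w) * b for a, b in zip(preds_a, preds_b)]
        err = sum_abs_error(combined, targets)
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err


# Made-up values: network A overshoots and network B undershoots the target.
targets = [2.0, 4.0, 6.0]
preds_a = [2.5, 4.5, 6.5]
preds_b = [1.5, 3.5, 5.5]
best_w, best_err = best_stacking_weight(preds_a, preds_b, targets)
print(best_w, best_err)
```

In this toy case the errors of the two networks cancel, so the combined error falls below that of either network alone, which is the effect the stacking weights are meant to exploit.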

An Efficient Deep Learning Approach to Pneumonia Classification in Healthcare

Journal of Healthcare Engineering ◽

10.1155/2019/4180949 ◽

2019 ◽

Vol 2019 ◽

pp. 1-7 ◽

Cited By ~ 44

Author(s):

Okeke Stephen ◽

Mangal Sain ◽

Uchenna Joseph Maduh ◽

Do-Un Jeong

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Network Model ◽

Neural Network Model ◽

Data Augmentation ◽

Classification Performance ◽

Learning Approaches ◽

X Ray ◽

Chest X Ray

This study proposes a convolutional neural network model trained from scratch to classify and detect the presence of pneumonia from a collection of chest X-ray image samples. Unlike other methods that rely solely on transfer learning approaches or traditional handcrafted techniques to achieve a remarkable classification performance, we constructed a convolutional neural network model from scratch to extract features from a given chest X-ray image and classify it to determine if a person is infected with pneumonia. This model could help mitigate the reliability and interpretability challenges often faced when dealing with medical imagery. Unlike other deep learning classification tasks with sufficient image repositories, it is difficult to obtain a large pneumonia dataset for this classification task; therefore, we deployed several data augmentation algorithms to improve the validation and classification accuracy of the CNN model and achieved remarkable validation accuracy.

Exploiting Product Related Review Features for Fake Review Detection

Mathematical Problems in Engineering ◽

10.1155/2016/4935792 ◽

2016 ◽

Vol 2016 ◽

pp. 1-7 ◽

Cited By ~ 12

Author(s):

Chengai Sun ◽

Qiaolin Du ◽

Gang Tian

Keyword(s):

Neural Network ◽

Network Model ◽

Neural Network Model ◽

Real Life ◽

Product Reviews ◽

Academic Communities ◽

Industrial Organizations ◽

The Neural Network ◽

Fake Reviews ◽

Significant Attention

Product reviews are now widely used by individuals for making their decisions. However, motivated by profit, reviewers game the system by posting fake reviews to promote or demote the target products. In the past few years, fake review detection has attracted significant attention from both industrial organizations and academic communities. However, the issue remains a challenging problem due to the lack of labelled material for supervised learning and evaluation. Current works have made many attempts to address this problem from the angles of the reviewer and the review. However, there has been little discussion about product related review features, which are the main focus of our method. This paper proposes a novel convolutional neural network model to integrate the product related review features through a product word composition model. To reduce overfitting and high variance, a bagging model is introduced to bag the neural network model with two efficient classifiers. Experiments on the real-life Amazon review dataset demonstrate the effectiveness of the proposed approach.
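
Combining a neural network with two other classifiers reduces variance because a single model's mistake can be outvoted. The sketch below shows only a majority-vote combination step over three sets of predicted labels. It is an assumption about how such an ensemble might be combined: the abstract does not describe the combination rule, the bootstrap resampling usually associated with bagging is omitted, and the names `cnn`, `clf_a`, and `clf_b` and their labels are hypothetical.

```python
from collections import Counter


def majority_vote(*label_lists):
    """Return, for each sample, the label predicted by most of the classifiers."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*label_lists)]


# Made-up predictions from three hypothetical classifiers on four reviews.
cnn = ["fake", "real", "fake", "real"]
clf_a = ["fake", "fake", "fake", "real"]
clf_b = ["real", "real", "fake", "fake"]

combined = majority_vote(cnn, clf_a, clf_b)
print(combined)
```

With three voters and two labels there is never a tie, so each sample gets a well-defined majority label.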

APPLYING NEURAL NETWORKS TO SOFTWARE RELIABILITY ASSESSMENT

International Journal of Reliability Quality and Safety Engineering ◽

10.1142/s0218539310003834 ◽

2010 ◽

Vol 17 (04) ◽

pp. 313-329 ◽

Cited By ~ 2

Author(s):

NORMAN SCHNEIDEWIND

Keyword(s):

Neural Network ◽

Neural Networks ◽

Network Model ◽

Neural Network Model ◽

Software Reliability ◽

Likelihood Estimation ◽

Time To Failure ◽

The Neural Network ◽

Method Of Maximum Likelihood ◽

Network Method

We adapt concepts from the field of neural networks to assess the reliability of software, employing cumulative failures, reliability, remaining failures, and time to failure metrics. In addition, the risk of not achieving reliability, remaining failure, and time to failure goals is assessed. The purpose of the assessment is to compare a criterion, derived from a neural network model, for estimating the parameters of software reliability metrics, with the method of maximum likelihood estimation. To our surprise, the neural network method proved superior for all the reliability metrics that were assessed, by virtue of yielding lower prediction error and risk. We also found that considerable adaptation of the neural network model was necessary to be meaningful for our application – only inputs, functions, neurons, weights, activation units, and outputs were required to characterize our application.

DYNAMIC OUTPUT FEEDBACK STABILIZATION FOR NONLINEAR SYSTEMS BASED ON STANDARD NEURAL NETWORK MODELS

International Journal of Neural Systems ◽

10.1142/s0129065706000706 ◽

2006 ◽

Vol 16 (04) ◽

pp. 305-317 ◽

Cited By ~ 6

Author(s):

MEIQIN LIU

Keyword(s):

Neural Network ◽

Neural Networks ◽

Nonlinear Systems ◽

Network Model ◽

Neural Network Model ◽

Output Feedback ◽

Control Design ◽

Feedback Stabilization ◽

Dynamic Output Feedback ◽

Dynamic Output

A neural-model-based control design for some nonlinear systems is addressed. The design approach is to approximate the nonlinear systems with neural networks whose activation functions satisfy the sector conditions. A novel neural network model termed the standard neural network model (SNNM) is advanced for describing this class of approximating neural networks. Full-order dynamic output feedback control laws are then designed for the SNNMs with inputs and outputs to stabilize the closed-loop systems. The control design equations are shown to be a set of linear matrix inequalities (LMIs), which can be easily solved by various convex optimization algorithms to determine the control signals. It is shown that most neural-network-based nonlinear systems can be transformed into input-output SNNMs so that stabilization can be synthesized in a unified way. Finally, some application examples are presented to illustrate the control design procedures.

A NEURAL NETWORK MODEL FOR CREDIT RISK EVALUATION

International Journal of Neural Systems ◽

10.1142/s0129065709002014 ◽

2009 ◽

Vol 19 (04) ◽

pp. 285-294 ◽

Cited By ~ 39

Author(s):

ADNAN KHASHMAN

Keyword(s):

Neural Network ◽

Neural Networks ◽

Credit Risk ◽

Network Model ◽

Neural Network Model ◽

Risk Evaluation ◽

Evaluation System ◽

Learning Algorithm ◽

Back Propagation ◽

Learning Schemes

Credit scoring is one of the key analytical techniques in credit risk evaluation, which has been an active research area in financial risk management. This paper presents a credit risk evaluation system that uses a neural network model based on the back propagation learning algorithm. We train and implement the neural network to decide whether to approve or reject a credit application, using seven learning schemes and real-world credit applications from the Australian credit approval datasets. A comparison of the system performance under the different learning schemes is provided. Furthermore, we compare the performance of two neural networks, with one and two hidden layers, following the ideal learning scheme. Experimental results suggest that neural networks can be effectively used in automatic processing of credit applications.

A Computationally Efficient Methodology for Generating Training Data for a Transient Neural Network of a Tip-Jet Reaction Drive System

Journal of Engineering for Gas Turbines and Power ◽

10.1115/1.4003957 ◽

2011 ◽

Vol 133 (12) ◽

Cited By ~ 6

Author(s):

Brian K. Kestner ◽

Jimmy C.M. Tai ◽

Dimitri N. Mavris

Keyword(s):

Neural Network ◽

Steady State ◽

Network Model ◽

Neural Network Model ◽

Operating Conditions ◽

Drive System ◽

Training Data ◽

Computationally Efficient ◽

Model Based Control ◽

Training Methodology

This paper presents a computationally efficient methodology for generating training data for a transient neural network model of a tip-jet reaction drive system for potential use as an onboard model in a model based control application. This methodology significantly reduces the number of training points required to capture the transient performance of the system. The challenge in developing an onboard model for a tip-jet reaction drive system is that the model has to operate over the whole flight envelope, to account for the different dynamics present in the system, and to adjust to system degradation or potential faults. In addition, the onboard model must execute in less time than the update interval of the controller. To address these issues, a computationally efficient training methodology and a neural network surrogate model have been developed that capture the transient performance of the tip-jet reaction system. As the number of inputs to a neural network becomes large, the computational time needed to generate the training points required to accurately represent the range of operating conditions of the system may also become quite large. A challenge for the tip-jet reaction drive system is to minimize the number of neural network training points while maintaining high accuracy. To address this issue, a novel training methodology is presented which first trains a steady-state neural network model and uses deviations from steady-state operating conditions to define the transient portion of the training data. The combined results from both the transient and the steady-state training data can then be used to create a single transient neural network of the system. The results in this paper demonstrate that a transient neural network using this new computationally efficient training methodology has the potential to be a feasible option for use as an onboard real-time model for model based control of a tip-jet reaction drive system.
