Lagrangian dual framework for conservative neural network solutions of kinetic equations

2021 ◽  
Vol 0 (0) ◽  
pp. 0
Author(s):  
Hyung Ju Hwang ◽  
Hwijae Son

In this paper, we propose a novel conservative formulation for solving kinetic equations via neural networks. More precisely, we formulate the learning problem as a constrained optimization problem whose constraints represent the physical conservation laws. The constraints are relaxed into the residual loss function via Lagrangian duality. By imposing the physical conservation properties of the solution as constraints of the learning problem, we demonstrate far more accurate approximations of the solutions, both in terms of error and in the satisfaction of the conservation laws, for the kinetic Fokker-Planck equation and the homogeneous Boltzmann equation.
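For concreteness, here is a minimal sketch of the Lagrangian-relaxation idea in a physics-informed training loop. The toy residual and mass-conservation defect are placeholders, not the authors' kinetic-equation losses, and the dual-ascent step is an illustrative assumption.

```python
# Minimal sketch: relax a conservation constraint into the residual loss via a
# Lagrange multiplier and update the multiplier by dual ascent.
import torch

model = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
lam = torch.zeros(1)                              # Lagrange multiplier

x = torch.rand(256, 2)                            # collocation points, e.g. (v, t); illustrative

for step in range(1000):
    f = model(x)
    residual = f.pow(2).mean()                    # placeholder PDE residual loss
    defect = f.mean() - 1.0                       # placeholder mass-conservation defect
    loss = residual + lam * defect                # Lagrangian relaxation of the constraint
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    with torch.no_grad():
        lam += 1e-2 * defect.detach()             # dual ascent on the multiplier
```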

2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Idris Kharroubi ◽  
Thomas Lim ◽  
Xavier Warin

Abstract We study the approximation of backward stochastic differential equations (BSDEs for short) with a constraint on the gains process. We first discretize the constraint by applying a so-called facelift operator at the times of a grid. We show that this discretely constrained BSDE converges to the continuously constrained one as the mesh of the grid goes to zero. We then focus on the approximation of the discretely constrained BSDE. For that, we adopt a machine learning approach. We show that the facelift can be approximated by an optimization problem over a class of neural networks under constraints on the neural network and its derivative. We then derive an algorithm converging to the discretely constrained BSDE as the number of neurons goes to infinity. We end with numerical experiments.
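For orientation, a generic form of a BSDE with a constraint on the gains process, in notation assumed here rather than taken from the paper: one seeks the minimal solution \((Y, Z, K)\), with \(K\) nondecreasing, such that

\[
Y_t = g(X_T) + \int_t^T f(s, X_s, Y_s, Z_s)\,\mathrm{d}s - \int_t^T Z_s\,\mathrm{d}W_s + K_T - K_t,
\qquad Z_t \in C \ \text{ for all } t \in [0,T],
\]

where \(C\) is the constraint set for the gains process \(Z\). In the discretization described above, the facelift is applied at each grid time, and the BSDE evolves unconstrained between grid times.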


2012 ◽  
Vol 433-440 ◽  
pp. 2808-2816
Author(s):  
Jian Jin Zheng ◽  
You Shen Xia

This paper presents a new interactive neural network for solving constrained multi-objective optimization problems. The constrained multi-objective optimization problem is reformulated into two constrained single-objective optimization problems, and two neural networks are designed to obtain the optimal weight and the optimal solution of the two optimization problems, respectively. The proposed algorithm has low computational complexity and is easy to implement. Moreover, the proposed algorithm is successfully applied to the design of digital filters. Computational results illustrate the good performance of the proposed algorithm.
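As background, here is a minimal sketch of the weighted-sum scalarization that turns a two-objective constrained problem into a single-objective one. The projected-gradient update and toy objectives are illustrative assumptions, not the paper's two-network dynamics.

```python
# Hedged sketch: weighted-sum scalarization of a two-objective constrained
# problem, solved here by projected gradient descent for illustration.
import numpy as np

def scalarized_descent(f1_grad, f2_grad, project, x0, w=0.5, lr=1e-2, steps=500):
    """Minimize w*f1 + (1-w)*f2 over a constraint set via projected gradient."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        g = w * f1_grad(x) + (1.0 - w) * f2_grad(x)
        x = project(x - lr * g)            # keep the iterate feasible
    return x

# Toy usage: minimize w*||x-a||^2 + (1-w)*||x-b||^2 over the box [0, 1]^2.
a, b = np.array([0.0, 2.0]), np.array([2.0, 0.0])
x_star = scalarized_descent(lambda x: 2 * (x - a), lambda x: 2 * (x - b),
                            lambda x: np.clip(x, 0.0, 1.0), x0=[0.5, 0.5])
```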


Electronics ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 1929
Author(s):  
Jiacang Ho ◽  
Dae-Ki Kang

Deep neural networks have achieved high performance in image classification, image generation, voice recognition, natural language processing, etc.; however, they still face several open challenges, such as the incremental learning problem, overfitting in neural networks, hyperparameter optimization, and lack of flexibility and multitasking. In this paper, we focus on the incremental learning problem, which concerns machine learning methodologies that continuously train an existing model with additional knowledge. To the best of our knowledge, the simplest and most direct solution to this challenge is to retrain the entire neural network after adding the new labels to the output layer. Alternatively, transfer learning can be applied, but only if the domain of the new labels is related to the domain of the labels on which the neural network has already been trained. In this paper, we propose a novel network architecture, namely the Brick Assembly Network (BAN), which allows a new label to be assembled into (or dismantled from) a trained neural network without retraining the entire network. In BAN, we train each label individually with a sub-network (i.e., a simple neural network) and then assemble the converged sub-networks, each trained on a single label, into a full neural network. For each label trained in a sub-network of BAN, we introduce a new loss function that minimizes the loss of the network using data from only one class. Applying one loss function per class label is unique and differs from standard neural network architectures (e.g., AlexNet, ResNet, InceptionV3), which minimize the error of the network using a loss computed over multiple labels. The difference between the loss functions of previous approaches and the one we introduce is that we compute the loss from the node values of the penultimate layer (which we name the characteristic layer) instead of the output layer, where the loss is computed between true labels and predicted labels. Experimental results on several benchmark datasets show that BAN has a strong capability of adding (and removing) a new label to (or from) a trained network, compared with a standard neural network and previous work.
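A rough sketch of the assembly idea follows. The per-class loss used here (pulling characteristic-layer activations toward a fixed anchor) and the distance-based assembly rule are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch: one small sub-network per label, each trained only on that
# label's data with a loss on its penultimate ("characteristic") layer, then
# combined into a full classifier without retraining the whole network.
import torch
import torch.nn as nn

class SubNet(nn.Module):
    def __init__(self, in_dim=784, char_dim=16):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                      nn.Linear(64, char_dim))   # characteristic layer
        self.anchor = nn.Parameter(torch.ones(char_dim), requires_grad=False)

    def char_loss(self, x):
        # Trained only on one class: pull characteristic activations to the anchor.
        return ((self.features(x) - self.anchor) ** 2).mean()

def assemble(subnets):
    """Score each class by how close its sub-network's characteristic
    activations are to that sub-network's anchor (smaller distance wins)."""
    def classify(x):
        dists = torch.stack([((s.features(x) - s.anchor) ** 2).mean(dim=1)
                             for s in subnets], dim=1)
        return (-dists).argmax(dim=1)   # adding/removing a label = adding/removing a sub-network
    return classify
```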


Author(s):  
Qipin Chen ◽  
Wenrui Hao

In this paper, we present a homotopy training algorithm (HTA) to solve optimization problems arising from fully connected neural networks with complicated structures. The HTA dynamically builds the neural network, starting from a simplified version and ending with the fully connected network, by adding layers and nodes adaptively. Therefore, the corresponding optimization problem is easy to solve at the beginning and connects to the original model via a continuous path guided by the HTA, which provides a high probability of obtaining a global minimum. By gradually increasing the complexity of the model along the continuous path, the HTA provides a rather good solution to the original loss function. This is confirmed by various numerical results, including VGG models on CIFAR-10. For example, on the VGG13 model with batch normalization, HTA reduces the error rate by 11.86% on the test dataset compared with the traditional method. Moreover, the HTA also allows us to find the optimal structure for a fully connected neural network by building the neural network adaptively.
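A minimal sketch of the growth step, assuming an identity-initialized layer as the warm start; the growth schedule and initialization are illustrative, not the HTA's exact construction.

```python
# Hedged sketch: deepen a trained network while preserving the function it has
# learned, so optimization continues along a continuous path from a warm start.
import torch
import torch.nn as nn

def deepen(model, width):
    layers = list(model.children())
    new_layer = nn.Linear(width, width)
    with torch.no_grad():
        new_layer.weight.copy_(torch.eye(width))   # identity initialization
        new_layer.bias.zero_()
    # Inserting an identity layer after a ReLU (nonnegative outputs) leaves the
    # learned function unchanged at the moment of insertion.
    return nn.Sequential(*layers[:-1], new_layer, nn.ReLU(), layers[-1])

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
# ... train the simplified model ...
model = deepen(model, 32)   # next stage along the path; continue training
```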


1993 ◽  
Vol 04 (03) ◽  
pp. 223-230 ◽  
Author(s):  
A. WENDEMUTH ◽  
D. SHERRINGTON

Standard gradient descent training algorithms for neural networks require training times of the same order as the number of neurons N if the patterns are biased. In this paper, modified algorithms are presented which require training times equal to those of the unbiased case, which are of order 1. Exact convergence proofs are given. Gain parameters which produce minimal learning times in large networks are computed by replica methods. It is demonstrated how these modified algorithms are applied in order to produce four types of solutions to the learning problem: 1. a solution with all internal fields equal to the desired output, 2. the Adaline (or pseudo-inverse) solution, 3. the perceptron of optimal stability without threshold, and 4. the perceptron of optimal stability with threshold.
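As a quick numerical illustration of solution type 2, the Adaline (pseudo-inverse) weights set every internal field equal to the desired output when the patterns are linearly independent. The dimensions and bias below are assumptions; this is standard linear algebra, not the paper's modified training algorithm.

```python
# Illustrative Adaline / pseudo-inverse solution for biased binary patterns.
import numpy as np

rng = np.random.default_rng(0)
patterns = rng.choice([-1.0, 1.0], size=(50, 200), p=[0.2, 0.8])  # biased patterns
targets = rng.choice([-1.0, 1.0], size=50)

weights = np.linalg.pinv(patterns) @ targets   # minimum-norm exact solution
fields = patterns @ weights                    # internal fields equal the targets
print(np.allclose(fields, targets))            # True for linearly independent patterns
```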


Author(s):  
Mansur Hassan ◽  
Adam Baharum

In this paper, we modify a Courant-Beltrami penalty function method for constrained optimization problems to study duality for convex nonlinear mathematical programming problems. The Karush-Kuhn-Tucker (KKT) optimality conditions for the penalized problem are used to derive the KKT multipliers, based on additional hypotheses imposed on the constraint function g. A zero duality gap between optimization problems constituted by invex functions with respect to the same function η and their Lagrangian dual problems is also established. Examples are provided to illustrate and prove the result for the broader class of generalized convex functions termed invex functions.
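For reference, the Courant-Beltrami penalty replaces the constrained problem \(\min f(x)\) subject to \(g_i(x) \le 0\) by the unconstrained problem (notation assumed here, not taken from the paper)

\[
\min_x \; P_\rho(x) = f(x) + \rho \sum_i \bigl(\max\{0,\, g_i(x)\}\bigr)^2, \qquad \rho > 0,
\]

whose KKT conditions, under suitable hypotheses on \(g\), yield the multipliers used in the duality results above.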


2020 ◽  
Vol 2020 (10) ◽  
pp. 54-62
Author(s):  
Oleksii VASYLIEV

The problem of applying neural networks to calculate the ratings used in banking when deciding whether to grant loans to borrowers is considered. The task is to determine the rating function of the borrower based on a set of statistical data on the effectiveness of loans provided by the bank. When constructing a regression model to calculate the rating function, it is necessary to know its general form; the task then reduces to calculating the parameters that enter the expression for the rating function. In contrast, when neural networks are used, there is no need to specify the general form of the rating function. Instead, a certain neural network architecture is chosen and its parameters are calculated on the basis of statistical data. Importantly, the same neural network architecture can be used to process different sets of statistical data. The disadvantages of using neural networks include the need to calculate a large number of parameters and the lack of a universal algorithm for determining the optimal neural network architecture. As an example of the use of neural networks to determine a borrower's rating, a model system is considered in which the borrower's rating is given by a known non-analytical rating function. A neural network with two hidden layers, containing three and two neurons respectively and using a sigmoid activation function, is used for the modeling. It is shown that the neural network restores the borrower's rating function with quite acceptable accuracy.
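A minimal sketch of the architecture described above (two hidden layers with three and two sigmoid neurons); the number of input features and the scalar output layer are assumptions for illustration.

```python
import torch.nn as nn

# Two hidden layers with 3 and 2 sigmoid neurons, per the abstract;
# 5 input features and a scalar rating output are illustrative assumptions.
rating_net = nn.Sequential(
    nn.Linear(5, 3), nn.Sigmoid(),   # first hidden layer: 3 neurons
    nn.Linear(3, 2), nn.Sigmoid(),   # second hidden layer: 2 neurons
    nn.Linear(2, 1),                 # borrower rating
)
```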


2019 ◽  
Vol 2019 (1) ◽  
pp. 153-158
Author(s):  
Lindsay MacDonald

We investigated how well a multilayer neural network could implement the mapping between two trichromatic color spaces, specifically from camera R,G,B to tristimulus X,Y,Z. For training the network, a set of 800,000 synthetic reflectance spectra was generated. For testing the network, a set of 8,714 real reflectance spectra was collated from instrumental measurements on textiles, paints and natural materials. Various network architectures were tested, with both linear and sigmoidal activations. Results show that over 85% of all test samples had color errors of less than 1.0 ΔE2000 units, much more accurate than could be achieved by regression.
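As an illustration of the kind of network tested, a small sigmoidal multilayer perceptron from camera R,G,B to tristimulus X,Y,Z; the layer widths are assumptions, not the architecture reported in the paper.

```python
import torch.nn as nn

# Illustrative 3 -> hidden -> 3 mapping with sigmoidal hidden units.
rgb_to_xyz = nn.Sequential(
    nn.Linear(3, 30), nn.Sigmoid(),
    nn.Linear(30, 30), nn.Sigmoid(),
    nn.Linear(30, 3),                # X, Y, Z estimates
)
```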


2020 ◽  
Vol 64 (3) ◽  
pp. 30502-1-30502-15
Author(s):  
Kensuke Fukumoto ◽  
Norimichi Tsumura ◽  
Roy Berns

Abstract A method is proposed to estimate the concentrations of the pigments mixed in a painting, using an encoder-decoder neural network model. The model is trained to output a value that is the same as its input, and its middle output extracts a certain feature as compressed information about the input. In this instance, the input and output are spectral data of a painting, and the model is trained with the pigment concentrations as the middle output. A dataset containing the scattering coefficient and absorption coefficient of each of 19 pigments was used. The Kubelka-Munk theory was applied to these coefficients to obtain many patterns of synthetic spectral data, which were used for training. The proposed method was tested using spectral images of 33 paintings; the results showed that the method estimates, with high accuracy, concentrations whose mixtures reproduce spectra similar to those of the target pigments.
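A minimal sketch of the encoder-decoder idea: spectra in, spectra out, with the middle layer constrained to represent the 19 pigment concentrations. The number of spectral bands, layer sizes, and the joint loss are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Assumed 36-band spectra; the bottleneck has one unit per pigment (19) and is
# supervised with known concentrations during training, per the abstract.
class PigmentAutoencoder(nn.Module):
    def __init__(self, bands=36, pigments=19):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(bands, 64), nn.ReLU(),
                                     nn.Linear(64, pigments), nn.Sigmoid())
        self.decoder = nn.Sequential(nn.Linear(pigments, 64), nn.ReLU(),
                                     nn.Linear(64, bands))

    def forward(self, spectrum):
        conc = self.encoder(spectrum)          # estimated pigment concentrations
        return self.decoder(conc), conc

model = PigmentAutoencoder()
spectra = torch.rand(8, 36)
true_conc = torch.rand(8, 19)
recon, conc = model(spectra)
loss = nn.functional.mse_loss(recon, spectra) + nn.functional.mse_loss(conc, true_conc)
```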

