Approximate Hessian
Recently Published Documents

TOTAL DOCUMENTS: 24 (five years: 2)
H-INDEX: 5 (five years: 0)

Mathematics, 2021, Vol. 9 (15), pp. 1775
Author(s): Árpád Bűrmen, Tadej Tuma, Jernej Olenšek

Recently, a derivative-free optimization algorithm was proposed that utilizes a minimum Frobenius norm (MFN) Hessian update for estimating second-derivative information, which in turn is used to accelerate the search. The proposed update formula relies only on computed function values and is a closed-form expression for a special case of a more general approach first published by Powell. This paper analyzes the convergence of the proposed update formula under the assumption that the points from R^n where the function value is known are random. The analysis assumes that the N+2 points used by the update formula are obtained by adding N+1 vectors to a central point. The vectors are obtained by transforming a prototype set of N+1 vectors with a random orthogonal matrix drawn from the Haar measure. The prototype set must positively span an N ≤ n dimensional subspace. Because the update is random by nature, we can estimate a lower bound on the expected improvement of the approximate Hessian. This lower bound was derived for a special case of the proposed update by Leventhal and Lewis. We generalize their result and show that the amount of improvement depends strongly on N as well as on the choice of the vectors in the prototype set. The obtained result is then used to analyze the performance of the update for various commonly used prototype sets. One result of this analysis is that a regular n-simplex is a bad choice for a prototype set because it does not guarantee any improvement of the approximate Hessian.
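As a concrete illustration, the sketch below (names and prototype choice are ours) samples a Haar-distributed orthogonal matrix by QR-factoring a Gaussian matrix with a sign correction, then forms the N+2 points from a prototype set with N = n:

```python
import numpy as np

def haar_orthogonal(n, rng):
    """Sample an orthogonal matrix from the Haar measure on O(n):
    QR-factor a Gaussian matrix and fix column signs via diag(R)."""
    Z = rng.standard_normal((n, n))
    Q, R = np.linalg.qr(Z)
    return Q * np.sign(np.diag(R))

def update_point_set(x0, prototype, rng):
    """Form the N+2 points used by the update: the central point x0
    plus x0 shifted by each prototype vector, rotated by a random
    Haar-distributed orthogonal matrix."""
    Q = haar_orthogonal(x0.size, rng)
    return np.vstack([x0, x0 + prototype @ Q.T])

rng = np.random.default_rng(0)
n = 5
# Prototype set: n + 1 vectors that positively span R^n (here N = n):
# the coordinate vectors plus the negated, normalized sum.
prototype = np.vstack([np.eye(n), -np.ones(n) / np.sqrt(n)])
points = update_point_set(np.zeros(n), prototype, rng)
print(points.shape)  # (n + 2, n): central point plus n + 1 neighbors
```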


Geophysics, 2020, Vol. 85 (4), pp. R325-R337
Author(s): Yuzhu Liu, Zheng Wu, Hao Kang, Jizhong Yang

The truncated Newton method uses information contained in the exact Hessian in full-waveform inversion (FWI). The exact Hessian physically contains information regarding doubly scattered waves, especially prismatic events. These waves are mainly caused by scattering at steeply dipping structures, such as salt flanks and vertical or nearly vertical faults. We systematically investigate the properties and applications of the exact Hessian. We begin by giving the formulas for computing each term in the exact Hessian and numerically analyzing their characteristics. We show that the second term in the exact Hessian may be comparable in magnitude to the first term. In particular, when there are apparent doubly scattered waves in the observed data, the influence of the second term may be dominant and it cannot be neglected. Next, we adopt a migration/demigration approach to compute the Gauss-Newton descent direction and the Newton descent direction using the approximate Hessian and the exact Hessian, respectively. In addition, we determine from both the forward and the inverse perspectives that the second term in the exact Hessian not only contributes to the use of doubly scattered waves, but also compensates for the use of singly scattered waves in FWI. Finally, we use three numerical examples to demonstrate that by considering the second term in the exact Hessian, the role of prismatic waves in the observed data can be effectively revealed and steeply dipping structures can be reconstructed with higher accuracy.
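The underlying computational pattern is a matrix-free truncated Newton step in which CG consumes only Hessian-vector products; in FWI these products are applied via migration/demigration, while the sketch below substitutes toy SPD operators:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def truncated_newton_direction(grad, hess_vec, n, inner_iters=20):
    """Compute a descent direction by approximately solving H dm = -g
    with CG; hess_vec applies the Hessian to a vector, so H is never
    formed explicitly."""
    H = LinearOperator((n, n), matvec=hess_vec)
    dm, _ = cg(H, -grad, maxiter=inner_iters)
    return dm

# Toy SPD stand-ins for the two operators (not a wave-equation solver):
# the Gauss-Newton operator uses only the first term; the "exact" one
# adds a low-rank surrogate for the second term.
rng = np.random.default_rng(1)
n = 50
J = rng.standard_normal((n, n))
U = 0.5 * rng.standard_normal((n, 5))
gn_vec = lambda v: J.T @ (J @ v) + v
exact_vec = lambda v: gn_vec(v) + U @ (U.T @ v)
g = rng.standard_normal(n)

dm_gauss_newton = truncated_newton_direction(g, gn_vec, n)
dm_newton = truncated_newton_direction(g, exact_vec, n)
```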


2020, Vol. 34 (04), pp. 4723-4730
Author(s): Xiang Li, Shusen Wang, Zhihua Zhang

Subsampled Newton methods approximate Hessian matrices through subsampling techniques to reduce the per-iteration cost. Previous results require Ω(d) samples to approximate Hessians, where d is the dimension of data points, making them less practical for high-dimensional data. The situation deteriorates when d is comparable to the number of data points n, since approximating the Hessian then requires taking nearly the whole dataset into account, rendering subsampling useless. This paper theoretically justifies the effectiveness of subsampled Newton methods for strongly convex empirical risk minimization with high-dimensional data. Specifically, we provably require only Θ̃(d_eff^γ) samples to approximate the Hessian matrices, where d_eff^γ is the γ-ridge effective dimension (the sum of the γ-ridge leverage scores) and can be much smaller than d as long as nγ ≫ 1. Our theory covers three types of Newton methods: subsampled Newton, distributed Newton, and proximal Newton.
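A minimal sketch of a leverage-sampled Newton step for γ-regularized logistic regression, assuming exact ridge leverage scores for clarity (the paper's fast approximations are not reproduced here); all function names are ours:

```python
import numpy as np

def ridge_leverage_scores(X, gamma):
    """Exact gamma-ridge leverage scores (O(n d^2) here for clarity;
    in practice they would be approximated by sketching)."""
    n, d = X.shape
    G = np.linalg.inv(X.T @ X / n + gamma * np.eye(d))
    return np.einsum('ij,jk,ik->i', X, G, X) / n

def subsampled_newton_step(w, X, y, gamma, s, rng):
    """One Newton step for gamma-regularized logistic loss with the
    Hessian estimated from s rows sampled by ridge leverage scores."""
    n, d = X.shape
    p = 1.0 / (1.0 + np.exp(-X @ w))
    grad = X.T @ (p - y) / n + gamma * w            # full gradient
    probs = ridge_leverage_scores(X, gamma)
    probs /= probs.sum()
    idx = rng.choice(n, size=s, p=probs)
    weights = p[idx] * (1 - p[idx]) / (n * s * probs[idx])  # unbiased
    H = X[idx].T @ (weights[:, None] * X[idx]) + gamma * np.eye(d)
    return w - np.linalg.solve(H, grad)

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 20))
y = (X @ np.ones(20) > 0).astype(float)
w = np.zeros(20)
for _ in range(5):
    w = subsampled_newton_step(w, X, y, gamma=0.1, s=100, rng=rng)
```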


Symmetry, 2020, Vol. 12 (2), pp. 208
Author(s): Xinyi Wang, Xianfeng Ding, Quan Qu

In this paper, a new filter nonmonotone adaptive trust-region method with fixed step length for unconstrained optimization is proposed. The trust-region radius adopts a new adaptive strategy to avoid additional computational costs at each iteration. A new nonmonotone trust-region ratio is introduced. When a trial step is not successful, a multidimensional filter is employed to increase the possibility of the trial step being accepted. If the trial step is still not accepted by the filter set, a new iterate is found along the trial step, with the step length computed by a fixed formula. The symmetric positive definite approximation of the Hessian matrix is updated using the MBFGS method. The global convergence and superlinear convergence of the proposed algorithm are proven under classical assumptions. The efficiency of the algorithm is demonstrated by numerical results.
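For context, a common MBFGS variant (Li-Fukushima style, which we assume is the family of update meant here) shifts y_k so that the curvature condition always holds, keeping the updated matrix symmetric positive definite; a minimal sketch:

```python
import numpy as np

def mbfgs_update(B, s, y, g_norm):
    """Modified BFGS update (Li-Fukushima style): shift y so that
    y_mod @ s > 0 always holds, so the BFGS formula preserves the
    symmetric positive definiteness of B."""
    t = 1.0 + max(0.0, -(s @ y) / (g_norm * (s @ s)))
    y_mod = y + t * g_norm * s    # guarantees y_mod @ s >= g_norm * (s @ s)
    Bs = B @ s
    return (B - np.outer(Bs, Bs) / (s @ Bs)
              + np.outer(y_mod, y_mod) / (s @ y_mod))
```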


2019
Author(s): Eric Hermes, Khachik Sargsyan, Habib Najm, Judit Zádor

Identification and refinement of first-order saddle point (FOSP) structures on the potential energy surface (PES) of chemical systems is a computational bottleneck in the characterization of reaction pathways. Leading FOSP refinement strategies require calculation of the full Hessian matrix, which is not feasible for larger systems such as those encountered in heterogeneous catalysis. For these systems, the standard approach to FOSP refinement involves iterative diagonalization of the Hessian, but this comes at the cost of longer refinement trajectories due to the lack of accurate curvature information. We present a method for incorporating information obtained by an iterative diagonalization algorithm into the construction of an approximate Hessian matrix that accelerates FOSP refinement. We measure the performance of our method on two established FOSP refinement benchmarks and find a 50% reduction on average in the number of gradient evaluations required to converge to a FOSP for one benchmark, and a 25% reduction on average for the second.
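The central idea admits a minimal sketch: fold a lowest eigenpair estimate (v, lam) from the iterative diagonalization into an approximate Hessian B via a symmetric rank-one correction that enforces the observed curvature along v. This is our illustration of the idea, not the paper's exact construction; all names are ours.

```python
import numpy as np

def fold_in_mode(B, v, lam):
    """Correct an approximate Hessian B with a lowest eigenpair
    estimate (v, lam) from iterative diagonalization: a symmetric
    rank-one update enforcing v @ H @ v == lam along the mode."""
    v = v / np.linalg.norm(v)
    return B + (lam - v @ B @ v) * np.outer(v, v)

# e.g. start from a scaled identity and inject a negative mode, as
# needed when walking toward a first-order saddle point.
B0 = 70.0 * np.eye(3)              # crude initial curvature guess
v = np.array([1.0, 0.0, 0.0])
H = fold_in_mode(B0, v, lam=-2.5)  # curvature along v is now -2.5
```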


Author(s): Sheng-Wei Chen, Chun-Nan Chou, Edward Y. Chang

For training fully connected neural networks (FCNNs), we propose a practical approximate second-order method comprising: 1) an approximation of the Hessian matrix, and 2) a conjugate gradient (CG) based solver. Our proposed approximate Hessian matrix is memory-efficient and can be applied to any FCNN whose activation and criterion functions are twice differentiable. We devise a CG-based method incorporating a rank-one approximation to derive Newton directions for training FCNNs, which significantly reduces both space and time complexity. This CG-based method can be employed to solve any linear system whose coefficient matrix is Kronecker-factored, symmetric, and positive definite. Empirical studies show the efficacy and efficiency of our proposed method.
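A sketch of this solver's key trick, assuming the coefficient matrix is a single Kronecker product of SPD factors: CG only needs matrix-vector products, which the identity (A kron B) vec(V) = vec(B V A^T) supplies without ever forming the Kronecker product:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def kron_cg_solve(A, B, b):
    """Solve (A kron B) x = b with CG, applying the matrix through
    (A kron B) vec(V) = vec(B V A^T); the Kronecker product itself
    is never materialized."""
    p, q = A.shape[0], B.shape[0]
    def matvec(x):
        V = x.reshape(q, p, order='F')        # undo column-major vec
        return (B @ V @ A.T).ravel(order='F')
    op = LinearOperator((p * q, p * q), matvec=matvec)
    x, _ = cg(op, b)
    return x

# SPD factors give an SPD Kronecker product; check the residual.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
A = A @ A.T + 4 * np.eye(4)
B = rng.standard_normal((3, 3))
B = B @ B.T + 3 * np.eye(3)
b = rng.standard_normal(12)
x = kron_cg_solve(A, B, b)
print(np.linalg.norm(np.kron(A, B) @ x - b))  # small residual
```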


Author(s): Yasutoshi Ida, Yasuhiro Fujiwara, Sotetsu Iwamura

Adaptive learning rate algorithms such as RMSProp are widely used for training deep neural networks. RMSProp offers efficient training because it uses first-order gradients to approximate Hessian-based preconditioning. However, since the first-order gradients include noise caused by stochastic optimization, the approximation may be inaccurate. In this paper, we propose a novel adaptive learning rate algorithm called SDProp. Its key idea is effective handling of this noise by preconditioning based on the covariance matrix of the gradients. For various neural networks, our approach is more efficient and effective than RMSProp and its variant.
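A diagonal sketch of the idea follows; it is our reconstruction under stated assumptions, not the paper's exact update rule. A running mean centers the second moment, so the preconditioner tracks gradient noise (variance) rather than raw gradient magnitude as in RMSProp.

```python
import numpy as np

def covariance_preconditioned_step(w, grad, state, lr=1e-3,
                                   beta=0.99, eps=1e-8):
    """One update with diagonal covariance preconditioning: unlike
    RMSProp, the second moment is *centered* by a running mean, so
    the preconditioner reflects gradient noise, not magnitude."""
    m, c = state
    m = beta * m + (1 - beta) * grad               # running mean
    c = beta * c + (1 - beta) * (grad - m) ** 2    # running variance
    w_new = w - lr * grad / (np.sqrt(c) + eps)
    return w_new, (m, c)

# Usage: initialize state with zeros shaped like the parameters.
w = np.zeros(10)
state = (np.zeros(10), np.zeros(10))
g = np.random.default_rng(0).standard_normal(10)
w, state = covariance_preconditioned_step(w, g, state)
```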

