Convergence rates of Quasi-Newton algorithms for some non-smooth optimization problems

1983 ◽  
Author(s):  
Ekkehard Sachs
2021 ◽  
Vol 11 (8) ◽  
pp. 3430
Author(s):  
Erik Cuevas ◽  
Héctor Becerra ◽  
Héctor Escobar ◽  
Alberto Luque-Chang ◽  
Marco Pérez ◽  
...  

Recently, several new metaheuristic schemes have been introduced in the literature. Although all these approaches consider very different phenomena as metaphors, the search patterns used to explore the search space are very similar. On the other hand, second-order systems are models that present different temporal behaviors depending on the value of their parameters. Such temporal behaviors can be conceived as search patterns with multiple behaviors and simple configurations. In this paper, a set of new search patterns is introduced to explore the search space efficiently. They emulate the response of a second-order system. The proposed set of search patterns has been integrated as a complete search strategy, called Second-Order Algorithm (SOA), to obtain the global solution of complex optimization problems. To analyze the performance of the proposed scheme, it has been compared on a set of representative optimization problems, including multimodal, unimodal, and hybrid benchmark formulations. Numerical results demonstrate that the proposed SOA method exhibits remarkable performance in terms of accuracy and high convergence rates.
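
To make the idea of parameter-dependent temporal behavior concrete, the sketch below evaluates the textbook unit-step response of a second-order system for the underdamped, critically damped, and overdamped cases. It is only an illustration of how one damping parameter changes the shape of a trajectory, not the SOA operators themselves: the names `step_response`, `trajectory`, `zeta`, and `omega_n` are assumptions introduced here and do not come from the paper.

```python
import math


def step_response(t, zeta, omega_n=1.0):
    """Unit-step response y(t) of a standard second-order system (t >= 0)."""
    if zeta < 0:
        raise ValueError("zeta must be non-negative")
    if zeta < 1.0:
        # Underdamped: the response overshoots and oscillates around 1.
        root = math.sqrt(1.0 - zeta ** 2)
        omega_d = omega_n * root
        decay = math.exp(-zeta * omega_n * t)
        return 1.0 - decay * (math.cos(omega_d * t)
                              + (zeta / root) * math.sin(omega_d * t))
    if zeta == 1.0:
        # Critically damped: fastest approach without overshoot.
        return 1.0 - math.exp(-omega_n * t) * (1.0 + omega_n * t)
    # Overdamped: two real poles s1 and s2, slow monotone approach.
    root = math.sqrt(zeta ** 2 - 1.0)
    s1 = -omega_n * (zeta - root)
    s2 = -omega_n * (zeta + root)
    return 1.0 + (s2 * math.exp(s1 * t) - s1 * math.exp(s2 * t)) / (s1 - s2)


def trajectory(start, target, zeta, steps=50, t_end=10.0):
    """Hypothetical 1-D search pattern: move from start toward target
    following the step response sampled on a uniform time grid."""
    return [start + (target - start) * step_response(t_end * k / steps, zeta)
            for k in range(steps + 1)]
```

With a small `zeta` the trajectory overshoots the target and explores beyond it, while a large `zeta` approaches the target monotonically, which is one way to read "multiple behaviors and simple configurations".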


Author(s):  
Jie Guo ◽  
Zhong Wan

A new spectral three-term conjugate gradient algorithm based on the quasi-Newton equation is developed for solving large-scale unconstrained optimization problems. It is proved that the search directions in this algorithm always satisfy a sufficient descent condition independent of any line search. Global convergence is established for general objective functions if the strong Wolfe line search is used. Numerical experiments are employed to show its high numerical performance in solving large-scale optimization problems. In particular, the developed algorithm is implemented to solve the 100 benchmark test problems from CUTE with sizes from 1000 to 10,000, in comparison with some similar algorithms in the literature. The numerical results demonstrate that our algorithm outperforms the state-of-the-art ones in terms of less CPU time, fewer iterations, or fewer function evaluations.
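
The strong Wolfe line search mentioned above accepts a step length only if two standard conditions hold: sufficient decrease and a bound on the magnitude of the new directional derivative. The following sketch checks both for a given step. It does not reproduce the paper's three-term search direction, and the function name and the constants `c1` and `c2` are illustrative assumptions.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def satisfies_strong_wolfe(f, grad, x, d, alpha, c1=1e-4, c2=0.9):
    """Return True if step length alpha along d satisfies the strong Wolfe
    conditions at x. c1 and c2 are assumed values with 0 < c1 < c2 < 1."""
    slope0 = dot(grad(x), d)
    if slope0 >= 0:
        raise ValueError("d must be a descent direction")
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    # Sufficient decrease (Armijo) condition.
    decrease_ok = f(x_new) <= f(x) + c1 * alpha * slope0
    # Strong curvature condition on the new directional derivative.
    curvature_ok = abs(dot(grad(x_new), d)) <= c2 * abs(slope0)
    return decrease_ok and curvature_ok
```

For f(x) = x² at x = 1 with d = -2, the step 0.5 lands on the minimizer and is accepted, whereas the step 1.0 fails the decrease test and the step 0.01 fails the curvature test.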


2021 ◽  
Author(s):  
Faruk Alpak ◽  
Yixuan Wang ◽  
Guohua Gao ◽  
Vivek Jain

Abstract Recently, a novel distributed quasi-Newton (DQN) derivative-free optimization (DFO) method was developed for generic reservoir performance optimization problems including well-location optimization (WLO) and well-control optimization (WCO). DQN is designed to effectively locate multiple local optima of highly nonlinear optimization problems. However, its performance has neither been validated by realistic applications nor compared to other DFO methods. We have integrated DQN into a versatile field-development optimization platform designed specifically for iterative workflows enabled through distributed-parallel flow simulations. DQN is benchmarked against alternative DFO techniques, namely, the Broyden–Fletcher–Goldfarb–Shanno (BFGS) method hybridized with Direct Pattern Search (BFGS-DPS), Mesh Adaptive Direct Search (MADS), Particle Swarm Optimization (PSO), and Genetic Algorithm (GA). DQN is a multi-thread optimization method that distributes an ensemble of optimization tasks among multiple high-performance-computing nodes. Thus, it can locate multiple optima of the objective function in parallel within a single run. Simulation results computed from one DQN optimization thread are shared with others by updating a unified set of training data points composed of responses (implicit variables) of all successful simulation jobs. The sensitivity matrix at the current best solution of each optimization thread is approximated by a linear-interpolation technique using all or a subset of training-data points. The gradient of the objective function is analytically computed using the estimated sensitivities of implicit variables with respect to explicit variables. The Hessian matrix is then updated using the quasi-Newton method. A new search point for each thread is solved from a trust-region subproblem for the next iteration. In contrast, other DFO methods rely on a single-thread optimization paradigm that can only locate a single optimum. 
To locate multiple optima with such methods, one must repeat the same optimization process multiple times starting from different initial guesses. Moreover, simulation results generated from a single-thread optimization task cannot be shared with other tasks. Benchmarking results are presented for synthetic yet challenging WLO and WCO problems. Finally, the DQN method is field-tested on two realistic applications. DQN identified the global optimum with the least number of simulations and the shortest run time on a synthetic problem with a known solution. On other benchmarking problems without a known solution, DQN identified compatible local optima with reasonably smaller numbers of simulations compared to alternative techniques. Field-testing results reinforce the auspicious computational attributes of DQN. Overall, the results indicate that DQN is a novel and effective parallel algorithm for field-scale development optimization problems.
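
The abstract states that the gradient is computed analytically from the estimated sensitivities of implicit variables with respect to explicit variables. One plausible reading is the chain rule for an objective f(x, y(x)): the total gradient is the explicit partial derivative plus the transposed sensitivity matrix applied to the partial derivative with respect to y. The sketch below assumes exactly that reading, and every name in it is hypothetical.

```python
def total_gradient(df_dx, df_dy, sensitivity):
    """Chain-rule gradient of f(x, y(x)) with respect to x.

    df_dx:       partial derivatives of f w.r.t. the n explicit variables
    df_dy:       partial derivatives of f w.r.t. the m implicit variables
    sensitivity: m-by-n matrix with entries dy_i/dx_j (assumed to be given,
                 e.g. from an interpolation of simulation results)
    """
    m, n = len(df_dy), len(df_dx)
    return [df_dx[j] + sum(sensitivity[i][j] * df_dy[i] for i in range(m))
            for j in range(n)]
```

As a check, take f = x0 + y0² with y0 = 3·x0 + x1 at x = (1, 2): then y0 = 5, the partials are df/dx = (1, 0) and df/dy = (10), and the function returns (31, 10), matching direct differentiation.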


2021 ◽  
Vol 2 (Original research articles) ◽  
Author(s):  
Lisa C. Hegerhorst-Schultchen ◽  
Christian Kirches ◽  
Marc C. Steinbach

This work continues an ongoing effort to compare non-smooth optimization problems in abs-normal form to Mathematical Programs with Complementarity Constraints (MPCCs). We study general Nonlinear Programs with equality and inequality constraints in abs-normal form, so-called Abs-Normal NLPs, and their relation to equivalent MPCC reformulations. We introduce the concepts of Abadie's and Guignard's kink qualification and prove relations to MPCC-ACQ and MPCC-GCQ for the counterpart MPCC formulations. Due to non-uniqueness of a specific slack reformulation suggested in [10], the relations are non-trivial. It turns out that constraint qualifications of Abadie type are preserved. We also prove the weaker result that equivalence of Guignard's (and Abadie's) constraint qualifications for all branch problems holds, while the question of GCQ preservation remains open. Finally, we introduce M-stationarity and B-stationarity concepts for abs-normal NLPs and prove first-order optimality conditions corresponding to MPCC counterpart formulations.


2020 ◽  
Author(s):  
Qing Tao

The extrapolation strategy raised by Nesterov, which can accelerate the convergence rate of gradient descent methods by orders of magnitude when dealing with smooth convex objective, has led to tremendous success in training machine learning tasks. In this paper, we theoretically study its strength in the convergence of individual iterates of general non-smooth convex optimization problems, which we name individual convergence. We prove that Nesterov's extrapolation is capable of making the individual convergence of projected gradient methods optimal for general convex problems, which is now a challenging problem in the machine learning community. In light of this consideration, a simple modification of the gradient operation suffices to achieve optimal individual convergence for strongly convex problems, which can be regarded as making an interesting step towards the open question about SGD posed by Shamir (2012). Furthermore, the derived algorithms are extended to solve regularized non-smooth learning problems in stochastic settings. They can serve as an alternative to the most basic SGD especially in coping with machine learning problems, where an individual output is needed to guarantee the regularization structure while keeping an optimal rate of convergence. Typically, our method is applicable as an efficient tool for solving large-scale $l_1$-regularized hinge-loss learning problems. Several real experiments demonstrate that the derived algorithms not only achieve optimal individual convergence rates but also guarantee better sparsity than the averaged solution.
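
The distinction between an individual (last) iterate and an averaged output can be seen on a toy problem. The sketch below runs a plain projected subgradient method, without Nesterov extrapolation and therefore not the algorithm derived in the paper, on f(x) = |x - 1| over the interval [-2, 2], and returns both the last iterate and the running average. The step size 1/sqrt(t), the test function, and all names are assumptions chosen for illustration.

```python
import math


def projected_subgradient(subgrad, project, x0, iterations):
    """Run x <- project(x - subgrad(x) / sqrt(t)) for t = 1..iterations.

    Returns (last_iterate, average_of_iterates)."""
    x = x0
    running_sum = 0.0
    for t in range(1, iterations + 1):
        x = project(x - subgrad(x) / math.sqrt(t))
        running_sum += x
    return x, running_sum / iterations


def subgrad_abs(x):
    """A subgradient of f(x) = |x - 1| (0 is chosen at the kink x = 1)."""
    return (x > 1.0) - (x < 1.0)


def clip(x):
    """Euclidean projection onto the interval [-2, 2]."""
    return max(-2.0, min(2.0, x))


last, average = projected_subgradient(subgrad_abs, clip, -2.0, 1000)
```

Both outputs approach the minimizer x = 1 here. In regularized problems the practical difference is that averaging many iterates tends to destroy structure such as sparsity, which is why a guarantee on the individual iterate matters.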


SPE Journal ◽  
2021 ◽  
pp. 1-17
Author(s):  
Yixuan Wang ◽  
Faruk Alpak ◽  
Guohua Gao ◽  
Chaohui Chen ◽  
Jeroen Vink ◽  
...  

Summary Although it is possible to apply traditional optimization algorithms to determine the Pareto front of a multiobjective optimization problem, the computational cost is extremely high when the objective function evaluation requires solving a complex reservoir simulation problem and optimization cannot benefit from adjoint-based gradients. This paper proposes a novel workflow to solve bi-objective optimization problems using the distributed quasi-Newton (DQN) method, which is a well-parallelized and derivative-free optimization (DFO) method. Numerical tests confirm that the DQN method performs efficiently and robustly. The efficiency of the DQN optimizer stems from a distributed computing mechanism that effectively shares the available information discovered in prior iterations. Rather than performing multiple quasi-Newton optimization tasks in isolation, simulation results are shared among distinct DQN optimization tasks or threads. In this paper, the DQN method is applied to the optimization of a weighted average of two objectives, using different weighting factors for different optimization threads. In each iteration, the DQN optimizer generates an ensemble of search points (or simulation cases) in parallel, and a set of nondominated points is updated accordingly. Different DQN optimization threads, which use the same set of simulation results but different weighting factors in their objective functions, converge to different optima of the weighted average objective function. The nondominated points found in the last iteration form a set of Pareto-optimal solutions. Robustness as well as efficiency of the DQN optimizer originates from reliance on a large, shared set of intermediate search points. 
On the one hand, this set of search points is (much) smaller than the combined sets needed if all optimizations with different weighting factors were executed separately; on the other hand, the size of this set produces a high fault tolerance, which means that even if some simulations fail at a given iteration, the DQN method's distributed-parallel information-sharing protocol is designed and implemented such that the optimization process can still proceed to the next iteration. The proposed DQN optimization method is first validated on synthetic examples with analytical objective functions. Then, it is tested on well-location optimization (WLO) problems by maximizing the oil production and minimizing the water production. Furthermore, the proposed method is benchmarked against a bi-objective implementation of the mesh adaptive direct search (MADS) method, and the numerical results reinforce the auspicious computational attributes of DQN observed for the test problems. To the best of our knowledge, this is the first time that a well-parallelized and derivative-free DQN optimization method has been developed and tested on bi-objective optimization problems. The methodology proposed can help improve efficiency and robustness in solving complicated bi-objective optimization problems by taking advantage of model-based search algorithms with an effective information-sharing mechanism. NOTE: This paper is published as part of the 2021 SPE Reservoir Simulation Conference Special Issue.
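
Two ingredients of this workflow are easy to sketch in isolation: scoring one shared pool of simulation results with different weighting factors, and filtering that pool down to its nondominated points. The code below assumes each result is an (oil, water) pair with oil maximized and water minimized, as in the WLO tests described above. The function names and the specific weighted form `w * oil - (1 - w) * water` are assumptions and are not taken from the paper.

```python
def dominates(a, b):
    """True if result a is at least as good as b in both objectives
    (more oil, less water) and differs from b in at least one."""
    return a[0] >= b[0] and a[1] <= b[1] and a != b


def nondominated(points):
    """Keep the points that no other point in the pool dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def weighted_objective(point, w):
    """Hypothetical scalarization for one optimization thread, 0 <= w <= 1."""
    oil, water = point
    return w * oil - (1.0 - w) * water


def best_for_weight(points, w):
    """Best point in the shared pool for a thread with weighting factor w."""
    return max(points, key=lambda p: weighted_objective(p, w))
```

For the pool `[(10, 5), (8, 3), (9, 6)]`, the nondominated set is `[(10, 5), (8, 3)]`; a thread with `w = 0.9` selects `(10, 5)` and a thread with `w = 0.1` selects `(8, 3)`, so different weights draw different Pareto candidates from the same shared results.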


2020 ◽  
Vol 2020 ◽  
pp. 1-15
Author(s):  
Pengyuan Li ◽  
Zhan Wang ◽  
Dan Luo ◽  
Hongtruong Pham

The BFGS method is one of the most efficient quasi-Newton methods for solving small- and medium-size unconstrained optimization problems. To explore more of its properties, a modified two-parameter scaled BFGS method is presented in this paper. The intention of the modified scaled BFGS method is to improve the eigenvalue structure of the BFGS update. In this method, the first two terms and the last term of the standard BFGS update formula are scaled with two different positive parameters, and a new value of y_k is given. In addition, the Yuan-Wei-Lu line search is proposed. Under this line search, the modified two-parameter scaled BFGS method is globally convergent for nonconvex functions. Extensive numerical experiments show that this form of the scaled BFGS method outperforms the standard BFGS method and some similar scaled methods.
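
The scaling described above can be sketched on the standard BFGS update of a Hessian approximation B, namely B - (B s sᵀ B)/(sᵀ B s) + (y yᵀ)/(yᵀ s). In the code below the first two terms are multiplied by `delta` and the last by `gamma`. Both parameter names are assumptions, the paper's modified y_k and its rules for choosing the parameters are not reproduced, and the abstract does not say whether the paper scales B or its inverse.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def mat_vec(B, v):
    """Product of a matrix (list of rows) with a vector."""
    return [dot(row, v) for row in B]


def scaled_bfgs_update(B, s, y, delta=1.0, gamma=1.0):
    """Two-parameter scaled BFGS update of a symmetric matrix B.

    delta scales B - (B s s^T B)/(s^T B s); gamma scales (y y^T)/(y^T s).
    delta = gamma = 1 gives the standard BFGS update."""
    Bs = mat_vec(B, s)
    sBs = dot(s, Bs)
    ys = dot(y, s)
    if sBs <= 0 or ys <= 0:
        raise ValueError("need s^T B s > 0 and y^T s > 0")
    n = len(s)
    return [[delta * (B[i][j] - Bs[i] * Bs[j] / sBs) + gamma * y[i] * y[j] / ys
             for j in range(n)]
            for i in range(n)]
```

With `delta = gamma = 1` the updated matrix satisfies the secant equation B₊ s = y. With general scaling it satisfies B₊ s = gamma · y, since the `delta` term vanishes when applied to s.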


Author(s):  
Ion Necoara ◽  
Martin Takáč

Abstract In this paper we consider large-scale smooth optimization problems with multiple linear coupled constraints. Due to the non-separability of the constraints, arbitrary random sketching would not be guaranteed to work. Thus, we first investigate necessary and sufficient conditions for the sketch sampling to have well-defined algorithms. Based on these sampling conditions we develop new sketch descent methods for solving general smooth linearly constrained problems, in particular, random sketch descent (RSD) and accelerated random sketch descent (A-RSD) methods. To our knowledge, this is the first convergence analysis of RSD algorithms for optimization problems with multiple non-separable linear constraints. For the general case, when the objective function is smooth and non-convex, we prove for the non-accelerated variant a sublinear rate in expectation for an appropriate optimality measure. In the smooth convex case, we derive for both algorithms, non-accelerated and A-RSD, sublinear convergence rates in the expected values of the objective function. Additionally, if the objective function satisfies a strong convexity type condition, both algorithms converge linearly in expectation. In special cases, where complexity bounds are known for some particular sketching algorithms, such as coordinate descent methods for optimization problems with a single linear coupled constraint, our theory recovers the best known bounds. Finally, we present several numerical examples to illustrate the performance of our new algorithms.
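
The special case mentioned near the end, coordinate descent under a single linear coupled constraint, shows why the sketch has to be compatible with the constraint. With the constraint sum(x) = const, changing one coordinate alone breaks feasibility, but moving a random pair (i, j) along e_i - e_j keeps it. The sketch below applies this to the separable quadratic 0.5·Σ(x_i - c_i)² with an exact step along that direction. It is a toy instance under these assumptions, not the RSD or A-RSD method of the paper, and all names are hypothetical.

```python
import random


def pairwise_descent(c, x0, iterations, seed=0):
    """Minimize 0.5 * sum((x_i - c_i)**2) subject to sum(x) == sum(x0)
    by exact minimization along random directions e_i - e_j."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(iterations):
        i, j = rng.sample(range(len(x)), 2)  # two distinct coordinates
        g_i, g_j = x[i] - c[i], x[j] - c[j]  # partial derivatives
        t = -(g_i - g_j) / 2.0               # exact line-search step
        x[i] += t                            # moving along e_i - e_j
        x[j] -= t                            # leaves sum(x) unchanged
    return x
```

For `c = [1, 2, 3, 4]` and a feasible start `x0 = [0, 0, 0, 0]`, the constrained minimizer is `c` shifted by -2.5 in every coordinate, and the iterates approach it while the sum stays at zero.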

