Cholesky Factorization on Heterogeneous CPU and GPU Systems

Fast kernel k-means clustering using incomplete Cholesky factorization

Applied Mathematics and Computation ◽

10.1016/j.amc.2021.126037 ◽

2021 ◽

Vol 402 ◽

pp. 126037

Author(s):

Li Chen ◽

Shuisheng Zhou ◽

Jiajun Ma ◽

Mingliang Xu

Keyword(s):

Cholesky Factorization ◽

Incomplete Cholesky Factorization

Download Full-text

High-performance sampling of generic determinantal point processes

Philosophical Transactions of The Royal Society A Mathematical Physical and Engineering Sciences ◽

10.1098/rsta.2019.0059 ◽

2020 ◽

Vol 378 (2166) ◽

pp. 20190059 ◽

Cited By ~ 1

Author(s):

Jack Poulson

Keyword(s):

Point Process ◽

Spectral Decomposition ◽

High Performance ◽

Point Processes ◽

Cholesky Factorization ◽

Determinantal Point Processes ◽

Determinantal Point Process ◽

Decomposition Approach ◽

Map Inference ◽

Sampling Schemes

Determinantal point processes (DPPs) were introduced by Macchi (Macchi 1975 Adv. Appl. Probab. 7 , 83–122) as a model for repulsive (fermionic) particle distributions. But their recent popularization is largely due to their usefulness for encouraging diversity in the final stage of a recommender system (Kulesza & Taskar 2012 Found. Trends Mach. Learn. 5 , 123–286). The standard sampling scheme for finite DPPs is a spectral decomposition followed by an equivalent of a randomly diagonally pivoted Cholesky factorization of an orthogonal projection, which is only applicable to Hermitian kernels and has an expensive set-up cost. Researchers Launay et al. 2018 ( http://arxiv.org/abs/1802.08429 ); Chen & Zhang 2018 NeurIPS ( https://papers.nips.cc/paper/7805-fast-greedy-map-inference-for-determinantal-point-process-to-improve-recommendation-diversity.pdf ) have begun to connect DPP sampling to LDL H factorizations as a means of avoiding the initial spectral decomposition, but existing approaches have only outperformed the spectral decomposition approach in special circumstances, where the number of kept modes is a small percentage of the ground set size. This article proves that trivial modifications of LU and LDL H factorizations yield efficient direct sampling schemes for non-Hermitian and Hermitian DPP kernels, respectively. Furthermore, it is experimentally shown that even dynamically scheduled, shared-memory parallelizations of high-performance dense and sparse-direct factorizations can be trivially modified to yield DPP sampling schemes with essentially identical performance. The software developed as part of this research, Catamari ( hodgestar.com/catamari ) is released under the Mozilla Public License v.2.0. It contains header-only, C++14 plus OpenMP 4.0 implementations of dense and sparse-direct, Hermitian and non-Hermitian DPP samplers. This article is part of a discussion meeting issue ‘Numerical algorithms for high-performance computational science’.

Download Full-text

Parsimonious kernel extreme learning machine in primal via Cholesky factorization

Neural Networks ◽

10.1016/j.neunet.2016.04.009 ◽

2016 ◽

Vol 80 ◽

pp. 95-109 ◽

Cited By ~ 14

Author(s):

Yong-Ping Zhao

Keyword(s):

Extreme Learning Machine ◽

Cholesky Factorization ◽

Kernel Extreme Learning Machine ◽

Learning Machine

Download Full-text

LOW AND HIGH ORDER FINITE ELEMENT METHOD: EXPERIENCE IN SEISMIC MODELING

Journal of Computational Acoustics ◽

10.1142/s0218396x94000233 ◽

1994 ◽

Vol 02 (04) ◽

pp. 371-422 ◽

Cited By ~ 47

Author(s):

E. PADOVANI ◽

E. PRIOLO ◽

G. SERIANI

Keyword(s):

Finite Element Method ◽

Finite Element ◽

Computational Efficiency ◽

High Order ◽

Cholesky Factorization ◽

Complex Geometries ◽

Spectral Elements ◽

Seismic Modeling ◽

Static Condensation ◽

Element Method

The finite element method (FEM) is a numerical technique well suited to solving problems of elastic wave propagation in complex geometries and heterogeneous media. The main advantages are that very irregular grids can be used, free surface boundary conditions can be easily taken into account, a good reconstruction is possible of irregular surface topography, and complex geometries, such as curved, dipping and rough interfaces, intrusions, cusps, and holes can be defined. The main drawbacks of the classical approach are the need for a large amount of memory, low computational efficiency, and the possible appearance of spurious effects. In this paper we describe some experience in improving the computational efficiency of a finite element code based on a global approach, and used for seismic modeling in geophysical oil exploration. Results from the use of different methods and models run on a mini-superworkstation APOLLO DN10000 are reported and compared. With Chebyshev spectral elements, great accuracy can be reached with almost no numerical artifacts. Static condensation of the spectral element's internal nodes dramatically reduces memory requirements and CPU time. Time integration performed with the classical implicit Newmark scheme is very accurate but not very efficient. Due to the high sparsity of the matrices, the use of compressed storage is shown to greatly reduce not only memory requirements but also computing time. The operation which most affects the performance is the matrix-by-vector product; an effective programming of this subroutine for the storage technique used is decisive. The conjugate gradient method preconditioned by incomplete Cholesky factorization provides, in general, a good compromise between efficiency and memory requirements. Spectral elements greatly increase its efficiency, since the number of iterations is reduced. The most efficient and accurate method is a hybrid iterative-direct solution of the linear system arising from the static condensation of high order elements. The size of 2D models that can be handled in a reasonable time on this kind of computer is nowadays hardly sufficient, and significant 3D modeling is completely unfeasible. However the introduction of new FEM algorithms coupled with the use of new computer architectures is encouraging for the future.

Download Full-text

Interactively Exploring the Connection between Nested Dissection Orderings for Parallel Cholesky Factorization and Vertex Separators

2014 IEEE International Parallel & Distributed Processing Symposium Workshops ◽

10.1109/ipdpsw.2014.125 ◽

2014 ◽

Cited By ~ 2

Author(s):

H. Martin Bucker ◽

M. Ali Rostami

Keyword(s):

Cholesky Factorization ◽

Nested Dissection ◽

Vertex Separators

Download Full-text

FSCHOL: An OpenCL-based HPC Framework for Accelerating Sparse Cholesky Factorization on FPGAs

10.1109/sbac-pad53543.2021.00032 ◽

2021 ◽

Author(s):

Erfan Bank Tavakoli ◽

Michael Riera ◽

Masudul Hassan Quraishi ◽

Fengbo Ren

Keyword(s):

Cholesky Factorization ◽

Sparse Cholesky Factorization

Download Full-text

Efficient Parallel Algorithms for Dense Cholesky Factorization

Parallel Computation - Lecture Notes in Computer Science ◽

10.1007/3-540-49164-3_69 ◽

1999 ◽

pp. 600-602

Author(s):

Eunice E. Santos ◽

Pei-Yue Pauline Chu

Keyword(s):

Parallel Algorithms ◽

Cholesky Factorization

Download Full-text

Linear statistical models for stationary sequences and related algorithms for Cholesky factorization of Toeplitz matrices

IEEE Transactions on Acoustics Speech and Signal Processing ◽

10.1109/tassp.1987.1165035 ◽

1987 ◽

Vol 35 (1) ◽

pp. 29-42 ◽

Cited By ~ 14

Author(s):

C. Demeure ◽

L. Scharf

Keyword(s):

Statistical Models ◽

Toeplitz Matrices ◽

Cholesky Factorization ◽

Stationary Sequences ◽

Linear Statistical Models

Download Full-text

divand-1.0: <i>n</i>-dimensional variational data analysis for ocean observations

Geoscientific Model Development ◽

10.5194/gmd-7-225-2014 ◽

2014 ◽

Vol 7 (1) ◽

pp. 225-241 ◽

Cited By ~ 14

Author(s):

A. Barth ◽

J.-M. Beckers ◽

C. Troupin ◽

A. Alvera-Azcárate ◽

L. Vandenbulcke

Keyword(s):

Cost Function ◽

Dimensional Analysis ◽

Computational Cost ◽

Posteriori Error ◽

Cholesky Factorization ◽

Ocean Current ◽

Posteriori Error Estimate ◽

Two Dimensional ◽

A Posteriori ◽

A Posteriori Error

Abstract. A tool for multidimensional variational analysis (divand) is presented. It allows the interpolation and analysis of observations on curvilinear orthogonal grids in an arbitrary high dimensional space by minimizing a cost function. This cost function penalizes the deviation from the observations, the deviation from a first guess and abruptly varying fields based on a given correlation length (potentially varying in space and time). Additional constraints can be added to this cost function such as an advection constraint which forces the analysed field to align with the ocean current. The method decouples naturally disconnected areas based on topography and topology. This is useful in oceanography where disconnected water masses often have different physical properties. Individual elements of the a priori and a posteriori error covariance matrix can also be computed, in particular expected error variances of the analysis. A multidimensional approach (as opposed to stacking two-dimensional analysis) has the benefit of providing a smooth analysis in all dimensions, although the computational cost is increased. Primal (problem solved in the grid space) and dual formulations (problem solved in the observational space) are implemented using either direct solvers (based on Cholesky factorization) or iterative solvers (conjugate gradient method). In most applications the primal formulation with the direct solver is the fastest, especially if an a posteriori error estimate is needed. However, for correlated observation errors the dual formulation with an iterative solver is more efficient. The method is tested by using pseudo-observations from a global model. The distribution of the observations is based on the position of the Argo floats. The benefit of the three-dimensional analysis (longitude, latitude and time) compared to two-dimensional analysis (longitude and latitude) and the role of the advection constraint are highlighted. The tool divand is free software, and is distributed under the terms of the General Public Licence (GPL) (http://modb.oce.ulg.ac.be/mediawiki/index.php/divand).

Download Full-text

Modifying the Cholesky Factorization on MIMD Distributed Memory Machines

Applied Optimization - High Performance Algorithms and Software in Nonlinear Optimization ◽

10.1007/978-1-4613-3279-4_9 ◽

1998 ◽

pp. 125-141 ◽

Cited By ~ 1

Author(s):

Marco D’Apuzzo ◽

Valentina De Simone ◽

Marina Marino ◽

Gerardo Toraldo

Keyword(s):

Distributed Memory ◽

Cholesky Factorization ◽

Distributed Memory Machines

Download Full-text