Fast sparse matrix-vector multiplication on graphics processing unit for finite element analysis

A serious computational bottleneck in finite element analysis today is the solution of the underlying system of equations. To alleviate this problem, researchers have proposed using graphics processing units (GPUs) for the fast iterative solution of such equations. Indeed, researchers have shown that a GPU implementation of double-precision sparse matrix-vector multiplication, the operation that underlies all iterative methods, is approximately an order of magnitude faster than an optimized CPU implementation. Unfortunately, fast matrix-vector multiplication alone is insufficient; a good preconditioner is necessary for rapid convergence. Furthermore, most modern preconditioners, such as incomplete Cholesky, are expensive to compute and cannot be easily ported to the GPU. In this paper, we propose a special class of preconditioners for the analysis of thin structures, such as beams and plates. The proposed preconditioners are developed by combining the multigrid method with the recently developed dual-representation method for thin structures. It is shown that these preconditioners are computationally inexpensive, perform better than standard preconditioners, and can be easily ported to the GPU.
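
The interplay between sparse matrix-vector multiplication and preconditioning described in this abstract can be illustrated with a minimal sketch of a preconditioned conjugate gradient loop. Everything here is an assumption made for illustration: the function names (`csr_spmv`, `pcg`), the CSR storage layout, the toy 3x3 matrix, and in particular the Jacobi (diagonal) preconditioner, which is only a simple stand-in and not the multigrid/dual-representation preconditioner the paper proposes. The code runs on the CPU in plain Python; on a GPU the row loop in `csr_spmv` is the part that would typically be parallelized.

```python
from math import sqrt

def csr_spmv(indptr, indices, data, x):
    """Compute y = A @ x for A stored in CSR (row pointers, column indices, values)."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pcg(indptr, indices, data, b, inv_diag, tol=1e-10, max_iter=200):
    """Preconditioned CG for a symmetric positive definite A.

    inv_diag holds 1 / A[i][i], i.e. a Jacobi preconditioner (an assumed stand-in).
    Returns the approximate solution and the number of iterations used.
    """
    x = [0.0] * len(b)
    r = list(b)                                   # residual b - A x for x = 0
    z = [m * ri for m, ri in zip(inv_diag, r)]    # apply the preconditioner
    p = list(z)
    rz = dot(r, z)
    b_norm = sqrt(dot(b, b))
    for it in range(1, max_iter + 1):
        Ap = csr_spmv(indptr, indices, data, p)   # the one SpMV per iteration
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if sqrt(dot(r, r)) <= tol * b_norm:
            return x, it
        z = [m * ri for m, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

# Toy tridiagonal matrix [[2, -1, 0], [-1, 2, -1], [0, -1, 2]] (illustrative only).
indptr = [0, 2, 5, 7]
indices = [0, 1, 0, 1, 2, 1, 2]
data = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
b = [1.0, 0.0, 1.0]
inv_diag = [0.5, 0.5, 0.5]
x, iterations = pcg(indptr, indices, data, b, inv_diag)
```

Each iteration costs one SpMV plus one preconditioner application, so a preconditioner that is cheap to apply and reduces the iteration count lowers the total work. That trade-off is the one the abstract argues for.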

GPU Accelerated Reconstruction in Compton Scattering Tomography Using Matrix Compression

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.519-520.102 ◽ 2014 ◽ Vol 519-520 ◽ pp. 102-107

Author(s): Yu Fei Yu ◽ Bin Yan ◽ Biao Wang ◽ Lei Li ◽ Yu Han ◽ ...

Keyword(s): Compton Scattering ◽ Graphics Processing Unit ◽ Sparse Matrix ◽ Reconstruction Algorithm ◽ Processing Unit ◽ Matrix Vector Multiplication ◽ Speedup Ratio ◽ Parallel Features ◽ Graphics Processing ◽ Matrix Vector

An acceleration strategy for the TV-ADM reconstruction algorithm in Compton scattering tomography (CST) is proposed. First, because the CST projection matrices are sparse, they are stored in the CSR and ELL sparse-matrix formats, which greatly reduces memory consumption. Then, sparse matrix-vector multiplication (SpMV) is used to accelerate the projection and back-projection steps. Finally, exploiting its parallel features, the TV-ADM algorithm is computed on a graphics processing unit (GPU). Numerical experiments show that TV-ADM with the presented acceleration strategy achieves a 96-fold speedup and a 224-fold memory compression ratio without loss of precision.
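
To make the two storage formats named in this abstract concrete, the following is a minimal sketch of CSR and ELL storage with an SpMV routine for each. It assumes a small hand-written matrix rather than a real CST projection matrix, and all function names (`dense_to_csr`, `csr_to_ell`, `spmv_csr`, `spmv_ell`) are hypothetical helpers introduced for illustration; the paper's actual GPU implementation is not reproduced here.

```python
def dense_to_csr(A):
    """Return (indptr, indices, data) holding only the nonzeros of A, row by row."""
    indptr, indices, data = [0], [], []
    for row in A:
        for j, v in enumerate(row):
            if v != 0:
                indices.append(j)
                data.append(v)
        indptr.append(len(data))
    return indptr, indices, data

def csr_to_ell(indptr, indices, data):
    """Pad every row to the longest row length; padded slots hold value 0.0 and column 0."""
    n_rows = len(indptr) - 1
    width = max(indptr[i + 1] - indptr[i] for i in range(n_rows))
    ell_cols = [[0] * width for _ in range(n_rows)]
    ell_vals = [[0.0] * width for _ in range(n_rows)]
    for i in range(n_rows):
        for slot, k in enumerate(range(indptr[i], indptr[i + 1])):
            ell_cols[i][slot] = indices[k]
            ell_vals[i][slot] = data[k]
    return ell_cols, ell_vals

def spmv_csr(indptr, indices, data, x):
    # Row i owns the slice data[indptr[i]:indptr[i + 1]]; row lengths may differ.
    return [sum(data[k] * x[indices[k]] for k in range(indptr[i], indptr[i + 1]))
            for i in range(len(indptr) - 1)]

def spmv_ell(ell_cols, ell_vals, x):
    # Every row has the same number of slots; padded slots contribute 0.0.
    return [sum(v * x[c] for c, v in zip(cols, vals))
            for cols, vals in zip(ell_cols, ell_vals)]

# Toy 4x4 sparse matrix (illustrative only, not a CST projection matrix).
A = [[4.0, 0.0, 0.0, 1.0],
     [0.0, 3.0, 0.0, 0.0],
     [0.0, 0.0, 2.0, 0.0],
     [5.0, 0.0, 0.0, 6.0]]
x = [1.0, 2.0, 3.0, 4.0]

indptr, indices, data = dense_to_csr(A)
ell_cols, ell_vals = csr_to_ell(indptr, indices, data)
y_csr = spmv_csr(indptr, indices, data, x)
y_ell = spmv_ell(ell_cols, ell_vals, x)

stored_dense = sum(len(row) for row in A)        # 16 values
stored_csr = len(data)                           # 6 values
stored_ell = sum(len(row) for row in ell_vals)   # 8 values, including padding
```

CSR stores exactly the nonzeros, while ELL trades some padding for uniform row lengths, which gives a regular access pattern. Which format is preferable depends on how uneven the row lengths are, which is presumably why the abstract mentions both.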

Comparison of GPU-Based Parallel Assembly and Assembly-Free Sparse Matrix Vector Multiplication for Finite Element Analysis of Three-Dimensional Structures

Proceedings of the Fifteenth International Conference on Civil, Structural and Environmental Engineering Computing ◽ 10.4203/ccp.108.222 ◽ 2015 ◽ Cited By ~ 1

Author(s): A. Akbariyeh ◽ B.H. Dennis ◽ B.P. Wang ◽ K.L. Lawrence

Keyword(s): Finite Element Analysis ◽ Finite Element ◽ Sparse Matrix ◽ Three Dimensional ◽ Element Analysis ◽ Matrix Vector Multiplication ◽ Parallel Assembly ◽ Matrix Vector

Iterative sparse matrix-vector multiplication for accelerating the block Wiedemann algorithm over GF(2) on multi-graphics processing unit systems

Concurrency and Computation Practice and Experience ◽ 10.1002/cpe.2896 ◽ 2012 ◽ Vol 25 (4) ◽ pp. 586-603 ◽ Cited By ~ 4

Author(s): Bertil Schmidt ◽ Hans Aribowo ◽ Hoang-Vu Dang

Keyword(s): Graphics Processing Unit ◽ Sparse Matrix ◽ Processing Unit ◽ Matrix Vector Multiplication ◽ Graphics Processing ◽ Matrix Vector

A novel multi-graphics processing unit parallel optimization framework for the sparse matrix-vector multiplication

Concurrency and Computation Practice and Experience ◽ 10.1002/cpe.3936 ◽ 2016 ◽ Vol 29 (5) ◽ pp. e3936 ◽ Cited By ~ 10

Author(s): Jiaquan Gao ◽ Yu Wang ◽ Jun Wang

Keyword(s): Graphics Processing Unit ◽ Sparse Matrix ◽ Parallel Optimization ◽ Processing Unit ◽ Optimization Framework ◽ Matrix Vector Multiplication ◽ Graphics Processing ◽ Matrix Vector

A new diagonal storage for efficient implementation of sparse matrix–vector multiplication on graphics processing unit

Concurrency and Computation Practice and Experience ◽ 10.1002/cpe.6230 ◽ 2021

Author(s): Guixia He ◽ Qi Chen ◽ Jiaquan Gao

Keyword(s): Graphics Processing Unit ◽ Sparse Matrix ◽ Efficient Implementation ◽ Processing Unit ◽ Matrix Vector Multiplication ◽ Graphics Processing ◽ Matrix Vector

Finite element method completely implemented for graphic processor units using parallel algorithm libraries

The International Journal of High Performance Computing Applications ◽ 10.1177/1094342017694703 ◽ 2017 ◽ Vol 33 (1) ◽ pp. 53-66 ◽ Cited By ~ 1

Author(s): Franz Pichler ◽ Gundolf Haase

Keyword(s): Finite Element ◽ Graphics Processing Unit ◽ Computational Cost ◽ Processing Unit ◽ Time Step ◽ Device Architecture ◽ Transient Problems ◽ Speed Up ◽ Automotive Batteries ◽ Graphics Processing

A finite element code is developed in which all of the computationally expensive steps are performed on a graphics processing unit via the THRUST and PARALUTION libraries. The code focuses on the simulation of transient problems, where the computations repeated at every time step account for the computational cost. It is used to solve the partial and ordinary differential equations that arise in thermal-runaway simulations of automotive batteries. The speed-up obtained by using the graphics processing unit for every critical step is compared against the single-core and multi-threaded solutions, which are also supported by the chosen libraries. In this way, a high total speed-up on the graphics processing unit is achieved without programming a single classical Compute Unified Device Architecture (CUDA) kernel.

A three‐stage graphics processing unit‐based finite element analyses matrix generation strategy for unstructured meshes

International Journal for Numerical Methods in Engineering ◽ 10.1002/nme.6383 ◽ 2020 ◽ Vol 121 (17) ◽ pp. 3824-3848 ◽ Cited By ~ 1

Author(s): Subhajit Sanfui ◽ Deepak Sharma

Keyword(s): Finite Element ◽ Graphics Processing Unit ◽ Unstructured Meshes ◽ Processing Unit ◽ Finite Element Analyses ◽ Graphics Processing

High-Speed Nonlinear Finite Element Analysis for Surgical Simulation Using Graphics Processing Units

IEEE Transactions on Medical Imaging ◽ 10.1109/tmi.2007.913112 ◽ 2008 ◽ Vol 27 (5) ◽ pp. 650-663 ◽ Cited By ~ 109

Author(s): Z.A. Taylor ◽ M. Cheng ◽ S. Ourselin

Keyword(s): Finite Element Analysis ◽ Finite Element ◽ Graphics Processing Units ◽ High Speed ◽ Surgical Simulation ◽ Nonlinear Finite Element Analysis ◽ Nonlinear Finite Element ◽ Element Analysis ◽ Graphics Processing
