Approximate Element Computational Time for Domain Decomposition in Parallel Finite Element Code

Author(s):  
Thuy T. Le
2011 ◽ Vol 462-463 ◽ pp. 605-610
Author(s):  
Hiroshi Kawai ◽  
Masao Ogino ◽  
Ryuji Shioya ◽  
Shinobu Yoshimura

Solving a large-scale elasto-plastic dynamic analysis of a complicated structure, such as a seismic analysis of a nuclear power plant or a skyscraper, requires a new implementation strategy for parallel finite element codes, one suited to parallel supercomputers with modern multi-core / many-core scalar CPUs. In this work, we propose a new design and programming style that optimizes the performance of a parallel finite element code based on the domain decomposition method (DDM) on multi-core CPUs, taking their cache hierarchy into account. In place of the traditional, memory access-intensive approach, DS (Direct solver-based matrix Storage), two new matrix storage-free approaches are proposed: DSF (Direct solver-based matrix Storage-Free) and ISF (Iterative solver-based matrix Storage-Free). Our new DSF/ISF-based DDM solver is not only more efficient in memory usage than existing DS-based DDM solvers but also comparable to them in computational time on multi-core CPU architectures.


Author(s):  
J. Rodriguez ◽  
J. Sun

The primary objective of this study was the implementation and comparison of domain decomposition algorithms for a parallel Finite Element Method (FEM) used in Computational Structural Mechanics (CSM). A parallelized FEM code exploits the concurrency inherent in the method to improve its computational efficiency. To use a coarser granularity in the parallel computation, the parallel FEM needs to partition its domain into subdomains in a proper manner. It is therefore necessary to find domain decomposition algorithms that satisfy the special requirements of a parallel FEM. The domain decomposition algorithms investigated in this study physically decompose a meshed domain into a desired number of subdomains. To address the requirements of the parallel FEM, these algorithms are able to handle any type of two- or three-dimensional domain, balance the workloads across the processors, minimize the communication overhead among the processors, maintain the integrity of each subdomain, minimize the overall bandwidth of the resulting system matrix, and require only a small amount of CPU time for the decomposition. Modifications to existing decomposition algorithms, such as the single wave propagating method and the bisecting method using vertical/horizontal cuts, are investigated. A new algorithm, based on the proposed multiple wave propagating method and the bisecting method using middle cuts, is formulated. These algorithms are compared with each other using performance criteria based on the overall FEM code and on the algorithms themselves. An optimal combination algorithm is proposed. This combination is flexible and, in a sense, intelligent: several criteria are suggested to guide and organize the decomposition according to the general geometry of the mesh. The combination algorithm possesses the desirable features of both the wave propagating and the bisecting methods.
As an application, the present algorithm was incorporated into an existing parallel FEM code, and some improvements to this code were made. The overall efficiency of the FEM code increased.
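
The workload-balancing idea behind bisecting methods can be illustrated with a minimal sketch. The code below is a generic recursive coordinate bisection, not the authors' vertical/horizontal-cut or middle-cut algorithm: the function name `bisect_elements`, the use of element centroids, and the rule of cutting along the axis of largest extent are all assumptions made for illustration. It assumes the requested number of subdomains does not exceed the number of elements.

```python
def bisect_elements(centroids, ids, n_parts):
    """Split element ids into n_parts groups of near-equal size.

    Hypothetical helper: recursively cuts the current group along the
    coordinate axis on which its centroids are most spread out.
    """
    if n_parts == 1:
        return [list(ids)]
    dims = len(centroids[ids[0]])
    # Extent of the current group along each coordinate axis.
    extents = [
        max(centroids[i][d] for i in ids) - min(centroids[i][d] for i in ids)
        for d in range(dims)
    ]
    axis = extents.index(max(extents))
    ordered = sorted(ids, key=lambda i: centroids[i][axis])
    left_parts = n_parts // 2
    # Place the cut so each side receives elements in proportion
    # to the number of subdomains it still has to produce.
    cut = len(ordered) * left_parts // n_parts
    return (bisect_elements(centroids, ordered[:cut], left_parts)
            + bisect_elements(centroids, ordered[cut:], n_parts - left_parts))


# Example mesh (assumed): a 4 x 2 grid of unit square elements.
centroids = {k: (k % 4 + 0.5, k // 4 + 0.5) for k in range(8)}
parts = bisect_elements(centroids, list(range(8)), 4)
```

This sketch balances element counts only. It does not address the other requirements listed in the abstract, such as minimizing interface communication or the bandwidth of the system matrix.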


Author(s):  
Noriyuki Kushida ◽  
Hiroshi Okuda ◽  
Genki Yagawa

In this paper, the convergence behavior of the large-scale parallel finite element method for stress-singular problems was investigated. The convergence behavior of iterative solvers depends on the efficiency of the preconditioners. However, the efficiency of preconditioners may be influenced by the domain decomposition that is necessary for parallel FEM. In this study the following results were obtained: the conjugate gradient method without preconditioning and the diagonal scaling preconditioned conjugate gradient method were, as expected, not influenced by the domain decomposition. The symmetric successive over-relaxation (SSOR) preconditioned conjugate gradient method converged up to 6% faster when the stress-singular area was contained in one subdomain.
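
For readers unfamiliar with diagonal scaling, the following is a minimal serial sketch of a conjugate gradient iteration preconditioned by the inverse of the matrix diagonal. It is a textbook formulation on a small dense matrix, not the authors' parallel solver; the function name `pcg_diagonal`, the tolerance, and the test matrix are assumptions. Because the preconditioner acts on each unknown independently, it needs no information about how the unknowns are grouped into subdomains, which is consistent with the insensitivity to decomposition reported above.

```python
def pcg_diagonal(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A using CG
    with diagonal scaling (Jacobi preconditioning)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                                   # r = b - A x with x = 0
    inv_diag = [1.0 / A[i][i] for i in range(n)]  # preconditioner M^-1
    z = [inv_diag[i] * r[i] for i in range(n)]
    p = list(z)
    rz = sum(r[i] * z[i] for i in range(n))
    b_norm = sum(v * v for v in b) ** 0.5
    for it in range(max_iter):
        # Stop when the residual is small relative to the right-hand side.
        if sum(v * v for v in r) ** 0.5 <= tol * b_norm:
            return x, it
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = sum(r[i] * z[i] for i in range(n))
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter


# Small symmetric positive definite system (assumed test data).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x, iterations = pcg_diagonal(A, b)
```

An SSOR preconditioner, by contrast, involves triangular sweeps that couple neighboring unknowns, so a parallel version that applies it subdomain by subdomain can depend on where the partition boundaries fall.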

