Fast Randomized Non-Hermitian Eigensolvers Based on Rational Filtering and Matrix Partitioning

2021 ◽  
pp. S791-S815
Author(s):  
Vassilis Kalantzis ◽  
Yuanzhe Xi ◽  
Lior Horesh
Author(s):  
Akrem Benatia ◽  
Weixing Ji ◽  
Yizhuo Wang ◽  
Feng Shi

The sparse matrix–vector multiplication (SpMV) kernel dominates the computing cost in numerous applications. Most existing studies on improving this kernel target a single type of processing unit, mainly multicore CPUs or graphics processing units (GPUs), and have not explored the potential of the recent, rapidly emerging CPU-GPU heterogeneous platforms. To take full advantage of these heterogeneous systems, the input sparse matrix has to be partitioned across the available processing units. The partitioning problem is made more challenging by the existence of many sparse formats whose performance depends on both the sparsity of the input matrix and the hardware used. Thus, the best performance depends not only on how the input sparse matrix is partitioned but also on which sparse format is used for each partition. To address this challenge, we propose in this article a new CPU-GPU heterogeneous method for computing the SpMV kernel that combines different sparse formats to achieve better performance and better utilization of CPU-GPU heterogeneous platforms. The proposed solution horizontally partitions the input matrix into multiple block-rows and predicts their best sparse formats using machine learning-based performance models. A mapping algorithm is then used to assign the block-rows to the CPU and GPU(s) available in the system. Our experimental results using real-world large unstructured sparse matrices on two different machines show a noticeable performance improvement.
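
The block-row idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the CSR layout, the function names, and in particular the `pick_device` rule (a fixed nonzeros-per-row threshold standing in for the machine learning-based performance models) are assumptions made for illustration, and everything runs on the CPU in plain Python.

```python
from typing import List, Tuple

# A CSR matrix as (row_ptr, col_idx, values); this layout is an assumption
# of the sketch, not a format prescribed by the article.
CSR = Tuple[List[int], List[int], List[float]]


def split_block_rows(a: CSR, n_blocks: int) -> List[CSR]:
    """Split a CSR matrix horizontally into contiguous block-rows."""
    row_ptr, col_idx, vals = a
    n_rows = len(row_ptr) - 1
    step = max(1, -(-n_rows // n_blocks))  # ceiling division
    blocks = []
    for start in range(0, n_rows, step):
        stop = min(start + step, n_rows)
        lo, hi = row_ptr[start], row_ptr[stop]
        # Rebase the row pointers so each block is a standalone CSR matrix.
        local_ptr = [p - lo for p in row_ptr[start:stop + 1]]
        blocks.append((local_ptr, col_idx[lo:hi], vals[lo:hi]))
    return blocks


def spmv(a: CSR, x: List[float]) -> List[float]:
    """Reference CSR sparse matrix-vector product y = A x."""
    row_ptr, col_idx, vals = a
    return [
        sum(vals[k] * x[col_idx[k]] for k in range(row_ptr[i], row_ptr[i + 1]))
        for i in range(len(row_ptr) - 1)
    ]


def pick_device(block: CSR) -> str:
    """Hypothetical stand-in for a learned performance model:
    blocks averaging at least 2 nonzeros per row are labeled 'gpu'."""
    row_ptr = block[0]
    n_rows = len(row_ptr) - 1
    avg_nnz = row_ptr[-1] / n_rows if n_rows else 0.0
    return "gpu" if avg_nnz >= 2.0 else "cpu"


def partitioned_spmv(a: CSR, x: List[float], n_blocks: int):
    """Compute y = A x block-row by block-row and record the device plan."""
    y, plan = [], []
    for block in split_block_rows(a, n_blocks):
        plan.append(pick_device(block))  # label only; no real GPU dispatch here
        y.extend(spmv(block, x))         # block results concatenate in row order
    return y, plan
```

Because the partition is horizontal, each block-row produces a disjoint slice of the output vector, so the per-block results can simply be concatenated regardless of which device or format handled each block.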


2005 ◽  
Vol 127 (1) ◽  
pp. 12-23 ◽  
Author(s):  
Li Chen ◽  
Zhendong Ding ◽  
Simon Li

We have developed a formal method for decomposition of complex design problems in two phases: dependency analysis and matrix partitioning. The most distinctive characteristic of this method is its support for cost-effective re-decomposition (as is often required in decomposition solution synthesis), where dependency analysis serves as a platform for enabling re-decomposition. Yet, this requires that the result of the dependency analysis be robust and thus reusable for re-decomposition. In this paper, after revealing the deficiency in the current practice of dependency analysis, we present an enhanced dependency analysis method that is built on an ordinary tree structure (instead of a binary tree structure). This new approach, which is more systematic, ensures robust dependency analysis, whose result is insensitive to the arrangement of a tree structure in tree-based dependency analysis. A complete set of tree-based algorithms is also provided, along with their applications to two design examples.


2010 ◽  
Vol 310 (12) ◽  
pp. 1793-1801
Author(s):  
Yoomi Rho