Design and Implementation of a Large Scale Tree-Based QR Decomposition Using a 3D Virtual Systolic Array and a Lightweight Runtime

Author(s):  
Ichitaro Yamazaki ◽  
Jakub Kurzak ◽  
Piotr Luszczek ◽  
Jack Dongarra
2014 ◽  
Vol 24 (04) ◽  
pp. 1442004

A systolic array provides an alternative computing paradigm to the von Neumann architecture. Though its hardware implementation has failed as a paradigm to design integrated circuits in the past, we are now discovering that the systolic array as a software virtualization layer can lead to an extremely scalable execution paradigm. To demonstrate this scalability, in this paper, we design and implement a 3D virtual systolic array to compute a tile QR decomposition of a tall-and-skinny dense matrix. Our implementation is based on a state-of-the-art algorithm that factorizes a panel based on a tree-reduction. Freed from the constraint of a planar layout, we present a three-dimensional virtual systolic array architecture for this algorithm. Using a runtime developed as a part of the Parallel Ultra Light Systolic Array Runtime (PULSAR) project, we demonstrate on a Cray-XT5 machine how our virtual systolic array can be mapped to a large-scale machine and obtain excellent parallel performance. This is an important contribution since such a QR decomposition is used, for example, to compute a least squares solution of an overdetermined system, which arises in many scientific and engineering problems.
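The tree-reduction idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the PULSAR implementation or the 3D systolic array; it is a serial, binary-tree version under stated assumptions. The names `tsqr_r`, `lstsq_via_tsqr`, and `block_rows` are hypothetical, and each row block is assumed to have at least as many rows as the matrix has columns.

```python
import numpy as np

def tsqr_r(A, block_rows):
    """Return the R factor of a tall-and-skinny A using a binary tree reduction.

    Illustrative only: the helper name and the block_rows parameter are assumptions.
    """
    m = A.shape[0]
    # Leaf level: factor each row block independently (parallel in a real code).
    blocks = [A[i:i + block_rows] for i in range(0, m, block_rows)]
    rs = [np.linalg.qr(blk, mode="r") for blk in blocks]
    # Reduction levels: stack neighbouring R factors and factor the stack again.
    while len(rs) > 1:
        merged = []
        for i in range(0, len(rs), 2):
            if i + 1 < len(rs):
                stacked = np.vstack([rs[i], rs[i + 1]])
                merged.append(np.linalg.qr(stacked, mode="r"))
            else:
                merged.append(rs[i])  # odd one out moves up a level unchanged
        rs = merged
    return rs[0]

def lstsq_via_tsqr(A, b, block_rows):
    """Solve min ||Ax - b||_2 by factoring the augmented matrix [A b]."""
    n = A.shape[1]
    R_aug = tsqr_r(np.column_stack([A, b]), block_rows)
    # The top-left n-by-n block is R; the last column holds Q^T b.
    return np.linalg.solve(R_aug[:n, :n], R_aug[:n, n])
```

Factoring the augmented matrix avoids forming Q explicitly, which is one common way to obtain a least squares solution of an overdetermined system from the R factor alone.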


Integration ◽  
2016 ◽  
Vol 53 ◽  
pp. 1-13 ◽  
Author(s):  
Ning Ma ◽  
Zhuo Zou ◽  
Zhonghai Lu ◽  
Lirong Zheng

2021 ◽  
Vol I (I) ◽  
Author(s):  
S Lakshmi Narayanan ◽  
Robert Theivadas J

MIMO is a wireless technology that uses large-scale antenna arrays to transfer more data at the same time and to increase spectral efficiency. To achieve a high data rate with less bandwidth, a decomposition algorithm is used. Among the various decomposition algorithms, QR decomposition stands out with a low bit error rate (BER), but its computational complexity is prohibitively high when the system incorporates a large number of antennas. This paper presents a low-complexity sorted QR decomposition (SQRD) algorithm for MIMO. SQRD uses a precoding technique at the transmitter that decomposes the channel so that data can be sent in parallel.
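For orientation, a sorted QR decomposition can be sketched as modified Gram-Schmidt that, at each step, picks the remaining column with the smallest norm. This is a commonly used textbook formulation and not necessarily the low-complexity variant this paper proposes; the function name `sorted_qr` and the returned permutation `perm` are assumptions made for illustration.

```python
import numpy as np

def sorted_qr(H):
    """Sorted QR via modified Gram-Schmidt: H[:, perm] = Q @ R.

    Illustrative sketch only; assumes H has full column rank.
    """
    n = H.shape[1]
    Q = H.astype(complex)            # astype returns a copy, H is not modified
    R = np.zeros((n, n), dtype=complex)
    perm = np.arange(n)
    for i in range(n):
        # Choose the remaining column with the smallest residual norm.
        norms = np.sum(np.abs(Q[:, i:]) ** 2, axis=0)
        k = i + int(np.argmin(norms))
        # Swap columns i and k in Q, R, and the permutation record.
        Q[:, [i, k]] = Q[:, [k, i]]
        R[:, [i, k]] = R[:, [k, i]]
        perm[[i, k]] = perm[[k, i]]
        # Normalise column i, then remove its component from later columns.
        R[i, i] = np.linalg.norm(Q[:, i])
        Q[:, i] /= R[i, i]
        for l in range(i + 1, n):
            R[i, l] = np.vdot(Q[:, i], Q[:, l])
            Q[:, l] -= R[i, l] * Q[:, i]
    return Q, R, perm
```

Here `H` stands for the channel matrix, and `perm` records the column order chosen by the sorting step, which a receiver would use to restore the original stream order after detection.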


2016 ◽  
Vol 45 (9) ◽  
pp. 935003
Author(s):  
曹 强 Cao Qiang ◽  
严文瑞 Yan Wenrui ◽  
姚 杰 Yao Jie ◽  
谢长生 Xie Changsheng
