Performance Optimization of Multithreaded 2D Fast Fourier Transform on Multicore Processors Using Load Imbalancing Parallel Computing Method

IEEE Access ◽  
2018 ◽  
Vol 6 ◽  
pp. 64202-64224 ◽  
Author(s):  
Semyon Khokhriakov ◽  
Ravi Reddy Manumachu ◽  
Alexey Lastovetsky
2015 ◽  
Vol 2015 ◽  
pp. 1-7
Author(s):  
Pablo Soto-Quiros

This paper presents a parallel implementation of a kind of discrete Fourier transform (DFT): the vector-valued DFT. The vector-valued DFT is a novel tool for analyzing the spectra of vector-valued discrete-time signals. The parallel implementation is developed within a mathematical framework built on a set of block matrix operations. These block matrix operations support the analysis, design, and implementation of parallel algorithms on multicore processors. In this work, the mathematical framework is implemented and investigated experimentally using MATLAB with the Parallel Computing Toolbox. We found that using multicore processors and a parallel computing environment is advantageous for reducing the high execution time. Additionally, speedup increases as the number of logical processors and the length of the signal increase.


Author(s):  
David W Walker

This article investigates the recursive Morton ordering of two-dimensional arrays as an efficient way to access hierarchical memory across a range of heterogeneous computer platforms, from manycore devices and multicore processors to clusters and distributed environments. A brief overview of previous research in this area is given, and algorithms that make use of Morton ordering are described. These are then used to investigate the efficiency of the Morton ordering approach through performance experiments on different processors. In particular, timing results are presented for matrix multiplication, Cholesky factorization and fast Fourier transform algorithms. The use of the Morton ordering approach leads naturally to algorithms that are recursive and exposes parallelism at each level of recursion. Thus, the approach advocated in this article not only provides convenient and efficient access to hierarchical memory but also provides a basis for exploiting parallelism.
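
As an illustration of the recursive Morton (Z-order) layout mentioned in the abstract above, the following is a minimal sketch in Python. It is not taken from the article: the function names `morton_index` and `morton_layout` are hypothetical, and the convention of placing column bits in the even positions and row bits in the odd positions is an assumption (the opposite convention is equally common).

```python
def morton_index(row: int, col: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of row and col into one Z-order index.

    Assumed convention: column bits go to even bit positions,
    row bits go to odd bit positions.
    """
    index = 0
    for b in range(bits):
        index |= ((col >> b) & 1) << (2 * b)      # column bit b -> position 2b
        index |= ((row >> b) & 1) << (2 * b + 1)  # row bit b -> position 2b+1
    return index


def morton_layout(n: int) -> list:
    """Return an n x n table whose entry (r, c) is the Morton index of (r, c)."""
    return [[morton_index(r, c) for c in range(n)] for r in range(n)]


if __name__ == "__main__":
    # Print the storage position of each element of a 4 x 4 array.
    for line in morton_layout(4):
        print(" ".join(f"{v:2d}" for v in line))
```

Under this convention, each 2 x 2 block of a 4 x 4 array occupies four consecutive storage positions, and the four blocks are themselves laid out in the same Z pattern. This is the property that makes the ordering recursive and keeps neighbouring elements close together in memory.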

