A Model for Weak Scaling to Many GPUs at the Basis of the Linpack Benchmark

Random circuit block-encoded matrix and a proposal of quantum LINPACK benchmark

Physical Review A ◽

10.1103/physreva.103.062412 ◽

2021 ◽

Vol 103 (6) ◽

Author(s):

Yulong Dong ◽

Lin Lin

Keyword(s):

Linpack Benchmark

Research of Register Pressure Aware Loop Unrolling Optimizations for Compiler

MATEC Web of Conferences ◽

10.1051/matecconf/201822803008 ◽

2018 ◽

Vol 228 ◽

pp. 03008

Author(s):

Xuehua Liu ◽

Liping Ding ◽

Yanfeng Li ◽

Guangxuan Chen ◽

Jin Du

Keyword(s):

Finite Number ◽

Infinite Number ◽

Performance Degradation ◽

Transformation Process ◽

Fine Grained ◽

Loop Unrolling ◽

Average Improvement ◽

Register Pressure ◽

Linpack Benchmark ◽

Loop Optimizations

Register pressure is a long-standing problem for compilers because of the mismatch between the infinite number of pseudo registers and the finite number of hard registers. Excessive register pressure may result in register spilling, which in turn leads to performance degradation. Many optimizations, especially loop optimizations, suffer from register spilling. To reduce register pressure and thereby improve the effectiveness of the compiler, this research takes register pressure into account to improve the loop unrolling optimization during the transformation process. In addition, a register-pressure-aware transformation can reduce the performance overhead of some fine-grained randomization transformations that can be used to defend against ROP attacks. Experiments showed a peak improvement of about 3.6% and an average improvement of about 1% for the SPEC CPU 2006 benchmarks, and a peak improvement of about 3% and an average improvement of about 1% for the LINPACK benchmark.
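
The idea of letting register pressure limit loop unrolling can be illustrated with a minimal sketch. This is not the authors' algorithm: the pressure model (a fixed number of live values shared across iterations plus a fixed number per unrolled copy) and the names `choose_unroll_factor`, `live_per_iteration`, `shared_live`, and `hard_registers` are assumptions made only for illustration.

```python
def choose_unroll_factor(live_per_iteration, shared_live, hard_registers, max_factor=8):
    """Return the largest power-of-two unroll factor whose estimated
    register pressure still fits in the available hard registers.

    Assumed (hypothetical) pressure model:
        pressure(factor) = shared_live + factor * live_per_iteration
    """
    factor = 1        # factor 1 means "do not unroll"
    candidate = 2
    while candidate <= max_factor:
        estimated_pressure = shared_live + candidate * live_per_iteration
        if estimated_pressure > hard_registers:
            break     # unrolling further would likely cause spilling
        factor = candidate
        candidate *= 2
    return factor


if __name__ == "__main__":
    # A small loop body on a machine with 16 hard registers.
    print(choose_unroll_factor(live_per_iteration=3, shared_live=4, hard_registers=16))
    # A loop body that is already register-hungry.
    print(choose_unroll_factor(live_per_iteration=10, shared_live=8, hard_registers=16))
```

A production compiler would estimate pressure from liveness analysis rather than from two fixed counts, but the trade-off is the same: stop increasing the unroll factor once the estimate exceeds the hard register budget.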

Programming the Linpack Benchmark for the IBM PowerXCell 8i Processor

Scientific Programming ◽

10.1155/2009/401691 ◽

2009 ◽

Vol 17 (1-2) ◽

pp. 43-57 ◽

Cited By ~ 4

Author(s):

Michael Kistler ◽

John Gunnels ◽

Daniel Brokenshire ◽

Brad Benton

Keyword(s):

High Speed ◽

Double Precision ◽

Data Movement ◽

Processing Elements ◽

Cell Broadband Engine ◽

Design And Implementation ◽

Computational Capability ◽

High Speed Data ◽

Linpack Benchmark ◽

And Performance

In this paper we present the design and implementation of the Linpack benchmark for the IBM BladeCenter QS22, which incorporates two IBM PowerXCell 8i processors. The PowerXCell 8i is a new implementation of the Cell Broadband Engine™ architecture and contains a set of special-purpose processing cores known as Synergistic Processing Elements (SPEs). The SPEs can be used as computational accelerators to augment the main PowerPC processor. The added computational capability of the SPEs results in a peak double precision floating point capability of 108.8 GFLOPS. We explain how we modified the standard open source implementation of Linpack to accelerate key computational kernels using the SPEs of the PowerXCell 8i processors. We describe in detail the implementation and performance of the computational kernels and also explain how we employed the SPEs for high-speed data movement and reformatting. The result of these modifications is a Linpack benchmark optimized for the IBM PowerXCell 8i processor that achieves 170.7 GFLOPS on a BladeCenter QS22 with 32 GB of DDR2 SDRAM memory. Our implementation of Linpack also supports clusters of QS22s, and was used to achieve a result of 11.1 TFLOPS on a cluster of 84 QS22 blades. We compare our results on a single BladeCenter QS22 with the base Linpack implementation without SPE acceleration to illustrate the benefits of our optimizations.
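
The figures quoted in this abstract can be related to each other with some simple arithmetic. The sketch below assumes that the 108.8 GFLOPS peak applies to one PowerXCell 8i processor, so that a two-processor QS22 blade peaks at twice that value; the abstract does not state this explicitly. The function names are hypothetical.

```python
# Figures taken from the abstract above.
PEAK_GFLOPS_PER_PROCESSOR = 108.8   # assumption: peak is per processor
PROCESSORS_PER_BLADE = 2
SINGLE_BLADE_GFLOPS = 170.7
CLUSTER_TFLOPS = 11.1
CLUSTER_BLADES = 84


def single_blade_efficiency():
    """Fraction of the (assumed) blade peak reached by the single-blade run."""
    blade_peak = PEAK_GFLOPS_PER_PROCESSOR * PROCESSORS_PER_BLADE
    return SINGLE_BLADE_GFLOPS / blade_peak


def cluster_gflops_per_blade():
    """Average sustained GFLOPS per blade in the 84-blade cluster run."""
    return CLUSTER_TFLOPS * 1000.0 / CLUSTER_BLADES


def cluster_scaling_efficiency():
    """Per-blade cluster performance relative to the single-blade result."""
    return cluster_gflops_per_blade() / SINGLE_BLADE_GFLOPS


if __name__ == "__main__":
    print(f"single-blade efficiency: {single_blade_efficiency():.1%}")
    print(f"cluster GFLOPS per blade: {cluster_gflops_per_blade():.1f}")
    print(f"cluster scaling efficiency: {cluster_scaling_efficiency():.1%}")
```

Under that assumption, the single-blade run reaches a little over three quarters of peak, and each blade in the cluster sustains a little over three quarters of the single-blade rate.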

Complex version of high performance computing LINPACK benchmark (HPL)

Concurrency and Computation Practice and Experience ◽

10.1002/cpe.1476 ◽

2009 ◽

Cited By ~ 2

Author(s):

R. F. Barrett ◽

T. H. F. Chan ◽

E. F. D'Azevedo ◽

E. F. Jaeger ◽

K. Wong ◽

...

Keyword(s):

High Performance Computing ◽

High Performance ◽

Linpack Benchmark ◽

Performance Computing

LINPACK Benchmark

Encyclopedia of Parallel Computing ◽

10.1007/978-0-387-09766-4_155 ◽

2011 ◽

pp. 1033-1036 ◽

Cited By ~ 15

Author(s):

Jack Dongarra ◽

Piotr Luszczek ◽

Paul Feautrier ◽

Field G. Zee ◽

Ernie Chan ◽

...

Keyword(s):

Linpack Benchmark

A Message-Passing Hardware/Software Cosimulation Environment for Reconfigurable Computing Systems

International Journal of Reconfigurable Computing ◽

10.1155/2009/376232 ◽

2009 ◽

Vol 2009 ◽

pp. 1-9

Author(s):

Manuel Saldaña ◽

Emanuel Ramalho ◽

Paul Chow

Keyword(s):

Reconfigurable Computing ◽

Message Passing ◽

High Performance ◽

System Level ◽

Application Development ◽

Reconfigurable Computers ◽

Development Tool ◽

Verification Tools ◽

Linpack Benchmark ◽

Xilinx Fpga

High-performance reconfigurable computers (HPRCs) provide a mix of standard processors and FPGAs to collectively accelerate applications. This introduces new design challenges, such as the need for portable programming models across HPRCs and system-level verification tools. To address the need for cosimulating a complete heterogeneous application using both software and hardware in an HPRC, we have created a tool called the Message-passing Simulation Framework (MSF). We have used it to simulate and develop an interface enabling an MPI-based approach to exchange data between X86 processors and hardware engines inside FPGAs. The MSF can also be used as an application development tool that enables multiple FPGAs in simulation to exchange messages amongst themselves and with X86 processors. As an example, we simulate a LINPACK benchmark hardware core using an Intel-FSB-Xilinx-FPGA platform to quickly prototype the hardware, to test the communications, and to verify the benchmark results.
