Programming the Linpack Benchmark for the IBM PowerXCell 8i Processor

Michael Kistler; John Gunnels; Daniel Brokenshire; Brad Benton

doi:10.1155/2009/401691

Scientific Programming ◽

10.1155/2009/401691 ◽

2009 ◽

Vol 17 (1-2) ◽

pp. 43-57 ◽

Cited By ~ 4

Author(s):

Michael Kistler ◽

John Gunnels ◽

Daniel Brokenshire ◽

Brad Benton

Keyword(s):

High Speed ◽

Double Precision ◽

Data Movement ◽

Processing Elements ◽

Cell Broadband Engine ◽

Design And Implementation ◽

Computational Capability ◽

High Speed Data ◽

Linpack Benchmark ◽

And Performance

In this paper we present the design and implementation of the Linpack benchmark for the IBM BladeCenter QS22, which incorporates two IBM PowerXCell 8i processors. The PowerXCell 8i is a new implementation of the Cell Broadband Engine™ architecture and contains a set of special-purpose processing cores known as Synergistic Processing Elements (SPEs). The SPEs can be used as computational accelerators to augment the main PowerPC processor. The added computational capability of the SPEs results in a peak double precision floating point capability of 108.8 GFLOPS. We explain how we modified the standard open source implementation of Linpack to accelerate key computational kernels using the SPEs of the PowerXCell 8i processors. We describe in detail the implementation and performance of the computational kernels and also explain how we employed the SPEs for high-speed data movement and reformatting. The result of these modifications is a Linpack benchmark optimized for the IBM PowerXCell 8i processor that achieves 170.7 GFLOPS on a BladeCenter QS22 with 32 GB of DDR2 SDRAM memory. Our implementation of Linpack also supports clusters of QS22s, and was used to achieve a result of 11.1 TFLOPS on a cluster of 84 QS22 blades. We compare our results on a single BladeCenter QS22 with the base Linpack implementation without SPE acceleration to illustrate the benefits of our optimizations.
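The figures quoted in this abstract can be related with a little arithmetic. The sketch below is illustrative only: it assumes the 108.8 GFLOPS peak is per processor (so a two-processor QS22 peaks at twice that), and the function names are hypothetical, not taken from the paper.

```python
# Figures quoted in the abstract.
PEAK_PER_PROCESSOR_GFLOPS = 108.8   # assumed to be a per-processor peak
PROCESSORS_PER_BLADE = 2            # a QS22 incorporates two PowerXCell 8i
MEASURED_BLADE_GFLOPS = 170.7       # single-blade Linpack result
CLUSTER_TFLOPS = 11.1               # result on the cluster
CLUSTER_BLADES = 84                 # number of QS22 blades in the cluster


def blade_peak_gflops() -> float:
    """Theoretical double-precision peak of one blade (assumption: 2 x per-processor peak)."""
    return PEAK_PER_PROCESSOR_GFLOPS * PROCESSORS_PER_BLADE


def blade_efficiency() -> float:
    """Fraction of the assumed blade peak reached by the single-blade run."""
    return MEASURED_BLADE_GFLOPS / blade_peak_gflops()


def cluster_gflops_per_blade() -> float:
    """Average sustained GFLOPS per blade in the cluster run."""
    return CLUSTER_TFLOPS * 1000.0 / CLUSTER_BLADES


if __name__ == "__main__":
    print(f"blade peak:        {blade_peak_gflops():.1f} GFLOPS")
    print(f"blade efficiency:  {blade_efficiency():.1%}")
    print(f"cluster per blade: {cluster_gflops_per_blade():.1f} GFLOPS")
```

Under that assumption the single-blade run reaches roughly 78% of peak, and the cluster run averages about 132 GFLOPS per blade.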

Design and implementation of an ultra-high speed data acquisition system for HRRATI

2009 IEEE Symposium on Industrial Electronics & Applications ◽

10.1109/isiea.2009.5356476 ◽

2009 ◽

Cited By ~ 1

Author(s):

Bi Xin ◽

Du Jinsong ◽

Fan Wei

Keyword(s):

Data Acquisition ◽

High Speed ◽

Data Acquisition System ◽

Acquisition System ◽

Design And Implementation ◽

Ultra High Speed ◽

High Speed Data Acquisition ◽

High Speed Data

Adjacency-Hash-Table Based Public Auditing for Data Integrity in Mobile Cloud Computing

Wireless Communications and Mobile Computing ◽

10.1155/2018/3471312 ◽

2018 ◽

Vol 2018 ◽

pp. 1-12

Author(s):

Wenqi Chen ◽

Hui Tian ◽

Chin-Chen Chang ◽

Fulin Nan ◽

Jing Lu

Keyword(s):

Cloud Computing ◽

High Speed ◽

Security Analysis ◽

Hash Table ◽

The State ◽

Public Auditing ◽

The Arts ◽

High Speed Data ◽

And Performance ◽

Data Updating

Cloud storage, one of the core services of cloud computing, provides an effective way to solve the storage and management problems caused by high-speed data growth. Thus, a growing number of organizations and individuals store their data in the cloud. However, because data ownership and data management are separated, it is difficult for users to check the integrity of their data in the traditional way. Many researchers have therefore focused on developing protocols that can remotely check the integrity of data in the cloud. In this paper, we propose a novel public auditing protocol based on the adjacency-hash table, in which dynamic auditing and data updating are more efficient than in state-of-the-art schemes. Moreover, with such an authentication structure, computation and communication costs can be reduced effectively. The security analysis and a performance evaluation based on comprehensive experiments demonstrate that our protocol achieves all the desired properties and outperforms state-of-the-art protocols in computing overhead for updating and verification.

Design and Implementation of High Speed Data Transmission for X-Band Dual Polarized Weather Radar

2019 International Conference on Meteorology Observations (ICMO) ◽

10.1109/icmo49322.2019.9025887 ◽

2019 ◽

Author(s):

Gu Jian ◽

DuYu Ming ◽

LiYi Cheng

Keyword(s):

Data Transmission ◽

High Speed ◽

Weather Radar ◽

Design And Implementation ◽

X Band ◽

High Speed Data ◽

Dual Polarized ◽

Speed Data Transmission

Development and performance evaluation of highly efficient retransmission scheme for high-speed data transfer via satellite

Electronics and Communications in Japan (Part I Communications) ◽

10.1002/ecja.4410730610 ◽

1990 ◽

Vol 73 (6) ◽

pp. 99-108

Author(s):

Tsutomu Nakamura ◽

Ryooichi Sasaki ◽

Nobuyuki Fujikura ◽

Hiroshi Morita

Keyword(s):

Performance Evaluation ◽

High Speed ◽

Data Transfer ◽

Highly Efficient ◽

High Speed Data ◽

And Performance

Design and Implementation of High Speed Data Transmission for X-Band Dual Polarized Weather Radar

2019 International Conference on Meteorology Observations (ICMO) ◽

10.1109/icmo49322.2019.9025868 ◽

2019 ◽

Author(s):

Gu Jian ◽

DuYu Ming ◽

LiYi Cheng

Keyword(s):

Data Transmission ◽

High Speed ◽

Weather Radar ◽

Design And Implementation ◽

X Band ◽

High Speed Data ◽

Dual Polarized ◽

Speed Data Transmission

Design and Implementation of Digital Modulator In High-Speed Data Transmission System

2019 International Conference on Control, Automation and Information Sciences (ICCAIS) ◽

10.1109/iccais46528.2019.9074568 ◽

2019 ◽

Author(s):

Lili Zhang ◽

Wen Kuang

Keyword(s):

Data Transmission ◽

High Speed ◽

Transmission System ◽

Design And Implementation ◽

High Speed Data ◽

Data Transmission System ◽

Speed Data Transmission

Lithium niobate devices in switching and multiplexing

Philosophical Transactions of the Royal Society of London. Series A, Mathematical and Physical Sciences ◽

10.1098/rsta.1989.0060 ◽

1989 ◽

Vol 329 (1603) ◽

pp. 83-92 ◽

Cited By ~ 4

Keyword(s):

Lithium Niobate ◽

High Speed ◽

Optical Switching ◽

Integrated Optics Devices ◽

Photonic Switching ◽

Time Space ◽

Trade Offs ◽

The Status ◽

High Speed Data ◽

And Performance

Integrated-optics devices in lithium niobate have reached a significant maturity in recent years, and several complex devices have been demonstrated. In addition to performing modulation of light in fibre-optic transmission systems, lithium niobate devices currently offer the only components for photonic switching. Thus lithium niobate devices can be used as spatial, temporal and wavelength switches in high-speed and low-speed systems. In these systems electronic signals control the lithium niobate switches, which process the optical information and which are optically interfaced to optical fibres. Hence I am not concerned with all-optical switching. Examples of applications are multiplexing and demultiplexing of high-speed data streams, bit-by-bit or word-by-word switching in, for example, time-space-time stages or in access couplers in high-speed bus systems. Switch arrays, generally operating at lower speeds (below 1 GHz), can be used for network rearrangement, digital crossconnect, protection switching and generally in situations where the frequency and code transparency of the devices can be used to advantage. The status of lithium niobate devices for switching is reviewed, and performance limitations (including those imposed by polarization properties) and trade-offs are discussed, emphasizing time- and space-switching devices and applications.

The Design and Implementation of High-Speed Data Acquisition System Based on NIOS II

2010 International Conference on Computing, Control and Industrial Engineering ◽

10.1109/ccie.2010.201 ◽

2010 ◽

Author(s):

Wang Wei ◽

Zhong Guidong

Keyword(s):

Data Acquisition ◽

High Speed ◽

Data Acquisition System ◽

Acquisition System ◽

Nios II ◽

Design And Implementation ◽

High Speed Data Acquisition ◽

High Speed Data

Wavelength-division-multiplexing (WDM)-based integrated electronic–photonic switching network (EPSN) for high-speed data processing and transportation

Nanophotonics ◽

10.1515/nanoph-2020-0356 ◽

2020 ◽

Vol 9 (15) ◽

pp. 4579-4588

Author(s):

Chenghao Feng ◽

Zhoufeng Ying ◽

Zheng Zhao ◽

Jiaqi Gu ◽

David Z. Pan ◽

...

Keyword(s):

Wavelength Division Multiplexing ◽

High Speed ◽

High Performance ◽

Performance Enhancement ◽

Switching Network ◽

Photonic Switching ◽

Wavelength Division ◽

High Speed Data ◽

And Performance ◽

Performance Computing

Integrated photonics offers attractive solutions for realizing combinational logic for high-performance computing. The integrated photonic chips can be further optimized using multiplexing techniques such as wavelength-division multiplexing (WDM). In this paper, we propose a WDM-based electronic–photonic switching network (EPSN) to realize the functions of the binary decoder and the multiplexer, which are fundamental elements in microprocessors for data transportation and processing. We experimentally demonstrate its practicality by implementing a 3–8 (three inputs, eight outputs) switching network operating at 20 Gb/s. Detailed performance analysis and performance enhancement techniques are also given in this paper.
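For readers unfamiliar with the two logic functions named in this abstract, the following is a minimal software model of a 3-to-8 binary decoder and an 8-to-1 multiplexer built on top of it. It only illustrates the truth tables; it says nothing about how the photonic network realizes them, and the function names are hypothetical.

```python
from typing import List, Sequence


def decode_3_to_8(a2: int, a1: int, a0: int) -> List[int]:
    """Return eight one-hot outputs; output i is 1 when the inputs encode i."""
    index = (a2 << 2) | (a1 << 1) | a0
    return [1 if i == index else 0 for i in range(8)]


def mux_8_to_1(data: Sequence[int], a2: int, a1: int, a0: int) -> int:
    """Select one of eight data bits using the decoder's one-hot outputs."""
    if len(data) != 8:
        raise ValueError("expected exactly eight data bits")
    select = decode_3_to_8(a2, a1, a0)
    # AND each data bit with its select line, then OR the results together.
    return int(any(d and s for d, s in zip(data, select)))


if __name__ == "__main__":
    print(decode_3_to_8(1, 0, 1))
    print(mux_8_to_1([0, 1, 0, 0, 0, 0, 0, 0], 0, 0, 1))
```

Building the multiplexer from the decoder mirrors the abstract's pairing of the two functions: the address bits pick one line, and that line gates the corresponding data input.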

Design and Implementation of 6-Stage 64-bit MIPS Pipelined Architecture

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f1201.0886s219 ◽

2019 ◽

Vol 8 (6S2) ◽

pp. 790-796

Keyword(s):

Low Power ◽

High Speed ◽

High Performance ◽

Random Access ◽

Instruction Set ◽

Cache Memories ◽

Design And Implementation ◽

Pipelined Architecture ◽

RISC Processor ◽

High Speed Data

Pipelining overlaps the execution of multiple instructions to make better use of time and hardware units. This paper presents the design and implementation of a 6-stage pipelined architecture for a high-performance 64-bit Microprocessor without Interlocked Pipeline Stages (MIPS) based Reduced Instruction Set Computing (RISC) processor. In this work, a pre-fetching unit, a forwarding unit, a branch and jump prediction unit, and a hazard unit are combined to reduce hazards. A low-power unit is used to minimize power consumption. Cache memories, other devices, and especially balanced pipeline stages optimize speed. A DDR4 SDRAM (Double Data Rate type 4 Synchronous Dynamic Random Access Memory) controller is employed in this pipeline to achieve high-speed data transfers and to manage the entire system efficiently. Low-power, low-delay flip-flops are used in the pipeline registers, which implicitly enhances the performance of the system. The proposed method provides better results than existing models. The simulation and synthesis results of the proposed architecture are evaluated with Xilinx 14.7 software, and supporting graphs are plotted with MATLAB.
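The benefit of overlapping instructions can be estimated with the standard textbook cycle count for an ideal pipeline. The sketch below assumes one cycle per stage and no hazards or stalls, which the paper's design does not claim to achieve; it is not the authors' evaluation method, and the function names are hypothetical.

```python
def unpipelined_cycles(instructions: int, stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return instructions * stages


def pipelined_cycles(instructions: int, stages: int) -> int:
    """Ideal pipeline: fill it once, then retire one instruction per cycle."""
    if instructions == 0:
        return 0
    return stages + instructions - 1


def ideal_speedup(instructions: int, stages: int) -> float:
    """Ratio of unpipelined to pipelined cycle counts (assumes no stalls)."""
    return unpipelined_cycles(instructions, stages) / pipelined_cycles(instructions, stages)


if __name__ == "__main__":
    # Six stages, as in the architecture described above; 1000 instructions is arbitrary.
    print(pipelined_cycles(1000, 6))
    print(round(ideal_speedup(1000, 6), 2))
```

For long instruction streams the ideal speedup approaches the stage count, which is why hazard reduction and balanced stages matter: every stall pulls the real figure below that bound.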
