A General-Purpose AMG Linear Solver for High Performance Computing

Author(s):  
G. Isotton ◽  
M. Frigo ◽  
N. Spiezia ◽  
S. Koric ◽  
Q. Lu ◽  
...  
2021 ◽  
Vol 4 (3) ◽  
pp. 40
Author(s):  
Abdul Majeed

During the ongoing pandemic of the novel coronavirus disease 2019 (COVID-19), the latest technologies such as artificial intelligence (AI), blockchain, learning paradigms (machine, deep, smart, few-shot, extreme learning, etc.), high-performance computing (HPC), Internet of Medical Things (IoMT), and Industry 4.0 have played a vital role. These technologies helped to contain the disease's spread by predicting contaminated people/places, as well as forecasting future trends. In this article, we provide insights into the applications of machine learning (ML) and high-performance computing (HPC) in the era of COVID-19. We discuss the person-specific data that are being collected to lower the COVID-19 spread and highlight the remarkable opportunities they provide for knowledge extraction leveraging low-cost ML and HPC techniques. We demonstrate the role of ML and HPC in the context of the COVID-19 era with successful implementations or propositions in three contexts: (i) ML and HPC use in the data life cycle, (ii) ML and HPC use in analytics on COVID-19 data, and (iii) general-purpose applications of both techniques in COVID-19's arena. In addition, we discuss the privacy and security issues and the architecture of a prototype system to demonstrate the proposed research. Finally, we discuss the challenges of the available data and highlight the issues that hinder the applicability of ML and HPC solutions to them.


Author(s):  
Dorian Krause ◽  
Philipp Thörnig

JURECA is a petaflop-scale, general-purpose supercomputer operated by Jülich Supercomputing Centre at Forschungszentrum Jülich. Utilizing a flexible cluster architecture based on T-Platforms V-Class blades and a balanced selection of best-in-class components, the system supports a wide variety of high-performance computing and data analytics workloads and offers a low entrance barrier for new users.


2021 ◽  
Vol 43 (5) ◽  
pp. C335-C357
Author(s):  
Giovanni Isotton ◽  
Matteo Frigo ◽  
Nicolò Spiezia ◽  
Carlo Janna

2013 ◽  
Vol 378 ◽  
pp. 534-538
Author(s):  
Fan Zhang ◽  
Xing Guo Luo ◽  
Xing Ming Zhang

In this paper, a Reconfigurable Multi-Processor Architecture (RCMPA) is designed. Through parallel execution across multiple processors and flexible system configuration, the system can adapt to a variety of applications. Each computing component in the system consists of a general-purpose microprocessor, a reconfigurable FPGA, and SRAMs. The general-purpose microprocessor handles control of a variety of tasks, scheduling, and some computing functions. The FPGA offers sufficient flexibility, extensibility, and high-speed interconnect features. The SRAMs offer a range of storage structures with high-speed read/write access and high-density storage units.


2021 ◽  
Vol 18 (3) ◽  
pp. 1-26
Author(s):  
Daniel Thuerck ◽  
Nicolas Weber ◽  
Roberto Bifulco

A large portion of the recent performance increase in the High Performance Computing (HPC) and Machine Learning (ML) domains is fueled by accelerator cards. Many popular ML frameworks support accelerators by organizing computations as a computational graph over a set of highly optimized, batched general-purpose kernels. While this approach simplifies the kernels’ implementation for each individual accelerator, the increasing heterogeneity among accelerator architectures for HPC complicates the creation of portable and extensible libraries of such kernels. Therefore, using a generalization of the CUDA community’s warp register cache programming idiom, we propose a new programming idiom (CoRe) and a virtual architecture model (PIRCH), abstracting over SIMD and SIMT paradigms. We define and automate the mapping process from a single source to PIRCH’s intermediate representation and develop backends that issue code for three different architectures: Intel AVX512, NVIDIA GPUs, and NEC SX-Aurora. Code generated by our source-to-source compiler for batched kernels, borG, competes favorably with vendor-tuned libraries and is up to 2× faster than hand-tuned kernels across architectures.
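The batched-kernel organization described above can be illustrated with a small CPU-side sketch (NumPy here purely for illustration; borG itself emits code for AVX512, NVIDIA GPUs, and NEC SX-Aurora). The idea is that one batched call applies the same small operation to many independent problem instances, rather than launching one kernel per instance:

```python
import numpy as np

# A "batched" kernel applies one small operation to many independent
# problem instances at once -- here, 1000 independent 8x8 mat-vec products.
rng = np.random.default_rng(0)
batch, n = 1000, 8
A = rng.standard_normal((batch, n, n))
x = rng.standard_normal((batch, n))

# One batched call replaces 1000 separate per-instance invocations.
y = np.einsum('bij,bj->bi', A, x)

# Reference: the same result computed one instance at a time.
y_ref = np.stack([A[b] @ x[b] for b in range(batch)])
assert np.allclose(y, y_ref)
```

The performance question the paper addresses is precisely how to generate the per-architecture body of such a batched kernel from a single source.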


Author(s):  
Reiji Suda ◽  
Takayuki Aoki ◽  
Shoichi Hirasawa ◽  
Akira Nukada ◽  
Hiroki Honda ◽  
...  

Graphics accelerators are increasingly used for general-purpose high-performance computing applications, as they provide a low-cost solution to high-performance computing requirements. Intel has also released a performance accelerator that offers a similar solution. However, existing application software needs to be restructured to suit the accelerator paradigm, using a suitable software architecture pattern. In the present work, a master-slave architecture is employed to port grid-free CFD Euler solvers to CUDA for GPGPU computing. The performance obtained using the master-slave architecture for GPGPU computing is compared with that of sequential computing.
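A minimal CPU-side sketch of the master-slave pattern described above (the partitioning and the worker function are illustrative placeholders, not the authors' Euler solver): the master partitions the grid points among workers, each worker computes a partial result over its chunk, and the master gathers and reduces the results.

```python
from concurrent.futures import ThreadPoolExecutor

def worker(chunk):
    # Slave: compute a partial result over its assigned grid points.
    # (A real solver would evaluate fluxes/residuals here.)
    return sum(v * v for v in chunk)

def master(grid_values, n_workers=4):
    # Master: partition the domain, dispatch chunks, gather partial results.
    size = (len(grid_values) + n_workers - 1) // n_workers
    chunks = [grid_values[i:i + size] for i in range(0, len(grid_values), size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = pool.map(worker, chunks)
    return sum(partials)

print(master(list(range(10))))  # sum of squares 0..9 = 285
```

In the GPGPU setting, the "slaves" are CUDA thread blocks and the "master" is the host code that launches kernels and reduces their outputs.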


Author(s):  
Mayank Bhura ◽  
Pranav H. Deshpande ◽  
K. Chandrasekaran

Usage of General-Purpose Graphics Processing Units (GPGPUs) in high-performance computing is increasing as heterogeneous systems continue to become dominant. CUDA has been the programming environment for nearly all such NVIDIA GPU-based GPGPU applications. However, the framework runs only on NVIDIA GPUs; utilizing other available computing devices requires reimplementation in other frameworks. OpenCL provides a vendor-neutral, open programming environment with implementations available for CPUs, GPUs, and other types of accelerators, and can thus be regarded as a write-once, run-anywhere framework. Even so, both frameworks have their own pros and cons. This chapter presents a performance comparison of the CUDA and OpenCL frameworks, using an algorithm that finds the sum of all possible triple products over a list of integers, implemented on GPUs.
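The benchmark algorithm named above can be pinned down with a short host-side reference (Python here; the chapter's implementations are GPU kernels). Assuming "all possible triple products" means products over distinct index triples i < j < k, the brute-force sum equals the elementary symmetric polynomial e3, which also admits a closed form in power sums:

```python
from itertools import combinations

def triple_product_sum(nums):
    # Brute force: sum a[i]*a[j]*a[k] over all index triples i < j < k.
    return sum(a * b * c for a, b, c in combinations(nums, 3))

def triple_product_sum_fast(nums):
    # Closed form via power sums: e3 = (s1^3 - 3*s1*s2 + 2*s3) / 6.
    s1 = sum(nums)
    s2 = sum(v * v for v in nums)
    s3 = sum(v ** 3 for v in nums)
    return (s1 ** 3 - 3 * s1 * s2 + 2 * s3) // 6

vals = [1, 2, 3, 4]
# e3 = 1*2*3 + 1*2*4 + 1*3*4 + 2*3*4 = 50
assert triple_product_sum(vals) == triple_product_sum_fast(vals) == 50
```

The brute-force form maps naturally onto a GPU (one product per thread plus a reduction), which makes it a convenient kernel for a CUDA-versus-OpenCL comparison.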

