A General-Purpose AMG Linear Solver for High Performance Computing

Author(s):  
G. Isotton ◽  
M. Frigo ◽  
N. Spiezia ◽  
S. Koric ◽  
Q. Lu ◽  
...  
2021 ◽  
Vol 4 (3) ◽  
pp. 40
Author(s):  
Abdul Majeed

During the ongoing pandemic of the novel coronavirus disease 2019 (COVID-19), the latest technologies such as artificial intelligence (AI), blockchain, learning paradigms (machine, deep, smart, few-shot, extreme learning, etc.), high-performance computing (HPC), Internet of Medical Things (IoMT), and Industry 4.0 have played a vital role. These technologies helped to contain the disease's spread by predicting contaminated people/places, as well as forecasting future trends. In this article, we provide insights into the applications of machine learning (ML) and high-performance computing (HPC) in the era of COVID-19. We discuss the person-specific data that are being collected to lower the COVID-19 spread and highlight the remarkable opportunities they provide for knowledge extraction leveraging low-cost ML and HPC techniques. We demonstrate the role of ML and HPC in the context of the COVID-19 era with successful implementations or propositions in three contexts: (i) ML and HPC use in the data life cycle, (ii) ML and HPC use in analytics on COVID-19 data, and (iii) general-purpose applications of both techniques in COVID-19's arena. In addition, we discuss the privacy and security issues and the architecture of a prototype system to demonstrate the proposed research. Finally, we discuss the challenges of the available data and highlight the issues that hinder the applicability of ML and HPC solutions to them.


Author(s):  
Dorian Krause ◽  
Philipp Thörnig

JURECA is a petaflop-scale, general-purpose supercomputer operated by Jülich Supercomputing Centre at Forschungszentrum Jülich. Utilizing a flexible cluster architecture based on T-Platforms V-Class blades and a balanced selection of best-in-class components, the system supports a wide variety of high-performance computing and data analytics workloads and offers a low entrance barrier for new users.


2021 ◽  
Vol 43 (5) ◽  
pp. C335-C357
Author(s):  
Giovanni Isotton ◽  
Matteo Frigo ◽  
Nicolò Spiezia ◽  
Carlo Janna

2013 ◽  
Vol 378 ◽  
pp. 534-538
Author(s):  
Fan Zhang ◽  
Xing Guo Luo ◽  
Xing Ming Zhang

In this paper, a Reconfigurable Multi-Processor Architecture (RCMPA) is designed. Through parallel execution across multiple processors and flexible system configuration, the system can adapt to a variety of applications. Each computing component in the system consists of a general-purpose microprocessor, a reconfigurable FPGA, and SRAMs. The general-purpose microprocessor handles control of a variety of tasks, scheduling, and some computing functions. The FPGA offers sufficient flexibility, extensibility, and high-speed interconnect features. The SRAMs offer a range of storage structures with high-speed read/write access and high-density storage units.


2021 ◽  
Vol 18 (3) ◽  
pp. 1-26
Author(s):  
Daniel Thuerck ◽  
Nicolas Weber ◽  
Roberto Bifulco

A large portion of the recent performance increase in the High Performance Computing (HPC) and Machine Learning (ML) domains is fueled by accelerator cards. Many popular ML frameworks support accelerators by organizing computations as a computational graph over a set of highly optimized, batched general-purpose kernels. While this approach simplifies the kernels’ implementation for each individual accelerator, the increasing heterogeneity among accelerator architectures for HPC complicates the creation of portable and extensible libraries of such kernels. Therefore, using a generalization of the CUDA community’s warp register cache programming idiom, we propose a new programming idiom (CoRe) and a virtual architecture model (PIRCH), abstracting over SIMD and SIMT paradigms. We define and automate the mapping process from a single source to PIRCH’s intermediate representation and develop backends that issue code for three different architectures: Intel AVX512, NVIDIA GPUs, and NEC SX-Aurora. Code generated by our source-to-source compiler for batched kernels, borG, competes favorably with vendor-tuned libraries and is up to 2× faster than hand-tuned kernels across architectures.
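The batched-kernel organization described above can be illustrated with a small CPU-side sketch (NumPy here purely for illustration; borG itself emits code for AVX512, NVIDIA GPUs, and NEC SX-Aurora). The idea is that one batched call applies the same small operation to many independent problem instances, rather than launching one kernel per instance:

```python
import numpy as np

# A "batched" kernel applies one small operation to many independent
# problem instances at once -- here, 1000 independent 8x8 mat-vec products.
rng = np.random.default_rng(0)
batch, n = 1000, 8
A = rng.standard_normal((batch, n, n))
x = rng.standard_normal((batch, n))

# One batched call replaces 1000 separate per-instance invocations.
y = np.einsum('bij,bj->bi', A, x)

# Reference: the same result computed one instance at a time.
y_ref = np.stack([A[b] @ x[b] for b in range(batch)])
assert np.allclose(y, y_ref)
```

The performance question the paper addresses is precisely how to generate the per-architecture body of such a batched kernel from a single source.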


Author(s):  
Reiji Suda ◽  
Takayuki Aoki ◽  
Shoichi Hirasawa ◽  
Akira Nukada ◽  
Hiroki Honda ◽  
...  

Graphics accelerators are increasingly used for general-purpose high-performance computing applications, as they provide a low-cost solution to high-performance computing requirements. Intel has also released a performance accelerator that offers a similar solution. However, existing application software needs to be restructured to suit the accelerator paradigm, using a suitable software architecture pattern. In the present work, a master-slave architecture is employed to port grid-free CFD Euler solvers to CUDA for GPGPU computing. The performance obtained using the master-slave architecture for GPGPU computing is compared with that of sequential computing.
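A minimal CPU-side sketch of the master-slave pattern described above (the partitioning and the worker function are illustrative placeholders, not the authors' Euler solver): the master partitions the grid points among workers, each worker computes a partial result over its chunk, and the master gathers and reduces the results.

```python
from concurrent.futures import ThreadPoolExecutor

def worker(chunk):
    # Slave: compute a partial result over its assigned grid points.
    # (A real solver would evaluate fluxes/residuals here.)
    return sum(v * v for v in chunk)

def master(grid_values, n_workers=4):
    # Master: partition the domain, dispatch chunks, gather partial results.
    size = (len(grid_values) + n_workers - 1) // n_workers
    chunks = [grid_values[i:i + size] for i in range(0, len(grid_values), size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = pool.map(worker, chunks)
    return sum(partials)

print(master(list(range(10))))  # sum of squares 0..9 = 285
```

In the GPGPU setting, the "slaves" are CUDA thread blocks and the "master" is the host code that launches kernels and reduces their outputs.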


Author(s):  
Mayank Bhura ◽  
Pranav H. Deshpande ◽  
K. Chandrasekaran

Usage of General-Purpose Graphics Processing Units (GPGPUs) in high-performance computing is increasing as heterogeneous systems continue to become dominant. CUDA has been the programming environment for nearly all such NVIDIA GPU-based GPGPU applications. However, the framework runs only on NVIDIA GPUs; utilizing other available computing devices requires reimplementation in other frameworks. OpenCL provides a vendor-neutral, open programming environment with implementations available for CPUs, GPUs, and other types of accelerators, and can thus be regarded as a write-once, run-anywhere framework. Even so, both frameworks have their own pros and cons. This chapter presents a performance comparison of the CUDA and OpenCL frameworks, using an algorithm that finds the sum of all possible triple products over a list of integers, implemented on GPUs.
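The benchmark algorithm named above can be pinned down with a short host-side reference (Python here; the chapter's implementations are GPU kernels). Assuming "all possible triple products" means products over distinct index triples i < j < k, the brute-force sum equals the elementary symmetric polynomial e3, which also admits a closed form in power sums:

```python
from itertools import combinations

def triple_product_sum(nums):
    # Brute force: sum a[i]*a[j]*a[k] over all index triples i < j < k.
    return sum(a * b * c for a, b, c in combinations(nums, 3))

def triple_product_sum_fast(nums):
    # Closed form via power sums: e3 = (s1^3 - 3*s1*s2 + 2*s3) / 6.
    s1 = sum(nums)
    s2 = sum(v * v for v in nums)
    s3 = sum(v ** 3 for v in nums)
    return (s1 ** 3 - 3 * s1 * s2 + 2 * s3) // 6

vals = [1, 2, 3, 4]
# e3 = 1*2*3 + 1*2*4 + 1*3*4 + 2*3*4 = 50
assert triple_product_sum(vals) == triple_product_sum_fast(vals) == 50
```

The brute-force form maps naturally onto a GPU (one product per thread plus a reduction), which makes it a convenient kernel for a CUDA-versus-OpenCL comparison.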

