sequential code
Recently Published Documents


TOTAL DOCUMENTS: 49 (five years: 2)

H-INDEX: 8 (five years: 0)

2021 ◽  
Vol 5 (OOPSLA) ◽  
pp. 1-30
Author(s):  
Kevin De Porre ◽  
Carla Ferreira ◽  
Nuno Preguiça ◽  
Elisa Gonzalez Boix

To ease the development of geo-distributed applications, replicated data types (RDTs) offer a familiar programming interface while ensuring state convergence, low latency, and high availability. However, RDTs are still designed exclusively by experts using ad-hoc solutions that are error-prone and result in brittle systems. Recent works statically detect conflicting operations on existing data types and coordinate those at runtime to guarantee convergence and preserve application invariants. However, these approaches are too conservative, imposing coordination on a large number of operations. In this work, we propose a principled approach to designing and implementing efficient RDTs that takes application invariants into account. Developers extend sequential data types with a distributed specification, which together form an RDT. We statically analyze the specification to detect conflicts and unravel their cause. This information is then used at runtime to serialize concurrent operations safely and efficiently. Our approach derives a correct RDT from any sequential data type without changes to the data type's implementation and with minimal coordination. We implement our approach in Scala and develop an extensive portfolio of RDTs. The evaluation shows that our approach provides performance similar to conflict-free replicated data types for commutative operations, and considerably improves the performance of non-commutative operations compared to existing solutions.
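The core idea above, that only non-commuting operations need coordination, can be sketched briefly. This is a hypothetical illustration in Python, not the authors' Scala implementation: the paper detects conflicts statically from a specification, whereas this sketch checks commutativity dynamically on one concrete state, and all names here (`commute`, the operations) are mine.

```python
from copy import deepcopy

def commute(state, op_a, op_b):
    """True if op_a and op_b produce the same state in either order.
    Such pairs can run coordination-free; conflicting pairs would be
    serialized by the runtime."""
    s1 = deepcopy(state)
    op_a(s1); op_b(s1)
    s2 = deepcopy(state)
    op_b(s2); op_a(s2)
    return s1 == s2

# Operations on a sequential set, used unchanged.
def add3(s): s.add(3)
def rem3(s): s.discard(3)
def add7(s): s.add(7)

print(commute({1, 2}, add3, add7))  # True: adds of distinct elements commute
print(commute({1, 2}, add3, rem3))  # False: add/remove of the same element conflict
```

Starting from {1, 2}, add-then-remove of 3 returns to {1, 2}, while remove-then-add leaves {1, 2, 3}, which is exactly the kind of order-dependence that forces coordination.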


2021 ◽  
Vol 40 ◽  
pp. 02005
Author(s):  
Ashish A. Jadhav ◽  
Abhijeet D. Kalamkar ◽  
Pritish A. Gaikwad ◽  
Vishwesh Vyawahare ◽  
Navin Singhaniya

This paper deals with GPU computing of special mathematical functions used in Fractional Calculus. The graphics processing unit (GPU) has become an integral part of today's mainstream computing systems, and the special mathematical functions are an integral part of Fractional Calculus. We propose a novel parallel approach for computing these functions, using NVIDIA GPU hardware to speed up the parallel algorithm. A comparison of the sequential, vectorized, and GPU implementations is performed, showing that the parallel computing capabilities of the GPU substantially reduce the computation time of the special mathematical functions.
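The abstract does not name the functions involved, so as an assumed representative example, the Mittag-Leffler function, a standard special function of fractional calculus, gives a sense of the workload. The sequential reference sketch below is mine, not the authors' code; each series term is independent of the others, which is what makes the computation vectorize and map well to a GPU.

```python
from math import gamma

def mittag_leffler(z, alpha, terms=50):
    """Truncated series for the one-parameter Mittag-Leffler function
    E_alpha(z) = sum_{k>=0} z**k / Gamma(alpha*k + 1).
    Sequential reference code: the terms are mutually independent,
    so the sum parallelizes naturally across GPU threads."""
    return sum(z**k / gamma(alpha * k + 1) for k in range(terms))

# Sanity checks against known reductions:
# E_1(z) = exp(z) and E_2(z) = cosh(sqrt(z)).
print(abs(mittag_leffler(1.0, 1.0) - 2.718281828459045) < 1e-9)  # True
print(abs(mittag_leffler(1.0, 2.0) - 1.5430806348152437) < 1e-9)  # True
```

A vectorized version would evaluate all `terms` exponents and gamma values as arrays in one pass; the GPU version assigns term ranges (or distinct evaluation points) to threads and reduces the partial sums.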


2020 ◽  
Vol 15 (177) ◽  
pp. 5-10
Author(s):  
L. M. BORGES ◽  
D. M. TAVARES ◽  
S. J. BACHEGA

2020 ◽  
Vol 48 (4) ◽  
pp. 583-602
Author(s):  
Christopher Brown ◽  
Vladimir Janjic ◽  
M. Goli ◽  
J. McCall

Abstract This paper presents a new technique for introducing and tuning parallelism on heterogeneous shared-memory systems (comprising a mixture of CPUs and GPUs), using a combination of algorithmic skeletons (such as farms and pipelines), Monte Carlo tree search (MCTS) for deriving mappings of tasks to the available hardware resources, and refactoring tool support for applying the patterns and mappings easily and effectively. Using our approach, we demonstrate easily obtainable, significant, and scalable speedups on a number of case studies, of up to 41x over the sequential code on a 24-core machine with one GPU. We also demonstrate that the speedups obtained from mappings derived by the MCTS algorithm are within 5-15% of the best manual parallelisation.
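A farm is the simplest of the skeletons mentioned above: one worker function replicated over a pool, with independent tasks dispatched to whichever replica is free. The sketch below is a minimal Python illustration of the pattern (the names `farm` and `square` are mine; the authors' tool targets C++ and introduces the pattern by refactoring). Threads suffice for the sketch; a CPU-bound farm would use processes or native threads.

```python
from concurrent.futures import ThreadPoolExecutor

def farm(worker, tasks, workers=4):
    """Farm skeleton: replicate `worker` across a pool, dispatch each
    task independently, and gather results in task order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(worker, tasks))

def square(x):
    return x * x

print(farm(square, range(6)))  # [0, 1, 4, 9, 16, 25]
```

The tuning problem the MCTS search addresses is then choosing, for each skeleton instance, how many workers to spawn and on which devices (CPU cores or GPU) they run.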


Author(s):  
Masahiro Nakao ◽  
Tetsuya Odajima ◽  
Hitoshi Murai ◽  
Akihiro Tabuchi ◽  
Norihisa Fujita ◽  
...  

Accelerated clusters, which are cluster systems equipped with accelerators, are among the most common systems in parallel computing. To exploit the performance of such systems, it is important to reduce the communication latency between accelerator memories. There is also a need for a programming language that facilitates maintaining high performance on such systems. The goal of the present article is to evaluate XcalableACC (XACC), a parallel programming language, with tightly coupled accelerators/InfiniBand (TCA/IB) hybrid communication on an accelerated cluster. TCA/IB hybrid communication combines the low latency of TCA with the high bandwidth of IB. The XACC language, which is a directive-based language for accelerated clusters, enables programmers to use TCA/IB hybrid communication with ease. To evaluate the performance of XACC with TCA/IB hybrid communication, we implemented a lattice quantum chromodynamics (LQCD) mini-application and evaluated it on our accelerated cluster using up to 64 compute nodes. For comparison with XACC, we also implemented the LQCD mini-application using a combination of CUDA and MPI (CUDA + MPI) and of OpenACC and MPI (OpenACC + MPI). Performance evaluation revealed that XACC with TCA/IB hybrid communication performs 9% better than CUDA + MPI and 18% better than OpenACC + MPI. Furthermore, the performance of XACC increased by a further 7% through a new extension to XACC. Productivity evaluation revealed that XACC requires far fewer changes to the serial LQCD code to implement the parallel LQCD code than CUDA + MPI and OpenACC + MPI. Moreover, since XACC can perform parallelization while maintaining the sequential code image, it is highly readable and shows excellent portability due to its directive-based approach.
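The rationale for hybrid communication can be captured with a toy cost model: a low-latency path wins for small messages, a high-bandwidth path for large ones, so the runtime picks per message. The sketch below is purely illustrative; the latency and bandwidth figures are invented for the example, not measured values from the article, and `pick_path` is not part of XACC.

```python
def pick_path(size_bytes,
              tca_latency=1e-6, tca_bandwidth=2e9,   # low latency, modest bandwidth
              ib_latency=5e-6, ib_bandwidth=12e9):   # higher latency, high bandwidth
    """Estimate transfer time as latency + size/bandwidth on each path
    and return the cheaper one, mimicking a hybrid TCA/IB selection."""
    t_tca = tca_latency + size_bytes / tca_bandwidth
    t_ib = ib_latency + size_bytes / ib_bandwidth
    return "tca" if t_tca <= t_ib else "ib"

print(pick_path(1024))        # tca: small halo-exchange messages favor low latency
print(pick_path(10_000_000))  # ib: bulk transfers favor high bandwidth
```

With these illustrative numbers the crossover sits near 10 KB; in an LQCD halo exchange, the small boundary messages would take the TCA-like path while bulk data moves over IB.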


Author(s):  
Khalid Alsubhi ◽  
Fawaz Alsolami ◽  
Abdullah Algarni ◽  
Kamal Jambi ◽  
Fathy Eassa ◽  
...  

Author(s):  
Afif J. Almghawish ◽  
Ayman M. Abdalla ◽  
Ahmad B. Marzouq ◽  
...  
