OpenCL Performance Evaluation on Modern Multicore CPUs

Scientific Programming ◽

10.1155/2015/859491 ◽

2015 ◽

Vol 2015 ◽

pp. 1-20 ◽

Cited By ~ 3

Author(s):

Joo Hwan Lee ◽

Nimit Nigania ◽

Hyesoon Kim ◽

Kaushik Patel ◽

Hyojong Kim

Keyword(s):

Parallel Programming ◽

Programming Model ◽

Data Locality ◽

Instruction Level Parallelism ◽

Performance Variation ◽

Parallel Programming Model ◽

Data Location ◽

Space Data ◽

Level Parallelism ◽

Multicore CPUs

Utilizing heterogeneous platforms for computation has become a general trend, making the portability issue important. OpenCL (Open Computing Language) serves this purpose by enabling portable execution on heterogeneous architectures. However, unpredictable performance variation on different platforms has become a burden for programmers who write OpenCL applications. This is especially true for conventional multicore CPUs, since the performance of general OpenCL applications on CPUs lags behind the performance of their counterparts written in the conventional parallel programming model for CPUs. In this paper, we evaluate the performance of OpenCL applications on out-of-order multicore CPUs from the architectural perspective. We evaluate OpenCL applications on various aspects, including API overhead, scheduling overhead, instruction-level parallelism, address space, data location, data locality, and vectorization, comparing OpenCL to conventional parallel programming models for CPUs. Our evaluation indicates unique performance characteristics of OpenCL applications and also provides insight into the optimization metrics for better performance on CPUs.

MapReduce Parallel Programming Model: A State-of-the-Art Survey

International Journal of Parallel Programming ◽

10.1007/s10766-015-0395-0 ◽

2015 ◽

Vol 44 (4) ◽

pp. 832-866 ◽

Cited By ~ 24

Author(s):

Ren Li ◽

Haibo Hu ◽

Heng Li ◽

Yunsong Wu ◽

Jianxi Yang

Keyword(s):

Parallel Programming ◽

Programming Model ◽

State Of The Art ◽

Parallel Programming Model

Parallel programming model for the Epiphany many-core coprocessor using threaded MPI

Microprocessors and Microsystems ◽

10.1016/j.micpro.2016.02.006 ◽

2016 ◽

Vol 43 ◽

pp. 95-103 ◽

Cited By ~ 5

Author(s):

James A. Ross ◽

David A. Richie ◽

Song J. Park ◽

Dale R. Shires

Keyword(s):

Parallel Programming ◽

Programming Model ◽

Parallel Programming Model ◽

Many Core

2D-FMFI SAR application on HPC architectures with OmpSs parallel programming model

2012 NASA/ESA Conference on Adaptive Hardware and Systems (AHS) ◽

10.1109/ahs.2012.6268638 ◽

2012 ◽

Author(s):

Fisnik Kraja ◽

Arndt Bode ◽

Xavier Martorell

Keyword(s):

Parallel Programming ◽

Programming Model ◽

Parallel Programming Model

Interaction with the User in the SAPFOR System

Russian Digital Libraries Journal ◽

10.26907/1562-5419-2021-24-1-157-183 ◽

2021 ◽

Vol 24 (1) ◽

pp. 157-183

Author(s):

Никита Андреевич Катаев

Keyword(s):

Parallel Programming ◽

Program Transformation ◽

Heterogeneous Computing ◽

Programming Model ◽

Parallel Programs ◽

Parallel Program ◽

Program Parallelization ◽

Parallel Programming Model ◽

The One ◽

High Level

Automation of parallel programming is important at every stage of parallel program development. These stages include profiling the original program, transforming it so that higher performance can be achieved after parallelization, and, finally, constructing and optimizing the parallel program. It is also important to choose a suitable parallel programming model to express the parallelism available in a program. On the one hand, the parallel programming model should be capable of mapping the parallel program to a variety of existing hardware resources. On the other hand, it should simplify the development of assistant tools and allow the user to explore the parallel program that those tools generate in a semi-automatic way. The SAPFOR (System FOR Automated Parallelization) system combines various approaches to the automation of parallel programming. Moreover, it allows the user to guide the parallelization if necessary. SAPFOR produces parallel programs according to the high-level DVMH parallel programming model, which simplifies the development of efficient parallel programs for heterogeneous computing clusters. This paper focuses on the approach to semi-automatic parallel programming that SAPFOR implements. We discuss the architecture of the system and present the interactive subsystem, which is used to guide SAPFOR through program parallelization. We used the interactive subsystem to parallelize programs from the NAS Parallel Benchmarks in a semi-automatic way. Finally, we compare the performance of manually written parallel programs with that of the programs SAPFOR builds.

Actors as a parallel programming model

STACS 91 - Lecture Notes in Computer Science ◽

10.1007/bfb0020798 ◽

2005 ◽

pp. 184-195 ◽

Cited By ~ 5

Author(s):

Françoise Baude ◽

Guy Vidal-Naquet

Keyword(s):

Parallel Programming ◽

Programming Model ◽

Parallel Programming Model

A practical parallel programming model

Specification of Parallel Algorithms - DIMACS Series in Discrete Mathematics and Theoretical Computer Science ◽

10.1090/dimacs/018/11 ◽

1994 ◽

pp. 143-160 ◽

Cited By ~ 1

Author(s):

Lawrence Snyder

Keyword(s):

Parallel Programming ◽

Programming Model ◽

Parallel Programming Model

Toward An Architecture Independent High Level Parallel Programming Model For Artificial Intelligence

Parallel Processing for Artificial Intelligence - Machine Intelligence and Pattern Recognition ◽

10.1016/b978-0-444-81837-9.50009-9 ◽

1994 ◽

pp. 57-66

Author(s):

Mark S. Berlin

Keyword(s):

Artificial Intelligence ◽

Parallel Programming ◽

Programming Model ◽

Parallel Programming Model ◽

High Level

Executing linear algebra kernels in heterogeneous distributed infrastructures with PyCOMPSs

Oil & Gas Science and Technology – Revue d’IFP Energies nouvelles ◽

10.2516/ogst/2018047 ◽

2018 ◽

Vol 73 ◽

pp. 47 ◽

Cited By ~ 3

Author(s):

Ramon Amela ◽

Cristian Ramon-Cortes ◽

Jorge Ejarque ◽

Javier Conejero ◽

Rosa M. Badia

Keyword(s):

Programming Languages ◽

Linear Algebra ◽

Programming Model ◽

Xeon Phi ◽

Scientific Communities ◽

Heterogeneous Architectures ◽

Parallel Programming Model ◽

Significant Performance ◽

Thread Level Parallelism ◽

Level Parallelism

Python is a popular programming language owing to the simplicity of its syntax, and it still achieves good performance despite being an interpreted language. Its adoption by multiple scientific communities has led to the emergence of a large number of libraries and modules, which has helped put Python at the top of the list of programming languages [1]. Task-based programming has been proposed in recent years as an alternative parallel programming model. PyCOMPSs follows this approach for Python, and this paper presents its extensions to combine task-based parallelism and thread-level parallelism. We also present how PyCOMPSs has been adapted to support heterogeneous architectures, including Xeon Phi and GPUs. Results obtained with linear algebra benchmarks demonstrate that significant performance can be obtained with a few lines of Python.
