Efficient parallel programming on scalable shared memory systems with High Performance Fortran

Siegfried Benkner; Thomas Brandes

doi:10.1002/cpe.649

Concurrency and Computation: Practice and Experience ◽

10.1002/cpe.649 ◽

2002 ◽

Vol 14 (8-9) ◽

pp. 789-803 ◽

Cited By ~ 3

Author(s):

Siegfried Benkner ◽

Thomas Brandes

Keyword(s):

Parallel Programming ◽

Shared Memory ◽

High Performance ◽

Memory Systems ◽

High Performance Fortran

Invasive Computing on High Performance Shared Memory Systems

Facing the Multicore-Challenge III - Lecture Notes in Computer Science ◽

10.1007/978-3-642-35893-7_1 ◽

2013 ◽

pp. 1-12 ◽

Cited By ~ 4

Author(s):

Michael Bader ◽

Hans-Joachim Bungartz ◽

Martin Schreiber

Keyword(s):

Shared Memory ◽

High Performance ◽

Memory Systems

Workload Decomposition Strategies for Shared Memory Parallel Systems with OpenMP

Scientific Programming ◽

10.1155/2001/891073 ◽

2001 ◽

Vol 9 (2-3) ◽

pp. 109-122 ◽

Cited By ~ 2

Author(s):

Beniamino Di Martino ◽

Sergio Briguglio ◽

Gregorio Vlad ◽

Giuliana Fogaccia

Keyword(s):

Programming Languages ◽

Parallel Programming ◽

Shared Memory ◽

Memory Systems ◽

Particle In Cell ◽

Decomposition Strategies ◽

Programming Effort ◽

High Level ◽

Memory Architectures

A crucial issue in parallel programming (for both distributed and shared memory architectures) is work decomposition. The work decomposition task can be accomplished without large programming effort by using high-level parallel programming languages, such as OpenMP. Nevertheless, particular care must still be paid to achieving performance goals. In this paper we introduce and compare two decomposition strategies, in the framework of shared memory systems, as applied to a case-study particle-in-cell application. A number of different implementations of them, based on the OpenMP language, are discussed with regard to time efficiency, memory occupancy, and program restructuring effort.

Extending OpenMP for NUMA Machines

Scientific Programming ◽

10.1155/2000/464182 ◽

2000 ◽

Vol 8 (3) ◽

pp. 163-181 ◽

Cited By ~ 16

Author(s):

John Bircsak ◽

Peter Craig ◽

RaeLyn Crowell ◽

Zarka Cvetanovic ◽

Jonathan Harris ◽

...

Keyword(s):

Shared Memory ◽

High Performance ◽

Distributed Memory ◽

Parallel Programs ◽

Compiler Optimizations ◽

High Performance Fortran ◽

Efficient Code ◽

Memory Architectures ◽

Shared Memory Architectures ◽

Fast Access

This paper describes extensions to OpenMP that implement data placement features needed for NUMA architectures. OpenMP is a collection of compiler directives and library routines used to write portable parallel programs for shared-memory architectures. Writing efficient parallel programs for NUMA architectures, which have characteristics of both shared-memory and distributed-memory architectures, requires that a programmer control the placement of data in memory and the placement of computations that operate on that data. Optimal performance is obtained when computations occur on processors that have fast access to the data needed by those computations. OpenMP -- designed for shared-memory architectures -- does not by itself address these issues. The extensions to OpenMP Fortran presented here have been mainly taken from High Performance Fortran. The paper describes some of the techniques that the Compaq Fortran compiler uses to generate efficient code based on these extensions. It also describes some additional compiler optimizations, and concludes with some preliminary results.

PGHPF – An Optimizing High Performance Fortran Compiler for Distributed Memory Machines

Scientific Programming ◽

10.1155/1997/705102 ◽

1997 ◽

Vol 6 (1) ◽

pp. 29-40 ◽

Cited By ~ 9

Author(s):

Zeki Bozkus ◽

Larry Meadows ◽

Steven Nakamoto ◽

Vincent Schuster ◽

Mark Young

Keyword(s):

High Performance ◽

Distributed Memory ◽

Parallel Machines ◽

High Efficiency ◽

Memory Systems ◽

Production Quality ◽

Distributed Memory Machines ◽

High Performance Fortran ◽

Application Developers ◽

Efficient Software

High Performance Fortran (HPF) is the first widely supported, efficient, and portable parallel programming language for shared and distributed memory systems. HPF is realized through a set of directive-based extensions to Fortran 90. It enables application developers and Fortran end-users to write compact, portable, and efficient software that will compile and execute on workstations, shared memory servers, clusters, traditional supercomputers, or massively parallel processors. This article describes a production-quality HPF compiler for a set of parallel machines. Compilation techniques such as data and computation distribution, communication generation, run-time support, and optimization issues are elaborated as the basis for an HPF compiler implementation on distributed memory machines. The performance of this compiler on benchmark programs demonstrates that high efficiency can be achieved executing HPF code on parallel architectures.

High Performance Air Pollution Simulation on Shared Memory Systems

High Performance Scientific and Engineering Computing ◽

10.1007/978-1-4757-5402-5_17 ◽

2004 ◽

pp. 253-266

Author(s):

María J. Martín ◽

Marta Parada ◽

Ramón Doallo

Keyword(s):

Air Pollution ◽

Shared Memory ◽

High Performance ◽

Memory Systems

High Performance Tensor–Vector Multiplication on Shared-Memory Systems

Parallel Processing and Applied Mathematics - Lecture Notes in Computer Science ◽

10.1007/978-3-030-43229-4_4 ◽

2020 ◽

pp. 38-48

Author(s):

Filip Pawłowski ◽

Bora Uçar ◽

Albert-Jan Yzelman

Keyword(s):

Shared Memory ◽

High Performance ◽

Memory Systems

Using high performance Fortran for parallel programming

Computers & Mathematics with Applications ◽

10.1016/s0898-1221(98)00095-9 ◽

1998 ◽

Vol 35 (12) ◽

pp. 41-57 ◽

Cited By ~ 7

Author(s):

G. Sarma ◽

T. Zacharia ◽

D. Miles

Keyword(s):

Parallel Programming ◽

High Performance ◽

High Performance Fortran

Data parallel programming: The promises and limitations of high performance fortran

Parallel Computation - Lecture Notes in Computer Science ◽

10.1007/3-540-57314-3_10 ◽

1993 ◽

pp. 114-114

Author(s):

Piyush Mehrotra

Keyword(s):

Parallel Programming ◽

High Performance ◽

Data Parallel ◽

High Performance Fortran ◽

Data Parallel Programming

POSH: Paris OpenSHMEM A High-performance OpenSHMEM Implementation for Shared Memory Systems

Procedia Computer Science ◽

10.1016/j.procs.2014.05.226 ◽

2014 ◽

Vol 29 ◽

pp. 2422-2431 ◽

Cited By ~ 2

Author(s):

Camille Coti

Keyword(s):

Shared Memory ◽

High Performance ◽

Memory Systems

MulticoreBSP for C: A High-Performance Library for Shared-Memory Parallel Programming

International Journal of Parallel Programming ◽

10.1007/s10766-013-0262-9 ◽

2013 ◽

Vol 42 (4) ◽

pp. 619-642 ◽

Cited By ~ 11

Author(s):

A. N. Yzelman ◽

R. H. Bisseling ◽

D. Roose ◽

K. Meerbergen

Keyword(s):

Parallel Programming ◽

Shared Memory ◽

High Performance
