Efficient parallel programming on scalable shared memory systems with High Performance Fortran

2002 ◽  
Vol 14 (8-9) ◽  
pp. 789-803 ◽  
Author(s):  
Siegfried Benkner ◽  
Thomas Brandes


2001 ◽  
Vol 9 (2-3) ◽  
pp. 109-122 ◽  
Author(s):  
Beniamino Di Martino ◽  
Sergio Briguglio ◽  
Gregorio Vlad ◽  
Giuliana Fogaccia

A crucial issue in parallel programming (for both distributed and shared memory architectures) is work decomposition. Work decomposition can be accomplished without large programming effort by using high-level parallel programming languages such as OpenMP. Nevertheless, particular care must still be paid to achieving performance goals. In this paper we introduce and compare two decomposition strategies, in the framework of shared memory systems, as applied to a case-study particle-in-cell application. A number of different implementations of them, based on the OpenMP language, are discussed with regard to time efficiency, memory occupancy, and program restructuring effort.


2000 ◽  
Vol 8 (3) ◽  
pp. 163-181 ◽  
Author(s):  
John Bircsak ◽  
Peter Craig ◽  
RaeLyn Crowell ◽  
Zarka Cvetanovic ◽  
Jonathan Harris ◽  
...  

This paper describes extensions to OpenMP that implement data placement features needed for NUMA architectures. OpenMP is a collection of compiler directives and library routines used to write portable parallel programs for shared-memory architectures. Writing efficient parallel programs for NUMA architectures, which have characteristics of both shared-memory and distributed-memory architectures, requires that a programmer control the placement of data in memory and the placement of computations that operate on that data. Optimal performance is obtained when computations occur on processors that have fast access to the data needed by those computations. OpenMP -- designed for shared-memory architectures -- does not by itself address these issues. The extensions to OpenMP Fortran presented here have been mainly taken from High Performance Fortran. The paper describes some of the techniques that the Compaq Fortran compiler uses to generate efficient code based on these extensions. It also describes some additional compiler optimizations, and concludes with some preliminary results.


1997 ◽  
Vol 6 (1) ◽  
pp. 29-40 ◽  
Author(s):  
Zeki Bozkus ◽  
Larry Meadows ◽  
Steven Nakamoto ◽  
Vincent Schuster ◽  
Mark Young

High Performance Fortran (HPF) is the first widely supported, efficient, and portable parallel programming language for shared and distributed memory systems. HPF is realized through a set of directive-based extensions to Fortran 90. It enables application developers and Fortran end-users to write compact, portable, and efficient software that will compile and execute on workstations, shared memory servers, clusters, traditional supercomputers, or massively parallel processors. This article describes a production-quality HPF compiler for a set of parallel machines. Compilation techniques such as data and computation distribution, communication generation, run-time support, and optimization issues are elaborated as the basis for an HPF compiler implementation on distributed memory machines. The performance of this compiler on benchmark programs demonstrates that high efficiency can be achieved executing HPF code on parallel architectures.


2013 ◽  
Vol 42 (4) ◽  
pp. 619-642 ◽  
Author(s):  
A. N. Yzelman ◽  
R. H. Bisseling ◽  
D. Roose ◽  
K. Meerbergen
