A comparison of message passing and shared memory architectures for data parallel programs

A. C. Klaiber; H. M. Levy

doi:10.1145/192007.192020

Proceedings of the 21st International Symposium on Computer Architecture ◽

10.1109/isca.1994.288158 ◽

2002 ◽

Cited By ~ 7

Author(s):

A.C. Klaiber ◽

H.M. Levy

Keyword(s):

Shared Memory ◽

Message Passing ◽

Parallel Programs ◽

Data Parallel ◽

Memory Architectures ◽

Shared Memory Architectures

Extending OpenMP for NUMA Machines

Scientific Programming ◽

10.1155/2000/464182 ◽

2000 ◽

Vol 8 (3) ◽

pp. 163-181 ◽

Cited By ~ 16

Author(s):

John Bircsak ◽

Peter Craig ◽

RaeLyn Crowell ◽

Zarka Cvetanovic ◽

Jonathan Harris ◽

...

Keyword(s):

Shared Memory ◽

High Performance ◽

Distributed Memory ◽

Parallel Programs ◽

Compiler Optimizations ◽

High Performance Fortran ◽

Efficient Code ◽

Memory Architectures ◽

Shared Memory Architectures ◽

Fast Access

This paper describes extensions to OpenMP that implement data placement features needed for NUMA architectures. OpenMP is a collection of compiler directives and library routines used to write portable parallel programs for shared-memory architectures. Writing efficient parallel programs for NUMA architectures, which have characteristics of both shared-memory and distributed-memory architectures, requires that a programmer control the placement of data in memory and the placement of computations that operate on that data. Optimal performance is obtained when computations occur on processors that have fast access to the data needed by those computations. OpenMP -- designed for shared-memory architectures -- does not by itself address these issues. The extensions to OpenMP Fortran presented here have been mainly taken from High Performance Fortran. The paper describes some of the techniques that the Compaq Fortran compiler uses to generate efficient code based on these extensions. It also describes some additional compiler optimizations, and concludes with some preliminary results.
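The requirement that computations run where their data lives can be illustrated with a small sketch. It is written in Python, not OpenMP Fortran, and it is not the paper's directive syntax or the Compaq compiler's method: it assumes an HPF-style block distribution, and the helper names `block_owner` and `iterations_by_owner` are hypothetical.

```python
def block_owner(i, n, nprocs):
    """Owner of element i (0-based) when n elements are block-distributed.

    Assumption: HPF-style BLOCK mapping with block size ceil(n / nprocs).
    """
    block = -(-n // nprocs)  # integer ceiling of n / nprocs
    return i // block


def iterations_by_owner(n, nprocs):
    """Assign loop iteration i to the processor that owns element i."""
    work = {p: [] for p in range(nprocs)}
    for i in range(n):
        work[block_owner(i, n, nprocs)].append(i)
    return work
```

With 10 elements on 4 processors the block size is 3, so the last processor receives only iteration 9; each iteration then touches an element held in its own processor's fast memory.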

Parallel Array Classes and Lightweight Sharing Mechanisms

Scientific Programming ◽

10.1155/1993/393409 ◽

1993 ◽

Vol 2 (4) ◽

pp. 203-216

Author(s):

Steve W. Otto

Keyword(s):

Finite Element Method ◽

Shared Memory ◽

Message Passing ◽

Distributed Memory ◽

Programming Model ◽

Memory Usage ◽

Particle In Cell ◽

Parallel Array ◽

Memory Architectures ◽

Shared Memory Architectures

We discuss a set of parallel array classes, MetaMP, for distributed-memory architectures. The classes are implemented in C++ and interface to the PVM or Intel NX message-passing systems. An array class implements a partitioned array as a set of objects distributed across the nodes – a "collective" object. Object methods hide the low-level message-passing and implement meaningful array operations. These include transparent guard strips (or sharing regions) that support finite-difference stencils, reductions and multibroadcasts for support of pivoting and row operations, and interpolation/contraction operations for support of multigrid algorithms. The concept of guard strips is generalized to an object implementation of lightweight sharing mechanisms for finite element method (FEM) and particle-in-cell (PIC) algorithms. The sharing is accomplished through the mechanism of weak memory coherence and can be efficiently implemented. The price of the efficient implementation is memory usage and the need to explicitly specify the coherence operations. An intriguing feature of this programming model is that it maps well to both distributed-memory and shared-memory architectures.
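The guard-strip idea described above can be illustrated with a minimal sketch. It is written in Python rather than the C++ of MetaMP, simulates the nodes as list entries in a single process, and makes no PVM or Intel NX calls. The names `partition`, `update_guards`, `smooth`, and `gather` are hypothetical and are not MetaMP's API. It assumes a 1-D array, one guard cell per side, at least one element per node, and a fixed value of 0.0 beyond the global ends.

```python
def partition(data, nparts):
    """Split data into contiguous blocks, each padded with one guard cell per side."""
    n = len(data)
    bounds = [n * p // nparts for p in range(nparts + 1)]
    return [[0.0] + list(data[bounds[p]:bounds[p + 1]]) + [0.0]
            for p in range(nparts)]


def update_guards(blocks):
    """Explicit coherence operation: copy neighbours' edge values into guard cells."""
    for p, block in enumerate(blocks):
        if p > 0:
            block[0] = blocks[p - 1][-2]   # last interior cell of the left neighbour
        if p < len(blocks) - 1:
            block[-1] = blocks[p + 1][1]   # first interior cell of the right neighbour


def smooth(blocks):
    """3-point average per block, reading only local cells and guard cells."""
    result = []
    for block in blocks:
        interior = [(block[i - 1] + block[i] + block[i + 1]) / 3.0
                    for i in range(1, len(block) - 1)]
        result.append([block[0]] + interior + [block[-1]])
    return result


def gather(blocks):
    """Concatenate the interior cells of all blocks, dropping the guards."""
    return [x for block in blocks for x in block[1:-1]]
```

Calling `update_guards` before `smooth` reproduces the result of applying the stencil to the undivided array. Skipping it leaves stale guard values, which is the cost of having to specify coherence operations explicitly.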

Wait-Free Message Passing Protocol for Non-coherent Shared Memory Architectures

Recent Advances in the Message Passing Interface - Lecture Notes in Computer Science ◽

10.1007/978-3-642-33518-1_19 ◽

2012 ◽

pp. 142-152

Author(s):

Isaías A. Comprés Ureña ◽

Michael Gerndt ◽

Carsten Trinitis

Keyword(s):

Shared Memory ◽

Message Passing ◽

Memory Architectures ◽

Shared Memory Architectures

A Message-Passing Microcoded Synchronization for Distributed Shared Memory Architectures

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems ◽

10.1109/tcad.2018.2834423 ◽

2019 ◽

Vol 38 (5) ◽

pp. 975-979

Author(s):

Zois-Gerasimos Tasoulas ◽

Iraklis Anagnostopoulos ◽

Lazaros Papadopoulos ◽

Dimitrios Soudris

Keyword(s):

Shared Memory ◽

Message Passing ◽

Distributed Shared Memory ◽

Memory Architectures ◽

Shared Memory Architectures

HPF Library and Compiler Support for Halos in Data Parallel Irregular Computations

Parallel Processing Letters ◽

10.1142/s0129626400000196 ◽

2000 ◽

Vol 10 (02n03) ◽

pp. 189-200 ◽

Cited By ~ 1

Author(s):

Thomas Brandes

Keyword(s):

Message Passing ◽

High Performance ◽

Parallel Programs ◽

Address Space ◽

Compiler Support ◽

Data Parallel ◽

High Performance Fortran ◽

Non Local ◽

Performance Results ◽

Memory Architectures

On distributed-memory architectures, data parallel compilers emulate a global address space by distributing the data onto the processors according to the user's mapping directives and by automatically generating explicit inter-processor communication. A shadow is additional local memory allocated on a processor to hold the non-local values of the data that this processor accesses or defines. While shadow edges are already well studied for structured grids, this paper focuses on their use for applications with unstructured grids, where updates on the shadow edges involve unstructured communication with complex communication schedules. The use of shadow edges is considered for High Performance Fortran (HPF) as the de facto standard language for writing data parallel programs in Fortran. A library with an HPF binding provides explicit control of unstructured shadows and their communication schedules, also called halos. This halo library allows writing HPF programs whose performance is close to that of hand-coded message-passing versions, while freeing the user from calculating shadow sizes and communication schedules and from exchanging data with explicit message-passing commands. In certain situations, the HPF compiler can create and use halos automatically. This paper shows the advantages and also the limits of this approach. The halo library and automatic support for halos have been implemented within the ADAPTOR HPF compilation system. The performance results verify the effectiveness of the chosen approach.
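The relationship between an unstructured shadow and its communication schedule can be sketched as follows. This is an illustration in Python, not HPF, and it does not reproduce the halo library's interface or ADAPTOR's implementation. The function names, the `owner` array, and the edge-list representation of the grid are all assumptions, and the processors are simulated as dictionary entries in one process.

```python
from collections import defaultdict


def build_halos(owner, edges):
    """For each processor, the sorted non-local nodes that its owned nodes touch."""
    halos = defaultdict(set)
    for a, b in edges:
        if owner[a] != owner[b]:
            halos[owner[a]].add(b)
            halos[owner[b]].add(a)
    return {p: sorted(nodes) for p, nodes in halos.items()}


def build_schedule(owner, halos):
    """Communication schedule: (sender, receiver) -> node ids to transfer."""
    schedule = defaultdict(list)
    for receiver, nodes in halos.items():
        for node in nodes:
            schedule[(owner[node], receiver)].append(node)
    return dict(schedule)


def exchange(schedule, local_values):
    """Refresh halo copies; local_values[p] maps node id -> value held on p."""
    for (sender, receiver), nodes in schedule.items():
        for node in nodes:
            local_values[receiver][node] = local_values[sender][node]


def neighbour_sums(owner, edges, local_values):
    """Sum neighbour values per node, reading only the owner's local data."""
    sums = {node: 0.0 for node in range(len(owner))}
    for a, b in edges:
        sums[a] += local_values[owner[a]][b]
        sums[b] += local_values[owner[b]][a]
    return sums
```

The schedule depends only on the grid and the data mapping, so in this sketch it is built once and `exchange` can be called again whenever the owned values change.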

Realising a concurrent object-based programming model on parallel virtual shared memory architectures

Programming Models for Massively Parallel Computers ◽

10.1109/pmmpc.1995.504345 ◽

2002 ◽

Cited By ~ 1

Author(s):

M. Fisher ◽

J. Keane

Keyword(s):

Shared Memory ◽

Programming Model ◽

Object Based ◽

Virtual Shared Memory ◽

Memory Architectures ◽

Shared Memory Architectures

Analytic evaluation of shared-memory architectures

IEEE Transactions on Parallel and Distributed Systems ◽

10.1109/tpds.2003.1178880 ◽

2003 ◽

Vol 14 (2) ◽

pp. 166-180 ◽

Cited By ~ 6

Author(s):

D.J. Sorin ◽

J.L. Lemon ◽

D.L. Eager ◽

M.K. Vernon

Keyword(s):

Shared Memory ◽

Analytic Evaluation ◽

Memory Architectures ◽

Shared Memory Architectures

Adaptive software cache management for distributed shared memory architectures

[1990] Proceedings. The 17th Annual International Symposium on Computer Architecture ◽

10.1109/isca.1990.134515 ◽

2002 ◽

Cited By ~ 37

Author(s):

J.K. Bennett ◽

J.B. Carter ◽

W. Zwaenepoel

Keyword(s):

Shared Memory ◽

Distributed Shared Memory ◽

Cache Management ◽

Adaptive Software ◽

Memory Architectures ◽

Shared Memory Architectures ◽

Software Cache

Or-Parallel Prolog on Distributed Shared-Memory Architectures

Implementations of Logic Programming Systems ◽

10.1007/978-1-4615-2690-2_14 ◽

1994 ◽

pp. 203-215

Author(s):

Fernando M. A. Silva

Keyword(s):

Shared Memory ◽

Distributed Shared Memory ◽

Memory Architectures ◽

Shared Memory Architectures
