An Algebraic Machinery for Optimizing Data Motion for HPF

1997 · Vol 6 (3) · pp. 297-325
Author(s): Jan-Jan Wu, Marina C. Chen

This paper describes a general compiler optimization technique that reduces communication overhead for Fortran 90 (and High Performance Fortran) implementations on massively parallel machines.

1995 · Vol 4 (1) · pp. 1-21
Author(s): Matthew O'Keefe, Terence Parr, B. Kevin Edgar, Steve Anderson, Paul Woodward, ...

Massively parallel processors (MPPs) hold the promise of extremely high performance that, if realized, could be used to study problems of unprecedented size and complexity. One of the primary stumbling blocks to this promise has been the lack of tools to translate application codes to MPP form. In this article we show how application codes written in a subset of Fortran 77, called Fortran-P, can be translated to achieve good performance on several massively parallel machines. This subset can express codes that are self-similar, where the algorithm applied to the global data domain is also applied to each subdomain. We have found many codes that match the Fortran-P programming style and have converted them using our tools. We believe a self-similar coding style will accomplish what a vectorizable style has accomplished for vector machines by allowing the construction of robust, user-friendly, automatic translation systems that increase programmer productivity and generate fast, efficient code for MPPs.


1997 · Vol 6 (1) · pp. 59-72
Author(s): David R. O'Hallaron, Jon Webb, Jaspal Subhlok

Applications that get their inputs from sensors are an important and often overlooked application domain for High Performance Fortran (HPF). Such sensor-based applications typically perform regular operations on dense arrays, and often have latency and throughput requirements that can only be achieved with parallel machines. This article describes a study of sensor-based applications, including the fast Fourier transform, synthetic aperture radar imaging, narrowband tracking radar processing, multibaseline stereo imaging, and medical magnetic resonance imaging. The applications are written in a dialect of HPF developed at Carnegie Mellon, and are compiled by the Fx compiler for the Intel Paragon. The main results of the study are that (1) it is possible to realize good performance for realistic sensor-based applications written in HPF and (2) the performance of the applications is determined by the performance of three core operations: independent loops (i.e., loops with no dependences between iterations), reductions, and index permutations. The article discusses the implications for HPF implementations and introduces some simple tests that implementers and users can use to measure the efficiency of the loops, reductions, and index permutations generated by an HPF compiler.
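The three core operations identified in the study can be sketched in HPF-style Fortran. The array names and sizes below are illustrative assumptions for exposition, not code from the Fx benchmarks:

```fortran
! Illustrative sketches of the three core operation classes.
PROGRAM core_ops
  INTEGER, PARAMETER :: N = 1024
  REAL :: A(N), B(N), C(N,N), CT(N,N), S
  INTEGER :: I
!HPF$ DISTRIBUTE A(BLOCK)
!HPF$ ALIGN B(:) WITH A(:)
  B = 1.0
  C = 2.0
! 1. Independent loop: no dependences between iterations, so the
!    compiler may execute iterations in parallel without communication.
!HPF$ INDEPENDENT
  DO I = 1, N
     A(I) = B(I)**2
  END DO
! 2. Reduction: all elements combined into a single scalar, which
!    requires a global combine across processors.
  S = SUM(A)
! 3. Index permutation: data motion such as a transpose, typically the
!    dominant source of interprocessor communication.
  CT = TRANSPOSE(C)
END PROGRAM core_ops
```

Per the article, measuring how efficiently a compiler handles each of these three patterns predicts overall application performance.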


2000 · Vol 8 (1) · pp. 49-57
Author(s): Daniel S. Schaffer, Max J. Suárez

In the 1990s, computer manufacturers are increasingly turning to the development of parallel processor machines to meet the high performance needs of their customers. Simultaneously, atmospheric scientists studying weather and climate phenomena ranging from hurricanes to El Niño to global warming require increasingly fine resolution models. Here, the implementation of a parallel atmospheric general circulation model (GCM) that exploits the power of massively parallel machines is described. Using the horizontal data domain decomposition methodology, this FORTRAN 90 model is able to integrate a 0.6° longitude by 0.5° latitude problem at a rate of 19 Gigaflops on 512 processors of a Cray T3E 600, corresponding to 280 seconds of wall-clock time per simulated model day. At this resolution, the model has 64 times as many degrees of freedom and performs 400 times as many floating point operations per simulated day as the model it replaces.


1997 · Vol 6 (1) · pp. 29-40
Author(s): Zeki Bozkus, Larry Meadows, Steven Nakamoto, Vincent Schuster, Mark Young

High Performance Fortran (HPF) is the first widely supported, efficient, and portable parallel programming language for shared and distributed memory systems. HPF is realized through a set of directive-based extensions to Fortran 90. It enables application developers and Fortran end-users to write compact, portable, and efficient software that will compile and execute on workstations, shared memory servers, clusters, traditional supercomputers, or massively parallel processors. This article describes a production-quality HPF compiler for a set of parallel machines. Compilation techniques such as data and computation distribution, communication generation, run-time support, and optimization issues are elaborated as the basis for an HPF compiler implementation on distributed memory machines. The performance of this compiler on benchmark programs demonstrates that high efficiency can be achieved executing HPF code on parallel architectures.


1995 · Vol 4 (2) · pp. 87-113
Author(s): John Merlin, Anthony Hey

High Performance Fortran (HPF) is an informal standard for extensions to Fortran 90 to assist its implementation on parallel architectures, particularly for data-parallel computation. Among other things, it includes directives for specifying data distribution across multiple memories, and concurrent execution features. This article provides a tutorial introduction to the main features of HPF.
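As a flavor of the features the tutorial covers, here is a minimal sketch (the processor count, array names, and sizes are illustrative assumptions) combining distribution directives with data-parallel execution:

```fortran
PROGRAM hpf_flavor
  INTEGER, PARAMETER :: N = 1000
  REAL :: A(N), B(N)
  INTEGER :: I
! Arrange abstract processors and spread A across them in contiguous
! blocks; align B so corresponding elements are co-located.
!HPF$ PROCESSORS P(4)
!HPF$ DISTRIBUTE A(BLOCK) ONTO P
!HPF$ ALIGN B(:) WITH A(:)
! FORALL and whole-array assignment are data-parallel: each processor
! updates the elements it owns, with no communication needed here.
  FORALL (I = 1:N) A(I) = REAL(I)
  B = 2.0 * A
END PROGRAM hpf_flavor
```

Because the directives are comments to a standard Fortran 90 compiler, the same source runs unchanged on a serial machine.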


We introduce a physical analogy to describe problems and high-performance concurrent computers on which they are run. We show that the spatial characteristics of problems lead to their parallelism and review the lessons from use of the early hypercubes and a natural particle-process analogy. We generalize this picture to include the temporal structure of problems and show how this allows us to unify distributed, shared and hierarchical memories as well as SIMD (single instruction multiple data) architectures. We also show how neural network methods can be used to analyse a general formalism based on interacting strings and these lead to possible real-time schedulers and decomposers for massively parallel machines.


1992 · Vol 27 (7) · pp. 94-105
Author(s): Marina Chen, James Cowie

1994 · Vol 3 (3) · pp. 187-199
Author(s): Allan Knies, Matthew O'Keefe, Tom MacDonald

The recently released High Performance Fortran Forum (HPFF) proposal has stirred much interest in the high performance computing industry. HPFF's most important design goal is to create a language that has source code portability and that achieves high performance on single instruction multiple data (SIMD), distributed-memory multiple instruction multiple data (MIMD), and shared-memory MIMD architectures. The HPFF proposal brings to the forefront many questions about the design of portable and efficient languages for parallel machines. In this article, we discuss issues that need to be addressed before an efficient production quality compiler will be available for any such language. We examine some specific issues that are related to HPF's model of computation and analyze several implementation issues. We also provide some results from another data parallel compiler to help gain insight on some of the implementation issues that are relevant to HPF. Finally, we provide a summary of options currently available for application developers in industry.

