An Algebraic Machinery for Optimizing Data Motion for HPF

1997 · Vol 6 (3) · pp. 297-325
Author(s): Jan-Jan Wu, Marina C. Chen

This paper describes a general compiler optimization technique that reduces communication overhead for Fortran 90 (and High Performance Fortran) implementations on massively parallel machines.

1995 · Vol 4 (1) · pp. 1-21
Author(s): Matthew O'Keefe, Terence Parr, B. Kevin Edgar, Steve Anderson, Paul Woodward, ...

Massively parallel processors (MPPs) hold the promise of extremely high performance that, if realized, could be used to study problems of unprecedented size and complexity. One of the primary stumbling blocks to this promise has been the lack of tools to translate application codes to MPP form. In this article we show how application codes written in a subset of Fortran 77, called Fortran-P, can be translated to achieve good performance on several massively parallel machines. This subset can express codes that are self-similar, where the algorithm applied to the global data domain is also applied to each subdomain. We have found many codes that match the Fortran-P programming style and have converted them using our tools. We believe a self-similar coding style will accomplish what a vectorizable style has accomplished for vector machines by allowing the construction of robust, user-friendly, automatic translation systems that increase programmer productivity and generate fast, efficient code for MPPs.


1997 · Vol 6 (1) · pp. 59-72
Author(s): David R. O'Hallaron, Jon Webb, Jaspal Subhlok

Applications that get their inputs from sensors are an important and often overlooked application domain for High Performance Fortran (HPF). Such sensor-based applications typically perform regular operations on dense arrays, and often have latency and throughput requirements that can only be achieved with parallel machines. This article describes a study of sensor-based applications, including the fast Fourier transform, synthetic aperture radar imaging, narrowband tracking radar processing, multibaseline stereo imaging, and medical magnetic resonance imaging. The applications are written in a dialect of HPF developed at Carnegie Mellon, and are compiled by the Fx compiler for the Intel Paragon. The main results of the study are that (1) it is possible to realize good performance for realistic sensor-based applications written in HPF and (2) the performance of the applications is determined by the performance of three core operations: independent loops (i.e., loops with no dependences between iterations), reductions, and index permutations. The article discusses the implications for HPF implementations and introduces some simple tests that implementers and users can use to measure the efficiency of the loops, reductions, and index permutations generated by an HPF compiler.
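The three core operations identified in the study can be sketched in HPF-style Fortran. The array names and sizes below are illustrative assumptions for exposition, not code from the Fx benchmarks:

```fortran
! Illustrative sketches of the three core operation classes.
PROGRAM core_ops
  INTEGER, PARAMETER :: N = 1024
  REAL :: A(N), B(N), C(N,N), CT(N,N), S
  INTEGER :: I
!HPF$ DISTRIBUTE A(BLOCK)
!HPF$ ALIGN B(:) WITH A(:)
  B = 1.0
  C = 2.0
! 1. Independent loop: no dependences between iterations, so the
!    compiler may execute iterations in parallel without communication.
!HPF$ INDEPENDENT
  DO I = 1, N
     A(I) = B(I)**2
  END DO
! 2. Reduction: all elements combined into a single scalar, which
!    requires a global combine across processors.
  S = SUM(A)
! 3. Index permutation: data motion such as a transpose, typically the
!    dominant source of interprocessor communication.
  CT = TRANSPOSE(C)
END PROGRAM core_ops
```

Per the article, measuring how efficiently a compiler handles each of these three patterns predicts overall application performance.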


2000 · Vol 8 (1) · pp. 49-57
Author(s): Daniel S. Schaffer, Max J. Suárez

In the 1990s, computer manufacturers are increasingly turning to the development of parallel processor machines to meet the high performance needs of their customers. Simultaneously, atmospheric scientists studying weather and climate phenomena ranging from hurricanes to El Niño to global warming require increasingly fine resolution models. Here, the implementation of a parallel atmospheric general circulation model (GCM) that exploits the power of massively parallel machines is described. Using the horizontal data domain decomposition methodology, this FORTRAN 90 model is able to integrate a 0.6° longitude by 0.5° latitude problem at a rate of 19 Gigaflops on 512 processors of a Cray T3E 600, corresponding to 280 seconds of wall-clock time per simulated model day. At this resolution, the model has 64 times as many degrees of freedom and performs 400 times as many floating point operations per simulated day as the model it replaces.


1997 · Vol 6 (1) · pp. 29-40
Author(s): Zeki Bozkus, Larry Meadows, Steven Nakamoto, Vincent Schuster, Mark Young

High Performance Fortran (HPF) is the first widely supported, efficient, and portable parallel programming language for shared and distributed memory systems. HPF is realized through a set of directive-based extensions to Fortran 90. It enables application developers and Fortran end-users to write compact, portable, and efficient software that will compile and execute on workstations, shared memory servers, clusters, traditional supercomputers, or massively parallel processors. This article describes a production-quality HPF compiler for a set of parallel machines. Compilation techniques such as data and computation distribution, communication generation, run-time support, and optimization issues are elaborated as the basis for an HPF compiler implementation on distributed memory machines. The performance of this compiler on benchmark programs demonstrates that high efficiency can be achieved executing HPF code on parallel architectures.


1995 · Vol 4 (2) · pp. 87-113
Author(s): John Merlin, Anthony Hey

High Performance Fortran (HPF) is an informal standard for extensions to Fortran 90 to assist its implementation on parallel architectures, particularly for data-parallel computation. Among other things, it includes directives for specifying data distribution across multiple memories, and concurrent execution features. This article provides a tutorial introduction to the main features of HPF.
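As a flavor of the features the tutorial covers, here is a minimal sketch (the processor count, array names, and sizes are illustrative assumptions) combining distribution directives with data-parallel execution:

```fortran
PROGRAM hpf_flavor
  INTEGER, PARAMETER :: N = 1000
  REAL :: A(N), B(N)
  INTEGER :: I
! Arrange abstract processors and spread A across them in contiguous
! blocks; align B so corresponding elements are co-located.
!HPF$ PROCESSORS P(4)
!HPF$ DISTRIBUTE A(BLOCK) ONTO P
!HPF$ ALIGN B(:) WITH A(:)
! FORALL and whole-array assignment are data-parallel: each processor
! updates the elements it owns, with no communication needed here.
  FORALL (I = 1:N) A(I) = REAL(I)
  B = 2.0 * A
END PROGRAM hpf_flavor
```

Because the directives are comments to a standard Fortran 90 compiler, the same source runs unchanged on a serial machine.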


We introduce a physical analogy to describe problems and high-performance concurrent computers on which they are run. We show that the spatial characteristics of problems lead to their parallelism and review the lessons from use of the early hypercubes and a natural particle-process analogy. We generalize this picture to include the temporal structure of problems and show how this allows us to unify distributed, shared and hierarchical memories as well as SIMD (single instruction multiple data) architectures. We also show how neural network methods can be used to analyse a general formalism based on interacting strings and these lead to possible real-time schedulers and decomposers for massively parallel machines.


1992 · Vol 27 (7) · pp. 94-105
Author(s): Marina Chen, James Cowie

1994 · Vol 3 (3) · pp. 187-199
Author(s): Allan Knies, Matthew O'Keefe, Tom MacDonald

The recently released High Performance Fortran Forum (HPFF) proposal has stirred much interest in the high performance computing industry. HPFF's most important design goal is to create a language that has source code portability and that achieves high performance on single instruction multiple data (SIMD), distributed-memory multiple instruction multiple data (MIMD), and shared-memory MIMD architectures. The HPFF proposal brings to the forefront many questions about the design of portable and efficient languages for parallel machines. In this article, we discuss issues that need to be addressed before an efficient production quality compiler will be available for any such language. We examine some specific issues that are related to HPF's model of computation and analyze several implementation issues. We also provide some results from another data parallel compiler to help gain insight on some of the implementation issues that are relevant to HPF. Finally, we provide a summary of options currently available for application developers in industry.

