VLIW-in-the-large: a model for fine grain parallelism exploitation on distributed memory multiprocessors

Author(s):  
M. Danelutto ◽  
M. Vanneschi
2006 ◽  
Vol 16 (02) ◽  
pp. 209-228 ◽  
Author(s):  
CHRIS JESSHOPE

This paper analyses the micro-threaded model of concurrency making comparisons with both data and instruction-level concurrency. The model is fine grain and provides synchronisation in a distributed register file, making it a promising candidate for scalable chip-multiprocessors. The micro-threaded model was first proposed in 1996 as a means to tolerate high latencies in data-parallel, distributed-memory multi-processors. This paper explores the model's opportunity to provide the simultaneous issue of instructions, required for chip multiprocessors, and discusses the issues of scalability with regard to support structures implementing the model and communication in supporting it. The model supports deterministic distribution of code fragments and dynamic scheduling of instructions from within those fragments. The hardware also recognises different classes of variables from the register specifiers, which allows the hardware to manage locality and optimise communication so that it is both efficient and scalable.


2003 ◽  
Vol 11 (2) ◽  
pp. 95-104 ◽  
Author(s):  
C. Addison ◽  
Y. Ren ◽  
M. van Waveren

Dense linear algebra libraries need to cope efficiently with a range of input problem sizes and shapes. Inherently this means that parallel implementations have to exploit parallelism wherever it is present. While OpenMP allows relatively fine grain parallelism to be exploited in a shared memory environment, it currently lacks features to make it easy to partition computation over multiple array indices or to overlap sequential and parallel computations. The inherently flexible nature of shared memory paradigms such as OpenMP poses other difficulties when it becomes necessary to optimise performance across successive parallel library calls. Notions borrowed from distributed memory paradigms, such as explicit data distributions, help address some of these problems, but the focus on data rather than work distribution appears misplaced in an SMP context.


2001 ◽  
Vol 11 (01) ◽  
pp. 169-184 ◽  
Author(s):  
PRASAD KAKULAVARAPU ◽  
OLIVIER C. MAQUELIN ◽  
JOSÉ NELSON AMARAL ◽  
GUANG R. GAO

Designing multi-processor systems that deliver a reasonable price-performance ratio using off-the-shelf processor and compiler technologies is a major challenge. For an important class of applications, it is critical to explore fine-grain parallelism to achieve reasonable performance. In such parallel systems it is essential to efficiently manage communication latencies, bandwidth, and synchronization overheads. In this paper we study load balancing strategies for the runtime system of a multi-threaded system. EARTH (Efficient Architecture for Running Threads) is a multi-threaded programming and execution model that supports fine-grain, non-preemptive threads in a distributed memory environment. We describe the design and implementation of a set of dynamic load balancing algorithms, and study their performance in divide-and-conquer, regular, and irregular applications. Our experimental study on the distributed memory multi-processor IBM SP-2 indicates that a randomized load balancer performs as well as, and often better than, history-based load balancers.


Author(s):  
R. Sinclair ◽  
B.E. Jacobson

INTRODUCTION

The prospect of performing chemical analysis of thin specimens at any desired level of resolution is particularly appealing to the materials scientist. Commercial TEM-based systems are now available which virtually provide this capability. The purpose of this contribution is to illustrate its application to problems which would have been intractable until recently, pointing out some current limitations.

X-RAY ANALYSIS

In an attempt to fabricate superconducting materials with high critical currents and temperature, thin Nb3Sn films have been prepared by electron beam vapor deposition [1]. Fine-grain size material is desirable which may be achieved by codeposition with small amounts of Al2O3. Figure 1 shows the STEM microstructure, with large (∼ 200 Å dia) voids present at the grain boundaries. Higher quality TEM micrographs (e.g. fig. 2) reveal the presence of small voids within the grains which are absent in pure Nb3Sn prepared under identical conditions. The X-ray spectrum from large (∼ 1 μ dia) or small (∼ 100 Å dia) areas within the grains indicates only small amounts of Al (fig. 3).


Author(s):  
Harry Schaefer ◽  
Bruce Wetzel

High resolution 24mm X 36mm positive transparencies can be made from original black and white negatives produced by SEM, TEM, and photomicrography with ease, convenience, and little expense. The resulting 2in X 2in slides are superior to 3¼in X 4in lantern slides for storage, transport, and sturdiness, and projection equipment is more readily available. By mating a 35mm camera directly to an enlarger lens board (Fig. 1), one combines many advantages of both. The negative is positioned and illuminated with the enlarger and then focussed and photographed with the camera on a fine grain black and white film. Specifically, a Durst Laborator 138 S 5in by 7in enlarger with 240/200 condensers and a 500 watt Opale bulb (Ehrenreich Photo-Optical Industries, Inc., New York, NY) is rotated to the horizontal and adjusted for comfortable eye level viewing.

