An Overview on High Performance Issues of Parallel Architectures

2013 ◽  
Vol 1 (2) ◽  
pp. 11
Author(s):  
Koushik Chatterjee ◽  
Sumit Joshi
2000 ◽  
Vol 01 (02) ◽  
pp. 73-94
Author(s):  
A. FERREIRA ◽  
A. GOLDMAN ◽  
S. W. SONG

In most distributed-memory MIMD multiprocessors, processors are connected by a point-to-point interconnection network, usually modeled by a graph in which processors are nodes and communication links are edges. Since interprocessor communication frequently constitutes a serious bottleneck, several architectures have been proposed that enhance point-to-point topologies with multiple bus systems to improve communication efficiency. In this paper we study parallel architectures in which communication is carried out solely over buses. These architectures can use the power of bus technologies, providing a way to interconnect many more processors in a simple and efficient manner. We present the hyperpath, hypergrid, hyperring, and hypertorus architectures, which are the bus-based versions of the widely used point-to-point interconnection networks. Using (hyper)graph-theoretic concepts to model interprocessor communication in such networks, we give optimal algorithms for broadcasting a message from one processor to all the others. To derive high-performance communication patterns we developed a new tool called simplification. The idea is to construct a graph, called the representative graph, from the original hyper-topology in such a way that communication schemes that are easy to describe and perform on the representative graph also fit the hyper-topology. The simplification concept also allows us to partially reuse communication algorithms already known for ordinary networks.
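To make the bus-as-hyperedge model concrete, here is a minimal Python sketch, not the paper's algorithm. It assumes a simplified round model that the abstract does not specify: in each round, every bus that already has at least one informed processor delivers the message to all processors attached to it. The function name `broadcast_rounds` and the example topology are hypothetical.

```python
def broadcast_rounds(buses, source):
    """Count rounds needed to inform every processor, under an assumed model.

    buses  -- list of sets; each set holds the processors attached to one bus
              (one hyperedge of the hypergraph)
    source -- the processor that initially holds the message

    Assumed model (not taken from the paper): in one round, every bus with at
    least one informed processor informs all of its attached processors.
    """
    informed = {source}
    everyone = set().union(*buses) | {source}
    rounds = 0
    while informed != everyone:
        newly_informed = set()
        for bus in buses:
            if bus & informed:                  # some processor can write on this bus
                newly_informed |= bus - informed
        if not newly_informed:                  # no progress: hypergraph is disconnected
            raise ValueError("some processors are unreachable from the source")
        informed |= newly_informed
        rounds += 1
    return rounds


# A hypothetical path-like arrangement: three buses chained by shared processors.
chained_buses = [{0, 1, 2}, {2, 3, 4}, {4, 5, 6}]
rounds_from_end = broadcast_rounds(chained_buses, source=0)
rounds_from_middle = broadcast_rounds(chained_buses, source=3)
```

Under this model the round count equals the source's eccentricity in the hypergraph, so starting from a processor on the middle bus takes fewer rounds than starting from one end.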


1987 ◽  
Vol 14 (3-4) ◽  
pp. 16-32 ◽  
Author(s):  
Satish K Tripathi ◽  
Steve Kaisler ◽  
Sharat Chandran ◽  
Ashok K Agrawala

VLSI Design ◽  
2001 ◽  
Vol 12 (1) ◽  
pp. 1-12
Author(s):  
Jun Dong Cho ◽  
Jin Youn Cho

Placement of multiple dies on an MCM or high-performance VLSI substrate is a nontrivial task in which multiple criteria need to be considered simultaneously to obtain a true multi-objective optimization. Unfortunately, the exact physical attributes of a design are not known at the placement step; they become known only after the entire design process has been carried out. When performance issues are considered, crosstalk noise constraints in the form of net separation and via constraints become important. In this paper, for better performance and wirability estimation during placement for MCMs, several performance constraints are taken into account simultaneously. A graph-based wirability estimation along with a genetic placement optimization technique is proposed to minimize crosstalk, crossings, wirelength and the number of layers. Our work is significant since it is the first attempt at bringing crosstalk and other performance issues into the placement domain.
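As an illustration of how several placement objectives can be folded into one fitness value for a genetic optimizer, the following Python sketch scores a placement by Manhattan wirelength plus a penalty for crossing nets. It is a simplification under stated assumptions, not the authors' estimator: nets are two-pin, each net is drawn as a straight segment between die centers, crosstalk and layer count are left out, and the weights `w_len` and `w_cross` are hypothetical.

```python
def manhattan_wirelength(placement, nets):
    """Sum of Manhattan distances between the two endpoints of each net."""
    return sum(
        abs(placement[a][0] - placement[b][0]) + abs(placement[a][1] - placement[b][1])
        for a, b in nets
    )


def _orient(p, q, r):
    """Signed area test: >0 if r is left of the line p->q, <0 if right, 0 if collinear."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def _properly_cross(p1, p2, p3, p4):
    """True if segments p1-p2 and p3-p4 cross at a point interior to both."""
    return (_orient(p3, p4, p1) * _orient(p3, p4, p2) < 0
            and _orient(p1, p2, p3) * _orient(p1, p2, p4) < 0)


def crossing_count(placement, nets):
    """Number of net pairs whose straight-line segments properly cross."""
    count = 0
    for i in range(len(nets)):
        for j in range(i + 1, len(nets)):
            a, b = nets[i]
            c, d = nets[j]
            if _properly_cross(placement[a], placement[b], placement[c], placement[d]):
                count += 1
    return count


def fitness(placement, nets, w_len=1.0, w_cross=10.0):
    """Weighted-sum cost (lower is better); the weights are illustrative assumptions."""
    return (w_len * manhattan_wirelength(placement, nets)
            + w_cross * crossing_count(placement, nets))


# Two hypothetical placements of four dies connected by nets A-B and C-D.
nets = [("A", "B"), ("C", "D")]
crossed = {"A": (0, 0), "B": (2, 2), "C": (0, 2), "D": (2, 0)}
uncrossed = {"A": (0, 0), "B": (2, 0), "C": (0, 2), "D": (2, 2)}
cost_crossed = fitness(crossed, nets)      # wirelength 8, one crossing
cost_uncrossed = fitness(uncrossed, nets)  # wirelength 4, no crossings
```

A weighted sum is only one way to combine objectives; a genetic search could equally rank candidates by Pareto dominance, which avoids choosing weights up front.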


2008 ◽  
Vol 3 (1) ◽  
pp. 32-38
Author(s):  
Enric Musoll ◽  
Mario Nemirovsky

High-performance single-threaded processors achieve their performance goals partly by relying, among other architectural techniques, on speculation and large on-chip caches. The hardware that supports these techniques usually occupies a large portion of the overall processor real estate, and it therefore consumes a significant amount of power that is sometimes not optimally used for doing useful work. In this work, we study the intuitive fact that architectures with hardware support for threads are more power efficient than a more traditional single-threaded superscalar architecture. Toward this goal, we have created a model of the power, performance and area of several parallel architectures. This model shows that a parallel architecture can be designed so that (a) it requires less area and power (to reach the same performance), or (b) it achieves better power efficiency and less area (for the same power budget), or (c) it has higher performance and better power efficiency (for the same area constraint), when compared to a single-threaded superscalar architecture.


Author(s):  
Rajendra V. Boppana ◽  
Suresh Chalasani ◽  
Bob Badgett ◽  
Jacqueline A. Pugh

In this article, we describe a parallel architecture for the MEDLINE database, integrated with search refinement tools, to facilitate accurate and fast responses to users' search requests. The proposed architecture, to be developed by the authors, will use low-cost, high-performance computing clusters consisting of Linux-based personal computers and workstations (i) to provide subsecond response times for individual searches and (ii) to support several concurrent queries from search refinement programs such as SUMSearch.
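One common way to get fast responses from a cluster is to partition the records and search the partitions in parallel, then merge the partial results. The Python sketch below illustrates that scatter-gather pattern on a single machine with threads standing in for cluster nodes. The abstract does not say how the proposed architecture partitions or queries MEDLINE, so the partitioning scheme, the function names `search_shard` and `parallel_search`, and the sample records are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def search_shard(shard, term):
    """Return IDs of records in one partition whose text contains the term."""
    term = term.lower()
    return [doc_id for doc_id, text in shard.items() if term in text.lower()]


def parallel_search(shards, term, max_workers=4):
    """Scatter the query to every partition, then gather and merge the hits.

    Threads stand in for cluster nodes here; this is an illustrative
    assumption, not the architecture described in the article.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partial_results = pool.map(lambda shard: search_shard(shard, term), shards)
        return sorted(hit for hits in partial_results for hit in hits)


# Hypothetical records split across two partitions.
shards = [
    {1: "Asthma therapy in children", 2: "Cardiac surgery outcomes"},
    {3: "Childhood asthma and air quality"},
]
asthma_hits = parallel_search(shards, "asthma")
```

Because each partition is searched independently, several such queries can be in flight at once, which matches the abstract's goal of serving concurrent requests from refinement programs.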

