Towards scalable collective communication for multicomputer interconnection networks

2004 ◽  
Vol 163 (4) ◽  
pp. 293-306 ◽  
Author(s):  
A.Y. Al-Dubai ◽  
M. Ould-Khaoua ◽  
K. El-Zayyat ◽  
I. Ababneh ◽  
S. Al-Dobai
2005 ◽  
Vol 15 (01n02) ◽  
pp. 209-222
Author(s):  
DOMINIQUE BARTH ◽  
PASCAL BERTHOME ◽  
PARASKEVI FRAGOPOULOU

A multipoint request is a group of collaborating nodes that wish to establish communication for a certain duration of time. This need arises in parallel applications executed on processing elements connected either by specialized interconnection networks or over wide area networks (collective communication operations). Each individual request is satisfied by a given subtree connecting the participating nodes. We aim to maximize the number of requests that can be satisfied simultaneously. In this paper, we show that this problem is NP-complete and propose an approximation algorithm for it, provided that the number of requests using the same edge is bounded by a constant.
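The selection problem in the abstract above can be illustrated with a small sketch. This is not the approximation algorithm from the paper; it is a plain greedy heuristic, and it assumes (the abstract does not state the conflict model) that two requests cannot be satisfied together when their subtrees share an edge. The names `edge` and `greedy_select` and the request labels are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set

Edge = FrozenSet[int]


def edge(u: int, v: int) -> Edge:
    """Undirected edge, so (u, v) and (v, u) compare equal."""
    return frozenset((u, v))


def greedy_select(requests: Dict[str, Set[Edge]]) -> List[str]:
    """Pick requests whose subtrees are pairwise edge-disjoint.

    Assumption (not from the abstract): sharing an edge is a conflict.
    Heuristic: consider smaller subtrees first, ties broken by name.
    """
    used: Set[Edge] = set()
    chosen: List[str] = []
    for name in sorted(requests, key=lambda r: (len(requests[r]), r)):
        tree = requests[name]
        if used.isdisjoint(tree):
            chosen.append(name)
            used |= tree
    return chosen


# Hypothetical requests on the path graph 0-1-2-3-4.
requests = {
    "A": {edge(0, 1), edge(1, 2)},
    "B": {edge(1, 2), edge(2, 3)},
    "C": {edge(3, 4)},
    "D": {edge(0, 1)},
}
print(greedy_select(requests))  # ['C', 'D', 'B']
```

Request A is rejected here because it shares edge 0-1 with D. A greedy pass like this gives no optimality guarantee; it only makes the objective (count of simultaneously satisfiable requests) concrete.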


2014 ◽  
Vol 11 (2) ◽  
pp. 79
Author(s):  
A.R. Touzene ◽  
K. Day

Ku et al. (2003) proposed a construction of edge-disjoint spanning trees (EDSTs) in undirected product networks. Their construction method focuses on showing the existence of a maximum number (n1+n2-1) of EDSTs in the product network of two graphs, where the factor graphs have n1 and n2 EDSTs, respectively. In this paper, we propose a new systematic and algorithmic approach to construct (n1+n2) directed routed EDSTs in product networks. Edge direction is added to support bidirectional links in interconnection networks. Our EDSTs can be used directly to develop efficient collective communication algorithms for both the store-and-forward and wormhole models.
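To make the EDST property in the abstract above concrete, here is a minimal checker sketch. It does not reproduce the construction from either paper; it only verifies, for an undirected graph on vertices 0..n-1, that each given edge list is a spanning tree and that no edge is reused across trees. Edge direction and routing, which the abstract adds, are not modeled. The function names and the K4 example are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

EdgeList = Sequence[Tuple[int, int]]


def is_spanning_tree(n: int, tree_edges: EdgeList) -> bool:
    """True if tree_edges forms a spanning tree on vertices 0..n-1."""
    if len(tree_edges) != n - 1:
        return False
    parent = list(range(n))

    def find(x: int) -> int:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in tree_edges:
        ru, rv = find(u), find(v)
        if ru == rv:  # this edge would close a cycle
            return False
        parent[ru] = rv
    return True  # n-1 acyclic edges on n vertices: connected


def are_edsts(n: int, trees: List[EdgeList]) -> bool:
    """True if every tree spans the graph and no edge is shared."""
    seen = set()
    for tree in trees:
        if not is_spanning_tree(n, tree):
            return False
        for u, v in tree:
            e = frozenset((u, v))
            if e in seen:
                return False
            seen.add(e)
    return True


# Hypothetical example: two Hamiltonian paths that partition K4's edges.
t1 = [(0, 1), (1, 2), (2, 3)]
t2 = [(1, 3), (3, 0), (0, 2)]
t3 = [(0, 1), (1, 3), (3, 2)]  # reuses edge 0-1 from t1
print(are_edsts(4, [t1, t2]))  # True
print(are_edsts(4, [t1, t3]))  # False
```

The checker does not confirm that the tree edges belong to a particular host graph; a caller working with a product network would need to add that membership test.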


Author(s):  
A. Ferrerón Labari ◽  
D. Suárez Gracia ◽  
V. Viñals Yúfera

In recent years, embedded systems have evolved to offer capabilities previously found only in high-performance systems. Portable devices already have on-chip multiprocessors (such as PowerPC 476FP or ARM Cortex A9 MP), usually multi-threaded, and a powerful on-chip multi-level cache memory hierarchy. As most of these systems are battery-powered, power consumption becomes a critical issue. Achieving high performance and low power consumption is a complex challenge for which some proposals have already been made. Suarez et al. proposed a new on-chip cache hierarchy, the LP-NUCA (Low Power NUCA), which reduces access latency by taking advantage of NUCA (Non-Uniform Cache Architecture) properties. The key points are decoupling the functionality and using three specialized on-chip networks. This structure has proven efficient for data hierarchies, achieving good performance and reducing energy consumption. On the other hand, instruction caches have different requirements and characteristics from data caches, and these conflict with the requirements of low-power embedded systems, especially in SMT (simultaneous multi-threading) environments. We want to study the benefits of using small tiled caches for the instruction hierarchy, so we propose a new design, ID-LP-NUCAs. Thus, we need to completely re-evaluate our previous design in terms of structure design, interconnection networks (including topologies, flow control and routing), content management (with special interest in hardware/software content allocation policies), and structure sharing. In CMP (chip multiprocessor) environments with parallel workloads, coherence plays an important role and must be taken into consideration.


2009 ◽  
Vol 31 (2) ◽  
pp. 318-328
Author(s):  
Jue WANG ◽  
Chang-Jun HU ◽  
Ji-Lin ZHANG ◽  
Jian-Jiang LI

1983 ◽  
Vol 11 (3) ◽  
pp. 309-315 ◽  
Author(s):  
W. Kent Fuchs ◽  
Jacob A. Abraham ◽  
Kuang-Hua Huang
