Reducing synchronization overhead for compiler-parallelized codes on software DSMs (extended abstract)

Author(s):  
Hwansoo Han ◽  
Chau-Wen Tseng ◽  
Pete Keleher
2013 ◽  
Vol 20 (6) ◽  
pp. 1582-1591 ◽  
Author(s):  
Rui Zhu ◽  
Lei-hua Qin ◽  
Jing-li Zhou ◽  
Huan Zheng

Author(s):  
James Dinan ◽  
Clement Cole ◽  
Gabriele Jost ◽  
Stan Smith ◽  
Keith Underwood ◽  
...  

2012 ◽  
Vol 263-266 ◽  
pp. 1492-1496
Author(s):  
Jin Ho Ahn

Two opposite approaches have been proposed to address the scalability problem caused by the synchronization of coordinated checkpointing during failure-free operation: minimizing the number of checkpointing participants and making the checkpointing process non-blocking. However, these earlier approaches are oblivious to the underlying network and may not provide the high scalability required in very large-scale P2P-based systems. This paper proposes a non-blocking coordinated checkpointing protocol that significantly reduces checkpointing synchronization overhead by structuring the peer-to-peer network into a set of groups according to a particular criterion. In this protocol, one process in each group is designated as the representative and takes on two special roles: intra-group and inter-group checkpointing coordination. Intra-group checkpointing coordination handles the checkpointing procedure among the processes within a group. Inter-group checkpointing coordination, on the other hand, is performed only among representatives. Because of this, the proposed protocol may considerably reduce the number of checkpointing control messages routed on core networks compared with existing protocols.
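The effect of grouping on core-network traffic can be illustrated with a rough count of control messages. The sketch below is not the protocol from the paper: it assumes a simplified cost model in which each coordination step costs one request plus one acknowledgement per participant, and the function names `flat_core_messages` and `grouped_messages` are hypothetical.

```python
# Illustrative cost model only (an assumption, not the paper's protocol):
# every coordination step costs one request and one acknowledgement
# between a coordinator and each other participant.

def flat_core_messages(num_processes: int) -> int:
    """Messages when one initiator coordinates every other process directly.

    Without grouping, all of these messages may cross the core network.
    """
    return 2 * (num_processes - 1)


def grouped_messages(group_sizes: list[int]) -> tuple[int, int]:
    """Return (intra_group, inter_group) message counts for a grouped layout.

    Intra-group messages stay inside a group; only the inter-group
    messages, exchanged among representatives, cross the core network.
    """
    # Each representative coordinates the other members of its own group.
    intra = sum(2 * (size - 1) for size in group_sizes)
    # One representative coordinates the remaining representatives.
    inter = 2 * (len(group_sizes) - 1)
    return intra, inter


if __name__ == "__main__":
    flat = flat_core_messages(100)
    intra, inter = grouped_messages([10] * 10)
    print(f"flat, all on core network: {flat}")
    print(f"grouped: {intra} intra-group, {inter} on core network")
```

Under this assumed model the total number of messages is unchanged, but most of them stay inside a group, and only the exchange among representatives is routed over the core network.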


Author(s):  
Meilian Xu ◽  
Parimala Thulasiraman ◽  
Ruppa K. Thulasiram

This chapter uses two scientific computing kernels to illustrate the challenges of designing parallel algorithms for a heterogeneous multi-core processor, the Cell Broadband Engine processor (Cell/B.E.). It describes the limitation of current parallel systems that use single-core processors as building blocks. This limitation degrades the performance of applications with data-intensive and computation-intensive kernels such as Finite Difference Time Domain (FDTD) and Fast Fourier Transform (FFT). FDTD is a regular problem with a nearest-neighbour communication pattern under a synchronization constraint. FFT based on the indirect swap network (ISN) modifies the data mapping of the traditional Cooley-Tukey butterfly network to improve data locality, thereby reducing communication and synchronization overhead. The authors hope to unleash the Cell/B.E. and design parallel FDTD and parallel FFT based on ISN by taking into account unique features of the Cell/B.E., such as its eight SIMD processing units on a single chip and its high-speed on-chip bus.


1996 ◽  
Vol 26 (1) ◽  
pp. 86-95 ◽  
Author(s):  
Ulana Legedza ◽  
William E. Weihl

Author(s):  
P.T. Gonciari ◽  
B. Al-Hashimi ◽  
N. Nicolici
