An architecture for high-performance scalable shared-memory multiprocessors exploiting on-chip integration

2004, Vol 15 (8), pp. 755-768
Author(s): M.E. Acacio, J. Gonzalez, J.M. Garcia, J. Duato

1995, Vol 05 (03), pp. 475-487
Author(s): N. DRACH, A. GEFFLAUT, P. JOUBERT, A. SEZNEC

Sizes of on-chip caches on current commercial microprocessors range from 16 Kbytes to 36 Kbytes. These microprocessors can be used directly to build a low-cost single-bus shared-memory multiprocessor without any second-level cache. In this paper, we explore the viability of such a multi-microprocessor. Simulation results clearly establish that the performance of such a system will be quite poor if the on-chip caches are direct-mapped. On the other hand, when the on-chip caches are partially associative, the achieved level of performance is quite promising. In particular, two recently proposed innovative cache structures, the skewed-associative cache organization and the semi-unified cache organization, are shown to work well.
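
The gap between direct-mapped and associative caches described in this abstract can be illustrated with a toy model. The sketch below is not the paper's simulator: it is a minimal LRU set-associative cache, with a hypothetical 32-byte block size and a synthetic two-address trace, and it does not model the skewed-associative or semi-unified organizations. It only shows how two blocks that collide in a 16-Kbyte direct-mapped cache can coexist in a 2-way cache of the same capacity.

```python
from collections import OrderedDict


def count_misses(trace, num_sets, ways, block_size=32):
    """Count misses for an LRU set-associative cache (ways=1 is direct-mapped)."""
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in trace:
        block = addr // block_size
        index = block % num_sets   # which set the block maps to
        tag = block // num_sets    # identifies the block within that set
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)         # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) >= ways:
                lines.popitem(last=False)  # evict the least recently used line
            lines[tag] = True
    return misses


# Assumed parameters: 16-Kbyte capacity (from the abstract), 32-byte blocks (hypothetical).
CACHE_BYTES = 16 * 1024
BLOCK = 32

# Synthetic trace: two addresses exactly one cache size apart, accessed alternately.
trace = [0, CACHE_BYTES] * 100

direct_mapped = count_misses(trace, CACHE_BYTES // BLOCK, 1, BLOCK)
two_way = count_misses(trace, CACHE_BYTES // (BLOCK * 2), 2, BLOCK)
print(direct_mapped, two_way)
```

In the direct-mapped configuration both addresses map to the same line and evict each other on every access, while the 2-way configuration misses only on the first touch of each block.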


Author(s): A. Ferrerón Labari, D. Suárez Gracia, V. Viñals Yúfera

In recent years, embedded systems have evolved to offer capabilities previously found only in high-performance systems. Portable devices already have on-chip multiprocessors (such as the PowerPC 476FP or ARM Cortex A9 MP), usually multi-threaded, and a powerful on-chip multi-level cache memory hierarchy. As most of these systems are battery-powered, power consumption becomes a critical issue. Achieving high performance and low power consumption together is a highly complex challenge for which some proposals have already been made. Suarez et al. proposed a new on-chip cache hierarchy, the LP-NUCA (Low Power NUCA), which reduces access latency by taking advantage of the properties of NUCA (Non-Uniform Cache Architectures). The key points are decoupling the functionality and using three specialized on-chip networks. This structure has proved efficient for data hierarchies, achieving good performance and reducing energy consumption. On the other hand, instruction caches have different requirements and characteristics than data caches, which conflict with the requirements of low-power embedded systems, especially in SMT (simultaneous multi-threading) environments. We want to study the benefits of using small tiled caches for the instruction hierarchy, so we propose a new design, ID-LP-NUCAs. To do so, we need to completely re-evaluate our previous design in terms of structure design, interconnection networks (including topologies, flow control and routing), content management (with special interest in hardware/software content allocation policies), and structure sharing. In CMP (chip multiprocessor) environments with parallel workloads, coherence plays an important role and must be taken into consideration.


2014, Vol 27 (7), pp. 669-675
Author(s): Feng Yue, Runfeng Li, Tian Chen, Jun Liu, Peng Chen, ...