Optimum Repeater Insertion for on-chip Global Interconnects in High Performance Deep Submicron ICs

Author(s):  
P.V. Hunagund ◽  
A.B. Kalpana
2011 ◽  
Vol 2011 ◽  
pp. 1-6 ◽  
Author(s):  
P. Ezhumalai ◽  
A. Chilambuchelvan ◽  
C. Arun

Different intellectual property (IP) cores, including processors and memories, are interconnected to build a typical system-on-chip (SoC) architecture. In larger SoC designs, data communication must take place over global interconnects. Network-on-chip (NoC) architectures have been proposed as a scalable solution to the global communication challenges in nanoscale SoC design. We propose an approach for synthesizing customized NoC topologies with better flow partitioning, which also targets reductions in power and area compared to previously presented regular topologies. To improve SoC performance, we first carried out a performance study of the regular interconnect topologies MESH, TORUS, BFT and EBFT. We observed that the overall latency and throughput of EBFT are better than those of the other topologies; the next best in terms of latency and throughput is BFT. Experimental results on a variety of NoC benchmarks showed that our synthesis achieved reductions in power consumption and average hop count over custom topology implementation.
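Average hop count, one of the metrics named in the abstract above, is easy to compute for the two simplest regular topologies. The following is a minimal sketch, assuming a square k x k grid, minimal (shortest-path) routing, and uniform traffic between all distinct node pairs; the function name `average_hop_count` is illustrative and does not come from the paper, and BFT and EBFT are not modeled.

```python
from itertools import product


def average_hop_count(k: int, wraparound: bool) -> float:
    """Average minimal hop count between distinct nodes of a k x k grid.

    wraparound=False models a mesh, wraparound=True models a torus.
    Assumes uniform traffic and shortest-path routing (illustrative only).
    """
    if k < 2:
        raise ValueError("need at least a 2 x 2 grid")

    def axis_distance(a: int, b: int) -> int:
        d = abs(a - b)
        # On a torus a packet may also travel the other way around the ring.
        return min(d, k - d) if wraparound else d

    nodes = list(product(range(k), repeat=2))
    total = 0
    for (x1, y1), (x2, y2) in product(nodes, repeat=2):
        # Self-pairs contribute zero hops, so they do not affect the sum.
        total += axis_distance(x1, x2) + axis_distance(y1, y2)
    pairs = len(nodes) * (len(nodes) - 1)  # ordered pairs of distinct nodes
    return total / pairs


if __name__ == "__main__":
    for size in (4, 8):
        print(size, average_hop_count(size, False), average_hop_count(size, True))
```

Under these assumptions the torus never has a higher average hop count than the mesh of the same size, because the wraparound links can only shorten a path.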


Author(s):  
Zhiyuan He ◽  
Zebo Peng ◽  
Petru Eles

High temperature has become a technological barrier to the testing of high-performance systems-on-chip, especially when deep submicron technologies are employed. In order to reduce test time while keeping the temperature of the cores under test within a safe range, thermal-aware test scheduling techniques are required. In this chapter, the authors formulate the test time minimization problem as generating the shortest test schedule such that the temperature limits of individual cores and the limit on the test-bus bandwidth are satisfied. In order to avoid overheating during the test, the authors partition test sets into shorter test sub-sequences and add cooling periods in between, such that applying a test sub-sequence will not drive the core temperature beyond the limit. Furthermore, based on the test partitioning scheme, the authors interleave the test sub-sequences from different test sets in such a manner that a cooling period reserved for one core is utilized for the test transportation and application of another core. The authors propose an approach to minimize the test application time by exploring alternative test partitioning and interleaving schemes with variable lengths of test sub-sequences and cooling periods, as well as alternative test schedules. Experimental results show the efficiency of the proposed approach.
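The partitioning idea in the abstract above (short test sub-sequences separated by cooling periods) can be illustrated for a single core. This is a toy sketch and not the authors' algorithm: it assumes each test cycle raises the temperature by a fixed amount, each cooling cycle shrinks the excess over ambient by a constant factor, and a greedy policy resumes testing once the core drops to a chosen temperature. All names and parameters are hypothetical, and interleaving across cores is not modeled.

```python
def partition_test(total_cycles, t_ambient, t_limit, heat_per_cycle,
                   cool_factor, t_resume):
    """Greedily split one core's test into sub-sequences and cooling periods.

    Toy thermal model (assumption): a test cycle adds heat_per_cycle degrees;
    a cooling cycle multiplies the excess over ambient by cool_factor.
    Returns (schedule, peak), where schedule is a list of (phase, cycles).
    """
    if heat_per_cycle <= 0 or not 0 < cool_factor < 1:
        raise ValueError("need heat_per_cycle > 0 and 0 < cool_factor < 1")
    if not (t_ambient < t_resume and t_resume + heat_per_cycle <= t_limit):
        raise ValueError("need t_ambient < t_resume <= t_limit - heat_per_cycle")

    temp = peak = t_ambient
    remaining = total_cycles
    schedule = []
    while remaining > 0:
        # Apply test cycles while the next one still respects the limit.
        run = 0
        while remaining > 0 and temp + heat_per_cycle <= t_limit:
            temp += heat_per_cycle
            run += 1
            remaining -= 1
        peak = max(peak, temp)
        schedule.append(("test", run))
        if remaining > 0:
            # Insert a cooling period until the core is back at t_resume.
            idle = 0
            while temp > t_resume:
                temp = t_ambient + (temp - t_ambient) * cool_factor
                idle += 1
            schedule.append(("cool", idle))
    return schedule, peak


if __name__ == "__main__":
    plan, peak_temp = partition_test(100, 40.0, 90.0, 2.0, 0.9, 60.0)
    print(plan, peak_temp)
```

Lowering `t_resume` gives longer sub-sequences but also longer cooling periods, which is the kind of trade-off the abstract explores with variable-length sub-sequences.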


2016 ◽  
Vol 25 (09) ◽  
pp. 1650115
Author(s):  
Shuai Wang ◽  
Tao Jin ◽  
Chuanlei Zheng ◽  
Guangshan Duan

The degradation of CMOS devices over their lifetime can pose a severe threat to system performance and reliability in deep submicron semiconductor technologies. Negative bias temperature instability (NBTI) is among the most important aging mechanisms. Applying the traditional guardbanding technique to address the decreased speed of devices is too costly. On-chip memory structures, such as register files and on-chip caches, suffer very high NBTI stress. In this paper, we propose an aging-aware design to combat NBTI-induced aging in integer register files, data caches and instruction caches in high-performance microprocessors. The proposed aging-aware design can mitigate the negative aging effects by balancing the duty cycle ratio of the internal bits in on-chip memory structures. Besides the aging problem, power consumption is also one of the most prominent issues in microprocessor design. Therefore, we further propose to apply low-power schemes to different memory structures under the aging-aware design. The proposed low-power aging-aware design can also achieve a significant power reduction, which further reduces the temperature and NBTI degradation of the on-chip memory structures. Our experimental results show that our aging-aware design can effectively reduce the NBTI stress, with power savings of 30.8%, 64.5% and 72.0% for the integer register file, data cache and instruction cache, respectively.
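To make "balancing the duty cycle ratio" concrete, the sketch below measures how often a single memory cell physically holds a 1, with and without periodic inversion of the stored content. Periodic inversion is a generic balancing technique used here only for illustration; the abstract does not say that it is the mechanism the authors use, and the function name and `flip_period` parameter are hypothetical.

```python
def one_fraction(values, flip_period=None):
    """Fraction of cycles in which a cell physically holds a 1.

    values: the logical bit stored in each cycle (0 or 1).
    flip_period: if set, the physical content is stored inverted during
    every other window of flip_period cycles (hypothetical scheme).
    """
    if not values:
        raise ValueError("values must not be empty")
    if flip_period is not None and flip_period <= 0:
        raise ValueError("flip_period must be positive")

    ones = 0
    for cycle, bit in enumerate(values):
        inverted = flip_period is not None and (cycle // flip_period) % 2 == 1
        ones += bit ^ int(inverted)  # physical bit = logical bit XOR invert flag
    return ones / len(values)


if __name__ == "__main__":
    mostly_zero = [0] * 1000  # a cell that logically holds 0 all the time
    print(one_fraction(mostly_zero))        # heavily skewed duty cycle
    print(one_fraction(mostly_zero, 100))   # balanced by periodic inversion
```

A cell that holds the same value almost all the time keeps one of its PMOS transistors under constant stress; pushing the fraction toward 0.5 spreads that stress across both sides of the cell.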


VLSI Design ◽  
1999 ◽  
Vol 10 (1) ◽  
pp. 21-34 ◽  
Author(s):  
Andrew B. Kahng ◽  
Sudhakar Muddu ◽  
Egino Sarto

Interconnect tuning is an increasingly critical degree of freedom in the physical design of high-performance VLSI systems. By interconnect tuning, we refer to the selection of line thicknesses, widths and spacings in multi-layer interconnect to simultaneously optimize signal distribution, signal performance, signal integrity, and interconnect manufacturability and reliability. This is a key activity in most leading-edge design projects, but has received little attention in the literature. Our work provides the first technology-specific studies of interconnect tuning in the literature. We center on global wiring layers and interconnect tuning issues related to bus routing, repeater insertion, and choice of shielding/spacing rules for signal integrity and performance. We address four basic questions. (1) How should width and spacing be allocated to maximize performance for a given line pitch? (2) For a given line pitch, what criteria affect the optimal interval at which repeaters should be inserted into global interconnects? (3) Under what circumstances are shield wires the optimum technique for improving interconnect performance? (4) In global interconnect with repeaters, what other interconnect tuning is possible? Our study of question (4) demonstrates a new approach of offsetting repeater placements that can reduce worst-case cross-chip delays by over 30% in current technologies.
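As background for question (2), the sketch below uses the classic first-order RC model of a repeated line (commonly attributed to Bakoglu), not the technology-specific analysis of the paper above. It assumes a uniform line with total resistance `r_int` and capacitance `c_int`, and a minimum-size repeater with output resistance `r0` and input capacitance `c0`; the numeric values in the demo are hypothetical and chosen only to exercise the formulas.

```python
import math


def repeated_line_delay(k, h, r_int, c_int, r0, c0):
    """First-order 50% delay of a line split into k segments,
    each driven by a repeater h times the minimum size."""
    r_drv = r0 / h      # a larger repeater has lower output resistance
    c_in = c0 * h       # ... and higher input capacitance
    r_seg = r_int / k   # wire resistance per segment
    c_seg = c_int / k   # wire capacitance per segment
    per_stage = 0.7 * r_drv * (c_seg + c_in) + r_seg * (0.4 * c_seg + 0.7 * c_in)
    return k * per_stage


def optimal_repeaters(r_int, c_int, r0, c0):
    """Closed-form minimizer of repeated_line_delay over k and h.

    Obtained by setting the partial derivatives with respect to k and h
    to zero; k comes out as a real number and would be rounded in practice.
    """
    k = math.sqrt(0.4 * r_int * c_int / (0.7 * r0 * c0))
    h = math.sqrt(r0 * c_int / (r_int * c0))
    return k, h


if __name__ == "__main__":
    # Hypothetical values: 1 kOhm / 2 pF line, 10 kOhm / 2 fF minimum repeater.
    params = dict(r_int=1e3, c_int=2e-12, r0=1e4, c0=2e-15)
    k_opt, h_opt = optimal_repeaters(**params)
    print(k_opt, h_opt, repeated_line_delay(k_opt, h_opt, **params))
```

In this model the optimal repeater interval is the line length divided by `k`, so it depends only on the per-unit-length wire RC and the repeater's intrinsic RC. Crosstalk, inductance and layout constraints, which the abstract's questions touch on, are outside the model.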


1994 ◽  
Vol 337 ◽  
Author(s):  
Shin-Puu Jeng ◽  
Robert H. Havemann ◽  
Mi-Chang Chang

Interconnect delay is shown to be a performance-limiting factor for ULSI circuits when feature size is scaled into the deep submicron region, due to a rapid increase in interconnect resistivity and capacitance. Dielectric materials with lower values of permittivity are needed to reduce the line-to-line capacitance as metal spacing decreases. However, the challenge is to successfully integrate these materials into on-chip interconnects. A new multilevel interconnect scheme has been developed that gives improved performance through insertion of a low-dielectric-constant material between metal leads. A novel polymer/SiO2 composite dielectric structure provides lower line-to-line capacitance while alleviating many of the integration and reliability problems associated with polymers in standard interconnect processing.


Author(s):  
A. Ferrerón Labari ◽  
D. Suárez Gracia ◽  
V. Viñals Yúfera

In recent years, embedded systems have evolved to offer capabilities that were previously found only in high-performance systems. Portable devices already have on-chip multiprocessors (such as the PowerPC 476FP or ARM Cortex A9 MP), usually multi-threaded, and a powerful on-chip multi-level cache memory hierarchy. As most of these systems are battery-powered, power consumption becomes a critical issue. Achieving high performance and low power consumption is a highly complex challenge for which some proposals have already been made. Suarez et al. proposed a new on-chip cache hierarchy, the LP-NUCA (Low Power NUCA), which reduces access latency by taking advantage of NUCA (Non-Uniform Cache Architecture) properties. The key points are decoupling the functionality and utilizing three specialized on-chip networks. This structure has proved to be efficient for data hierarchies, achieving good performance and reducing energy consumption. Instruction caches, on the other hand, have different requirements and characteristics than data caches, which conflict with the requirements of low-power embedded systems, especially in SMT (simultaneous multi-threading) environments. We want to study the benefits of utilizing small tiled caches for the instruction hierarchy, so we propose a new design, ID-LP-NUCAs. Thus, we need to completely re-evaluate our previous design in terms of structure design, interconnection networks (including topologies, flow control and routing), content management (with special interest in hardware/software content allocation policies), and structure sharing. In CMP (chip multiprocessor) environments with parallel workloads, coherence plays an important role and must be taken into consideration.


2014 ◽  
Vol 27 (7) ◽  
pp. 669-675 ◽  
Author(s):  
Feng Yue ◽  
Runfeng Li ◽  
Tian Chen ◽  
Jun Liu ◽  
Peng Chen ◽  
...  

2020 ◽  
Vol 96 (3s) ◽  
pp. 585-588
Author(s):  
С.Е. Фролова ◽  
Е.С. Янакова

Methods are proposed for building prototyping platforms for high-performance systems-on-chip for artificial intelligence tasks. The requirements for platforms of this class and the principles for modifying an SoC design for implementation in a prototype are described, as well as methods for debugging designs on the prototyping platform. Results are presented for computer vision algorithms using neural network technologies running on an FPGA prototype of the ELcore semantic cores.


Nanophotonics ◽  
2020 ◽  
Vol 10 (2) ◽  
pp. 937-945
Author(s):  
Ruihuan Zhang ◽  
Yu He ◽  
Yong Zhang ◽  
Shaohua An ◽  
Qingming Zhu ◽  
...  

Ultracompact and low-power-consumption optical switches are desired for high-performance telecommunication networks and data centers. Here, we demonstrate an on-chip power-efficient 2 × 2 thermo-optic switch unit by using a suspended photonic crystal nanobeam structure. A submilliwatt switching power of 0.15 mW is obtained with a tuning efficiency of 7.71 nm/mW in a compact footprint of 60 μm × 16 μm. The bandwidth of the switch is properly designed for a four-level pulse amplitude modulation signal with a 124 Gb/s raw data rate. To the best of our knowledge, the proposed switch is the most power-efficient resonator-based thermo-optic switch unit with the highest tuning efficiency and data rate ever reported.
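The figures quoted in the abstract above can be related with a few lines of arithmetic. This is a rough sketch, assuming the resonance shift scales linearly with heater power (which the abstract does not state) and that a four-level PAM signal carries two bits per symbol; the function name and dictionary keys are illustrative.

```python
import math


def switch_figures(power_mw=0.15, efficiency_nm_per_mw=7.71,
                   footprint_um=(60.0, 16.0), raw_rate_gbps=124.0,
                   pam_levels=4):
    """Derive secondary quantities from the numbers quoted in the abstract."""
    # Assumption: the resonance shift is linear in the applied heater power.
    shift_nm = power_mw * efficiency_nm_per_mw
    area_um2 = footprint_um[0] * footprint_um[1]
    bits_per_symbol = math.log2(pam_levels)
    symbol_rate_gbaud = raw_rate_gbps / bits_per_symbol
    return {
        "resonance_shift_nm": shift_nm,
        "area_um2": area_um2,
        "symbol_rate_gbaud": symbol_rate_gbaud,
    }


if __name__ == "__main__":
    for name, value in switch_figures().items():
        print(f"{name}: {value:.4g}")
```

Under the linearity assumption, the quoted switching power corresponds to a resonance shift of a little over 1 nm, and the 124 Gb/s PAM-4 signal corresponds to 62 GBaud.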

