Optimum Repeater Insertion for on-chip Global Interconnects in High Performance Deep Submicron ICs

Novel NoC Topology Construction for High-Performance Communications

Journal of Computer Networks and Communications ◽

10.1155/2011/405697 ◽

2011 ◽

Vol 2011 ◽

pp. 1-6 ◽

Cited By ~ 4

Author(s):

P. Ezhumalai ◽

A. Chilambuchelvan ◽

C. Arun

Keyword(s):

High Performance ◽

Data Communication ◽

Network On Chip ◽

Performance Study ◽

Hop Count ◽

Nanoscale Systems ◽

Area Reduction ◽

Ip Cores ◽

Global Interconnects ◽

On Chip

Different intellectual property (IP) cores, including processors and memories, are interconnected to build a typical system-on-chip (SoC) architecture. In larger SoC designs, data communication must take place over global interconnects. Network-on-chip (NoC) architectures have been proposed as a scalable solution to the global communication challenges in nanoscale SoC design. We propose building a customized, synthesized network-on-chip with better flow partitioning, and we also consider power and area reduction compared with existing regular topologies. To improve SoC performance, we first carried out a performance study of the regular interconnect topologies MESH, TORUS, BFT, and EBFT. We observed that the overall latency and throughput of EBFT are better than those of the other topologies; the next best in latency and throughput is BFT. Experimental results on a variety of NoC benchmarks showed that our synthesis achieved reductions in power consumption and average hop count over custom topology implementation.
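The average hop count mentioned in this abstract can be illustrated for two of the regular topologies it names. The following is a minimal sketch, not the authors' method: it assumes a k x k grid, minimal (shortest-path) routing, and uniform traffic between all pairs of distinct nodes. The function name `average_hop_count` and the 4 x 4 size are illustrative assumptions; BFT and EBFT are not modeled.

```python
from itertools import product


def average_hop_count(k, wraparound):
    """Mean shortest-path hop count over all ordered pairs of distinct
    nodes in a k x k grid. wraparound=False models a mesh,
    wraparound=True models a torus (assumed minimal routing)."""

    def axis_distance(a, b):
        d = abs(a - b)
        # On a torus each row/column is a ring, so the shorter way round counts.
        return min(d, k - d) if wraparound else d

    nodes = list(product(range(k), repeat=2))
    total_hops = 0
    pair_count = 0
    for (x1, y1), (x2, y2) in product(nodes, repeat=2):
        if (x1, y1) == (x2, y2):
            continue  # skip a node talking to itself
        total_hops += axis_distance(x1, x2) + axis_distance(y1, y2)
        pair_count += 1
    return total_hops / pair_count


# Hypothetical 4 x 4 network under uniform traffic.
mesh_hops = average_hop_count(4, wraparound=False)
torus_hops = average_hop_count(4, wraparound=True)
print(f"mesh: {mesh_hops:.3f} hops, torus: {torus_hops:.3f} hops")
```

Under these assumptions the torus has the lower average hop count because its wraparound links shorten the longest paths; real NoC traffic is rarely uniform, so the figures are only indicative.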

Thermal-Aware SoC Test Scheduling

Design and Test Technology for Dependable Systems-on-Chip - Advances in Computer and Electrical Engineering ◽

10.4018/978-1-60960-212-3.ch019 ◽

2011 ◽

pp. 413-433 ◽

Cited By ~ 1

Author(s):

Zhiyuan He ◽

Zebo Peng ◽

Petru Eles

Keyword(s):

High Performance ◽

Deep Submicron ◽

Test Time ◽

Test Scheduling ◽

Cooling Period ◽

Time Minimization ◽

Partition Test ◽

Alternative Test ◽

Test Sets ◽

On Chip

High temperature has become a technological barrier to the testing of high-performance systems-on-chip, especially when deep submicron technologies are employed. To reduce test time while keeping the temperature of the cores under test within a safe range, thermal-aware test scheduling techniques are required. In this chapter, the authors formulate the test time minimization problem as generating the shortest test schedule such that the temperature limits of individual cores and the limit on the test-bus bandwidth are satisfied. To avoid overheating during the test, the authors partition test sets into shorter test sub-sequences and add cooling periods in between, such that applying a test sub-sequence will not drive the core temperature beyond the limit. Furthermore, based on the test partitioning scheme, the authors interleave the test sub-sequences from different test sets in such a manner that a cooling period reserved for one core is utilized for the test transportation and application of another core. The authors propose an approach to minimize the test application time by exploring alternative test partitioning and interleaving schemes with variable lengths of test sub-sequences and cooling periods, as well as alternative test schedules. Experimental results have shown the efficiency of the proposed approach.
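The partitioning idea described in this abstract can be sketched for a single core. This is a toy illustration, not the authors' algorithm: it assumes a linear thermal model in which temperature rises by a fixed amount per test cycle and falls by a fixed amount per cooling cycle, in arbitrary integer units, and that each cooling period returns the core to ambient. The function name `partition_test` and all parameter values are hypothetical. Interleaving with other cores, which the chapter uses to fill the cooling periods, is not modeled.

```python
import math


def partition_test(total_cycles, heat_per_cycle, cool_per_cycle,
                   t_ambient, t_limit):
    """Split one core's test into sub-sequences separated by cooling
    periods so the (assumed linear) temperature never exceeds t_limit."""
    # Longest run that starts at ambient and stays within the limit.
    max_run = (t_limit - t_ambient) // heat_per_cycle
    if max_run < 1:
        raise ValueError("a single test cycle would exceed the limit")

    schedule = []
    remaining = total_cycles
    while remaining > 0:
        run = min(max_run, remaining)
        schedule.append(("test", run))
        remaining -= run
        if remaining > 0:
            # Cool back down to ambient before the next sub-sequence.
            rise = run * heat_per_cycle
            schedule.append(("cool", math.ceil(rise / cool_per_cycle)))
    return schedule


# Hypothetical numbers in arbitrary temperature units.
schedule = partition_test(total_cycles=1000, heat_per_cycle=1,
                          cool_per_cycle=2, t_ambient=0, t_limit=300)
total_time = sum(length for _, length in schedule)
print(schedule, total_time)
```

In this sketch the cooling periods are pure overhead; the abstract's interleaving scheme would hand those slots to another core's test sub-sequences instead.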

Low Power Aging-Aware On-Chip Memory Structure Design by Duty Cycle Balancing

Journal of Circuits System and Computers ◽

10.1142/s0218126616501152 ◽

2016 ◽

Vol 25 (09) ◽

pp. 1650115

Author(s):

Shuai Wang ◽

Tao Jin ◽

Chuanlei Zheng ◽

Guangshan Duan

Keyword(s):

Low Power ◽

Duty Cycle ◽

High Performance ◽

Power Saving ◽

Deep Submicron ◽

Structure Design ◽

Aging Effects ◽

Register Files ◽

On Chip ◽

Memory Structures

The degradation of CMOS devices over their lifetime can pose a severe threat to system performance and reliability in deep submicron semiconductor technologies. Negative bias temperature instability (NBTI) is among the most important aging mechanisms. Applying the traditional guardbanding technique to address the decreased speed of devices is too costly. On-chip memory structures, such as register files and on-chip caches, suffer very high NBTI stress. In this paper, we propose an aging-aware design to combat NBTI-induced aging in integer register files, data caches and instruction caches in high-performance microprocessors. The proposed aging-aware design can mitigate the negative aging effects by balancing the duty cycle ratio of the internal bits in on-chip memory structures. Besides the aging problem, power consumption is also one of the most prominent issues in microprocessor design. Therefore, we further propose applying low-power schemes to different memory structures under the aging-aware design. The proposed low-power aging-aware design can also achieve a significant power reduction, which further reduces the temperature and NBTI degradation of the on-chip memory structures. Our experimental results show that our aging-aware design can effectively reduce the NBTI stress with 30.8%, 64.5% and 72.0% power saving for the integer register file, data cache and instruction cache, respectively.
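Duty cycle balancing can be illustrated with a toy model of one memory bit. The abstract does not say how the authors balance the duty cycle, so the sketch below assumes one common approach, periodic inversion of the stored value, purely for illustration. The function names `duty_cycle` and `with_periodic_inversion`, the trace, and the inversion period are all hypothetical.

```python
def duty_cycle(bit_trace):
    """Fraction of cycles in which the cell physically holds a 1."""
    return sum(bit_trace) / len(bit_trace)


def with_periodic_inversion(bit_trace, period):
    """Physical cell contents when the logical value is stored inverted
    during every other window of `period` cycles (assumed scheme)."""
    stored = []
    for cycle, bit in enumerate(bit_trace):
        inverted = (cycle // period) % 2 == 1
        stored.append(bit ^ 1 if inverted else bit)
    return stored


# Hypothetical logical trace: the bit is 0 for 90 cycles, then 1 for 10.
trace = [0] * 90 + [1] * 10
raw = duty_cycle(trace)
balanced = duty_cycle(with_periodic_inversion(trace, period=5))
print(f"raw duty cycle: {raw:.2f}, balanced: {balanced:.2f}")
```

A heavily skewed duty cycle keeps one of the cell's PMOS transistors under stress most of the time; moving the ratio toward 50% spreads that stress across both sides of the cell.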

Tuning Strategies for Global Interconnects in High-Performance Deep-Submicron ICs

VLSI Design ◽

10.1155/1999/38974 ◽

1999 ◽

Vol 10 (1) ◽

pp. 21-34 ◽

Cited By ~ 14

Author(s):

Andrew B. Kahng ◽

Sudhakar Muddu ◽

Egino Sarto

Keyword(s):

High Performance ◽

Signal Integrity ◽

Leading Edge ◽

Deep Submicron ◽

Worst Case ◽

Optimal Interval ◽

Bus Routing ◽

Optimum Technique ◽

Global Interconnects ◽

And Performance

Interconnect tuning is an increasingly critical degree of freedom in the physical design of high-performance VLSI systems. By interconnect tuning, we refer to the selection of line thicknesses, widths and spacings in multi-layer interconnect to simultaneously optimize signal distribution, signal performance, signal integrity, and interconnect manufacturability and reliability. This is a key activity in most leading-edge design projects, but has received little attention in the literature. Our work provides the first technology-specific studies of interconnect tuning in the literature. We center on global wiring layers and interconnect tuning issues related to bus routing, repeater insertion, and choice of shielding/spacing rules for signal integrity and performance. We address four basic questions. (1) How should width and spacing be allocated to maximize performance for a given line pitch? (2) For a given line pitch, what criteria affect the optimal interval at which repeaters should be inserted into global interconnects? (3) Under what circumstances are shield wires the optimum technique for improving interconnect performance? (4) In global interconnect with repeaters, what other interconnect tuning is possible? Our study of question (4) demonstrates a new approach of offsetting repeater placements that can reduce worst-case cross-chip delays by over 30% in current technologies.
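Question (2) above concerns the optimal repeater interval. As a rough illustration only, the sketch below uses the textbook first-order Elmore-style delay model for a uniform RC line split into equal segments by identical repeaters; it is not the technology-specific analysis of the paper. It assumes fixed repeater size and ignores coupling and inductance. The function names and all electrical values are hypothetical.

```python
import math


def repeated_line_delay(num_segments, length, r, c, r_drv, c_in):
    """50% delay of a line of `length` (m) with per-unit-length resistance
    r (ohm/m) and capacitance c (F/m), split into equal segments driven by
    identical repeaters (output resistance r_drv, input capacitance c_in)."""
    seg = length / num_segments
    per_stage = (0.69 * r_drv * (c_in + c * seg)   # repeater driving its load
                 + 0.38 * r * c * seg ** 2         # distributed wire RC
                 + 0.69 * r * seg * c_in)          # wire driving next repeater
    return num_segments * per_stage


def best_segment_count(length, r, c, r_drv, c_in, max_segments=200):
    """Brute-force search for the segment count with the smallest delay."""
    return min(range(1, max_segments + 1),
               key=lambda k: repeated_line_delay(k, length, r, c, r_drv, c_in))


# Hypothetical values: 10 mm wire, 0.1 ohm/um, 0.2 fF/um, 100 ohm, 50 fF.
length, r, c, r_drv, c_in = 10e-3, 1e5, 2e-10, 100.0, 50e-15

k_best = best_segment_count(length, r, c, r_drv, c_in)
# Closed-form optimum of the same model, from setting d(delay)/dk = 0.
k_closed = length * math.sqrt(0.38 * r * c / (0.69 * r_drv * c_in))
print(k_best, round(k_closed, 2), length / k_best)
```

In this model the optimal interval depends only on the ratio of the wire's per-unit-length RC product to the repeater's own RC product, which is why changes in line width and spacing (and hence r and c) shift the best repeater spacing.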

Process Integration and Manufacturability Issues for High Performance Multilevel Interconnect

MRS Proceedings ◽

10.1557/proc-337-25 ◽

1994 ◽

Vol 337 ◽

Cited By ~ 39

Author(s):

Shin-Puu Jeng ◽

Robert H. Havemann ◽

Mi-Chang Chang

Keyword(s):

High Performance ◽

Process Integration ◽

Dielectric Materials ◽

Deep Submicron ◽

Limiting Factor ◽

Feature Size ◽

Low Dielectric ◽

Interconnect Delay ◽

Improved Performance ◽

On Chip

Interconnect delay is shown to be a performance-limiting factor for ULSI circuits when feature size is scaled into the deep submicron region, due to a rapid increase in interconnect resistivity and capacitance. Dielectric materials with lower values of permittivity are needed to reduce the line-to-line capacitance as metal spacing decreases. However, the challenge is to successfully integrate these materials into on-chip interconnects. A new multilevel interconnect scheme has been developed that gives improved performance through insertion of a low-dielectric-constant material between metal leads. A novel polymer/SiO2 composite dielectric structure provides lower line-to-line capacitance while alleviating many of the integration and reliability problems associated with polymers in standard interconnect processing.

Efficient Instruction and Data Caching for High Performance Embedded Processors

Jornada de Jóvenes Investigadores del I3A ◽

10.26754/jji-i3a.201201788 ◽

1970 ◽

pp. 9

Author(s):

A. Ferrerón Labari ◽

D. Suárez Gracia ◽

V. Viñals Yúfera

Keyword(s):

Embedded Systems ◽

Power Consumption ◽

Low Power ◽

Interconnection Networks ◽

High Performance ◽

Critical Issue ◽

Content Management ◽

Structure Design ◽

Portable Devices ◽

On Chip

In recent years, embedded systems have evolved to offer capabilities previously found only in high-performance systems. Portable devices already have on-chip multiprocessors (such as the PowerPC 476FP or ARM Cortex A9 MP), usually multi-threaded, and a powerful on-chip multi-level cache memory hierarchy. As most of these systems are battery-powered, power consumption becomes a critical issue. Achieving high performance and low power consumption is a highly complex challenge for which some proposals have already been made. Suarez et al. proposed a new on-chip cache hierarchy, the LP-NUCA (Low Power NUCA), which reduces access latency by taking advantage of NUCA (Non-Uniform Cache Architecture) properties. The key points are decoupling the functionality and using three specialized on-chip networks. This structure has proved efficient for data hierarchies, achieving good performance and reducing energy consumption. Instruction caches, on the other hand, have different requirements and characteristics from data caches, which conflict with the requirements of low-power embedded systems, especially in SMT (simultaneous multi-threading) environments. We want to study the benefits of using small tiled caches for the instruction hierarchy, so we propose a new design, ID-LP-NUCAs. Thus, we need to completely re-evaluate our previous design in terms of structure design, interconnection networks (including topologies, flow control and routing), content management (with special interest in hardware/software content allocation policies), and structure sharing. In CMP (chip multiprocessor) environments with parallel workloads, coherence plays an important role and must be taken into consideration.

Design and simulation of high performance router on chip based on random routing

JOURNAL OF ELECTRONIC MEASUREMENT AND INSTRUMENT ◽

10.3724/sp.j.1187.2013.00669 ◽

2014 ◽

Vol 27 (7) ◽

pp. 669-675 ◽

Cited By ~ 1

Author(s):

Feng Yue ◽

Runfeng Li ◽

Tian Chen ◽

Jun Liu ◽

Peng Chen ◽

...

Keyword(s):

High Performance ◽

On Chip

Methods for Achieving Maximum Efficiency of a Prototyping Platform for High-Performance Systems-on-Chip on Artificial Intelligence Tasks

Nanoindustry Russia ◽

10.22184/1993-8578.2020.13.3s.585.588 ◽

2020 ◽

Vol 96 (3s) ◽

pp. 585-588

Author(s):

С.Е. Фролова ◽

Е.С. Янакова

Keyword(s):

Neural Network ◽

Artificial Intelligence ◽

Computer Vision ◽

High Performance ◽

Systems On Chip ◽

High Performance Systems ◽

On Chip ◽

Network Technologies ◽

Neural Network Technologies

Methods are proposed for building prototyping platforms for high-performance systems-on-chip for artificial intelligence tasks. The requirements for platforms of this class and the principles for changing the SoC design for implementation in the prototype are described, as well as methods of debugging projects on the prototyping platform. Results are presented for computer vision algorithms using neural network technologies on the FPGA prototype of the ELcore semantic cores.

High performance and highly reliable deep submicron CMOSFETs using nitrided-oxide

1999 Symposium on VLSI Technology. Digest of Technical Papers (IEEE Cat. No.99CH36325) ◽

10.1109/vlsit.1999.799371 ◽

2003 ◽

Cited By ~ 1

Author(s):

K. Irino ◽

Y. Tamura ◽

S. Ohkubo ◽

T. Nakanishi ◽

M. Shigeno ◽

...

Keyword(s):

High Performance ◽

Deep Submicron

Ultracompact and low-power-consumption silicon thermo-optic switch for high-speed data

Nanophotonics ◽

10.1515/nanoph-2020-0496 ◽

2020 ◽

Vol 10 (2) ◽

pp. 937-945

Author(s):

Ruihuan Zhang ◽

Yu He ◽

Yong Zhang ◽

Shaohua An ◽

Qingming Zhu ◽

...

Keyword(s):

Power Consumption ◽

Low Power ◽

High Speed ◽

High Performance ◽

Pulse Amplitude ◽

Telecommunication Networks ◽

Low Power Consumption ◽

Power Efficient ◽

High Speed Data ◽

On Chip

Ultracompact and low-power-consumption optical switches are desired for high-performance telecommunication networks and data centers. Here, we demonstrate an on-chip power-efficient 2 × 2 thermo-optic switch unit by using a suspended photonic crystal nanobeam structure. A submilliwatt switching power of 0.15 mW is obtained with a tuning efficiency of 7.71 nm/mW in a compact footprint of 60 μm × 16 μm. The bandwidth of the switch is properly designed for a four-level pulse amplitude modulation signal with a 124 Gb/s raw data rate. To the best of our knowledge, the proposed switch is the most power-efficient resonator-based thermo-optic switch unit with the highest tuning efficiency and data ever reported.
