VLSI Implementation of Hybrid Wave-Pipelined 2D DWT Using Lifting Scheme

VLSI Design ◽

10.1155/2008/512746 ◽

2008 ◽

Vol 2008 ◽

pp. 1-8

Author(s):

G. Seetharaman ◽

B. Venkataramani ◽

G. Lakshminarayanan

Keyword(s):

Digital Circuit ◽

San Jose ◽

Hybrid Wave ◽

Hybrid Scheme ◽

Clock Skew ◽

Clock Period ◽

Novel Approach ◽

Clock Routing ◽

Wave Pipelining ◽

On Chip

A novel approach is proposed in this paper for the implementation of 2D DWT using hybrid wave-pipelining (WP). A digital circuit may be operated at a higher frequency by using either pipelining or WP. Pipelining requires additional registers, which increases area, power dissipation, and clock routing complexity. Wave-pipelining has none of these disadvantages but requires a complex trial-and-error procedure for tuning the clock period and the clock skew between input and output registers. A hybrid scheme is therefore proposed to get the benefits of both techniques. Two automation schemes are proposed for the implementation of 2D DWT using hybrid WP on both Xilinx and Altera FPGAs. In the first scheme, a built-in self-test (BIST) approach is used to choose the clock skew and clock period for the I/O registers between the wave-pipelined blocks. In the second scheme, an on-chip soft-core processor is used to choose the clock skew and clock period. The results for hybrid WP are compared with nonpipelined and pipelined approaches. According to the implementation results, the hybrid WP scheme requires the same area as the nonpipelined scheme but is faster by a factor of 1.25–1.39. The pipelined scheme is faster than the hybrid scheme by a factor of 1.15–1.39, at the cost of increasing the number of registers by a factor of 1.78–2.73, the number of LEs by a factor of 1.11–1.32, and the clock routing complexity.
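The abstract does not say which wavelet filter the design uses, so the following is only a minimal software sketch of what a lifting-based DWT step looks like, assuming the integer LeGall 5/3 filter with symmetric boundary extension. The function names `lifting_forward` and `lifting_inverse` are hypothetical, and the code says nothing about the authors' hardware architecture or wave-pipelining itself.

```python
def lifting_forward(x):
    """One level of a 1D integer 5/3 lifting transform (assumed filter).

    Expects an even-length sequence of integers.
    Returns (approximation, detail) coefficient lists.
    """
    if len(x) < 2 or len(x) % 2:
        raise ValueError("expected an even-length sequence with at least 2 samples")
    even, odd = list(x[0::2]), list(x[1::2])
    half = len(even)
    # Predict: detail = odd sample minus the floor-average of its even neighbours.
    # The last odd sample reuses its left neighbour (symmetric extension).
    detail = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1)
              for i in range(half)]
    # Update: approximation = even sample plus a rounded quarter of the
    # two neighbouring details (first sample reuses detail[0]).
    approx = [even[i] + ((detail[max(i - 1, 0)] + detail[i] + 2) >> 2)
              for i in range(half)]
    return approx, detail


def lifting_inverse(approx, detail):
    """Undo lifting_forward by running the lifting steps in reverse order."""
    half = len(approx)
    even = [approx[i] - ((detail[max(i - 1, 0)] + detail[i] + 2) >> 2)
            for i in range(half)]
    odd = [detail[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1)
           for i in range(half)]
    x = [0] * (2 * half)
    x[0::2], x[1::2] = even, odd
    return x
```

A 2D transform is conventionally obtained by applying such a 1D step along the rows and then along the columns. Because each lifting step is undone exactly by its mirror step, the integer transform reconstructs the input without loss.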


Operating System for Runtime Reconfigurable Multiprocessor Systems

International Journal of Reconfigurable Computing ◽

10.1155/2011/121353 ◽

2011 ◽

Vol 2011 ◽

pp. 1-16 ◽

Cited By ~ 16

Author(s):

Diana Göhringer ◽

Michael Hübner ◽

Etienne Nguepi Zeutebouo ◽

Jürgen Becker

Keyword(s):

Operating System ◽

Resource Management ◽

Multiprocessor System ◽

Task Mapping ◽

Access Port ◽

Novel Approach ◽

Hardware Resource ◽

Hardware Architectures ◽

On Chip ◽

Internal Configuration

Operating systems traditionally handle the task scheduling of one or more application instances on processor-like hardware architectures. RAMPSoC, a novel runtime adaptive multiprocessor System-on-Chip, exploits dynamic reconfiguration on FPGAs to generate, start, and terminate hardware and software tasks. The hardware tasks have to be transferred to the reconfigurable hardware via a configuration access port. The software tasks can be loaded into the local memory of the respective IP core either via the configuration access port or via the on-chip communication infrastructure (e.g. a Network-on-Chip). Recent Xilinx FPGA series, such as Virtex-5, provide two Internal Configuration Access Ports, which cannot be accessed simultaneously. To prevent conflicts, access to these ports as well as hardware resource management needs to be controlled, e.g. by a special-purpose operating system running on an embedded processor. Such an operating system is also needed to handle the relations between temporally and spatially scheduled operations. The special-purpose operating system presented in this paper, called CAP-OS (Configuration Access Port-Operating System), supports the clients using the configuration port with the services of priority-based access scheduling, hardware task mapping, and resource management.
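To make the idea of priority-based access scheduling concrete, here is a minimal sketch of a queue that serializes requests for a single shared configuration port. It is an illustration only: the class `ConfigPortScheduler`, its methods, and the rule that ties are served in arrival order are all assumptions, not the CAP-OS interface.

```python
import heapq
from itertools import count


class ConfigPortScheduler:
    """Hypothetical arbiter: one client at a time, highest priority first."""

    def __init__(self):
        self._queue = []       # min-heap of (-priority, arrival_index, client)
        self._arrival = count()  # monotonically increasing tie-breaker

    def request(self, client, priority):
        """Register a client's request; a larger priority value is served earlier."""
        heapq.heappush(self._queue, (-priority, next(self._arrival), client))

    def grant_next(self):
        """Return the next client allowed to use the port, or None if idle."""
        if not self._queue:
            return None
        _, _, client = heapq.heappop(self._queue)
        return client


scheduler = ConfigPortScheduler()
scheduler.request("software_task_loader", priority=1)
scheduler.request("hardware_task_A", priority=5)
scheduler.request("hardware_task_B", priority=5)
grants = [scheduler.grant_next() for _ in range(4)]
```

Negating the priority lets Python's min-heap act as a max-priority queue, and the arrival counter keeps equal-priority requests in first-come order.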


MULTI-PHASE ROTARY CLOCK SYNCHRONIZATION OF LEVEL-SENSITIVE CIRCUITS

Journal of Circuits System and Computers ◽

10.1142/s0218126609005423 ◽

2009 ◽

Vol 18 (05) ◽

pp. 899-908 ◽

Cited By ~ 2

Author(s):

BARIS TASKIN ◽

IVAN KOURTEV

Keyword(s):

Phase Synchronization ◽

Clock Synchronization ◽

Clock Skew ◽

Clock Period ◽

Flip Flop ◽

Clock Networks ◽

Significant Performance ◽

Clock Skew Scheduling ◽

Time Borrowing ◽

Multi Phase

Resonant clocking technologies provide clock networks with improved frequency, jitter, and power dissipation characteristics, but often require novel automation routines. Resonant rotary clocking technology, for instance, entails multi-phase and nonzero clock skew operation and supports latch-based design. This paper studies the effects of multi-phase synchronization schemes on the minimum clock period for rotary-clock-synchronized circuits, which necessitate the application of clock skew scheduling and employ level-sensitive registers. In the experiments, single-, dual-, three-, and four-phase clocking schemes generated by rotary clock synchronization are applied to a suite of level-sensitive-transformed ISCAS'89 benchmarks. Average clock period improvements of 30.3%, 24.8%, 17.7% and 12.0%, respectively, are observed compared to the flip-flop-based, zero clock skew circuits. As the number of clock phases increases, smaller improvements are observed because the complementary effects of clock skew scheduling and time borrowing become less effective overall. It is shown, however, that for some circuits (23% of the benchmarks), multi-phase synchronization leads to significant performance benefits in operating frequency.
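The following sketch illustrates the general idea of clock skew scheduling on a toy circuit. It is not the authors' method: it assumes edge-triggered flip-flops with zero setup and hold times (so it does not model level-sensitive time borrowing), and the function names and the two-register example are hypothetical. For a path from register i to register j with delays in [d_min, d_max], clock arrival times t_i, t_j, and period T, it enforces t_i + d_max ≤ t_j + T (setup) and t_i + d_min ≥ t_j (hold), checks feasibility with Bellman-Ford, and binary-searches the smallest feasible T.

```python
def skews_feasible(num_regs, paths, period):
    """Check whether clock arrival times exist that satisfy all constraints.

    paths: list of (src, dst, d_min, d_max) combinational paths.
    Each constraint t_a - t_b <= c becomes an edge b -> a with weight c;
    the system is feasible iff that graph has no negative cycle.
    """
    edges = []
    for src, dst, d_min, d_max in paths:
        edges.append((dst, src, period - d_max))  # setup: t_src - t_dst <= T - d_max
        edges.append((src, dst, d_min))           # hold:  t_dst - t_src <= d_min
    dist = [0.0] * num_regs  # equivalent to a virtual source tied to every node
    for _ in range(num_regs):
        updated = False
        for u, v, w in edges:
            if dist[u] + w < dist[v] - 1e-12:
                dist[v] = dist[u] + w
                updated = True
        if not updated:
            return True   # converged, so no negative cycle
    return False          # still relaxing after num_regs passes: negative cycle


def min_period_with_skew(num_regs, paths, iterations=60):
    """Binary-search the smallest feasible clock period."""
    lo, hi = 0.0, max(p[3] for p in paths)  # zero skew is feasible at max d_max
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if skews_feasible(num_regs, paths, mid):
            hi = mid
        else:
            lo = mid
    return hi


# Toy loop of two registers: a slow path 0 -> 1 and a fast path 1 -> 0.
toy_paths = [(0, 1, 8.0, 10.0), (1, 0, 2.0, 4.0)]
zero_skew_period = max(p[3] for p in toy_paths)
scheduled_period = min_period_with_skew(2, toy_paths)
```

In this toy loop the zero-skew period is set by the slowest path, while scheduling the skew lets the slow path borrow slack from the fast one, so the period approaches the average delay around the loop.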


A novel approach of test and fault isolation of high speed digital circuit modules

2016 IEEE AUTOTESTCON ◽

10.1109/autest.2016.7589639 ◽

2016 ◽

Author(s):

Du Shuming ◽

Wang Yan ◽

Cao Zijian

Keyword(s):

High Speed ◽

Digital Circuit ◽

Fault Isolation ◽

Novel Approach


STATISTICAL TIMING ANALYSIS OF THE CLOCK PERIOD IMPROVEMENT THROUGH CLOCK SKEW SCHEDULING

Journal of Circuits System and Computers ◽

10.1142/s0218126611007669 ◽

2011 ◽

Vol 20 (05) ◽

pp. 881-898 ◽

Cited By ~ 1

Author(s):

SHANNON M. KURTAS ◽

BARIS TASKIN

Keyword(s):

Timing Analysis ◽

Process Variations ◽

Clock Signal ◽

Static Timing Analysis ◽

Clock Skew ◽

Clock Period ◽

Static Timing ◽

Statistical Static Timing Analysis ◽

Clock Skew Scheduling ◽

Zero Skew

Statistical static timing analysis (SSTA) methods, which model process variations statistically as probability distribution functions rather than deterministically, have been studied thoroughly for traditional zero clock skew circuits. In these circuits, the synchronizing clock signal is designed to arrive in phase at each register. However, designers often schedule the clock skew to different registers in order to decrease the minimum clock period of the entire circuit. Clock skew scheduling imparts very different timing constraints that are based, in part, on the topology of the circuit. In this paper, SSTA is applied to nonzero clock skew circuits in order to determine the accuracy improvement relative to their zero skew counterparts, and also to assess how the results of skew scheduling might be affected by more accurate statistical modeling. For 99.7% timing yield (3σ variation), SSTA is observed to improve the accuracy, and therefore increase the timing margin, of nonzero clock skew circuits by up to 2.5×, and on average by 1.3×, the amount seen by zero skew circuits.
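The link between "3σ variation" and "99.7% timing yield" can be checked with a short calculation, assuming the varying quantity is normally distributed. The helper names below are hypothetical, and whether the paper uses the two-sided or one-sided convention is not stated in the abstract; the 99.7% figure matches the two-sided ±3σ interval.

```python
import math


def two_sided_coverage(k):
    """Probability that a normal variable lies within k sigma of its mean."""
    return math.erf(k / math.sqrt(2))


def one_sided_coverage(k):
    """Probability that a normal variable is at most k sigma above its mean."""
    return 0.5 * (1.0 + math.erf(k / math.sqrt(2)))


within_three_sigma = two_sided_coverage(3)   # about 0.9973
below_three_sigma = one_sided_coverage(3)    # about 0.9987
```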


IMPROVED DELAY AND PROCESS VARIATION TOLERANT CLOCK TREE NETWORK IN ULTRA-LARGE CIRCUITS USING HYBRID RF/METAL CLOCK ROUTING

Journal of Circuits System and Computers ◽

10.1142/s0218126614500509 ◽

2014 ◽

Vol 23 (04) ◽

pp. 1450050

Author(s):

ZOHRE MOHAMMADI-ARFA ◽

ALI JAHANIAN

Keyword(s):

Process Variation ◽

Clock Tree ◽

Large Power ◽

Trade Off ◽

Area Overhead ◽

Wireless Interconnect ◽

Clock Routing ◽

On Chip ◽

Complex Circuits ◽

Networking Architecture

Clock distribution has been a major limitation on delay, power, and routing resources in ultra-large nanoscale circuits. Some emerging technologies propose using RF instruments for on-chip clock routing in large chips, but they suffer from large power and area overheads. In this paper, a hybrid radio frequency (RF) and metal clock networking architecture, together with efficient RF and metal clock routing, is presented. It combines the benefits of RF/wireless interconnect and metal/wired connections to reach a reasonable trade-off between the two interconnect technologies. Our experiments show that clock network delay and clock tree congestion are improved by 61% and 40% on average, respectively. Moreover, the sensitivity of the tested benchmarks to process variation of interconnects is reduced considerably. These improvements are gained at a cost of less than 2% area overhead and less than 10% power consumption overhead for large circuits. The overheads are shown to be very small for large circuits, so the technology is feasible and reasonable for very large and complex circuits.


A New Fault Tolerant Routing Algorithm for Networks on Chip

International Journal of Embedded and Real-Time Communication Systems ◽

10.4018/ijertcs.2019070105 ◽

2019 ◽

Vol 10 (3) ◽

pp. 68-85

Author(s):

Chakib Nehnouh ◽

Mohamed Senouci

Keyword(s):

Network Architecture ◽

Fault Tolerant ◽

Routing Algorithm ◽

Transient Faults ◽

Networks On Chip ◽

Congestion Detection ◽

Novel Approach ◽

On Chip ◽

Detection Mechanisms ◽

Correct Data

To provide correct data transmission and to handle the communication requirements, the routing algorithm should find a new path to steer packets from the source to the destination in a faulty network. Many solutions have been proposed to overcome faults in networks-on-chip (NoCs). This article introduces a new fault-tolerant routing algorithm that tolerates permanent and transient faults in NoCs. This solution, called DINRA, satisfies congestion avoidance and fault tolerance simultaneously. In this work, a novel approach inspired by Catnap is proposed for NoCs using local and global congestion detection mechanisms with a hierarchical sub-network architecture. The evaluation (on reliability, latency, and throughput) shows the effectiveness of this approach in improving NoC performance compared to the state of the art. In addition, with the test module and fault register integrated in the basic architecture, the routers are able to detect faults dynamically and re-route packets to fault-free and congestion-free zones.
