Impact of Loop Unrolling on Area, Throughput and Clock Frequency for Window Operations Based on a Data Schedule Method

Author(s):  
Yazhuo Dong ◽  
Jie Zhou ◽  
Yong Dou ◽  
Lin Deng ◽  
Jinjing Zhao
2021 ◽  
Vol 29 (2) ◽  
Author(s):  
Panadda Solod ◽  
Nattha Jindapetch ◽  
Kiattisak Sengchuai ◽  
Apidet Booranawong ◽  
Pakpoom Hoyingcharoen ◽  
...  

In this work, we proposed High-Level Synthesis (HLS) optimization processes to improve the speed and the resource usage of complex algorithms, especially nested-loop. The proposed HLS optimization processes are divided into four steps: array sizing is performed to decrease the resource usage on Programmable Logic (PL) part, loop analysis is performed to determine which loop must be loop unrolling or loop pipelining, array partitioning is performed to resolve the bottleneck of loop unrolling and loop pipelining, and HLS interface is performed to select the best block level and port level interface for array argument of RTL design. A case study road lane detection was analyzed and applied with suitable optimization techniques to implement on the Xilinx Zynq-7000 family (Zybo ZC7010-1) which was a low-cost FPGA. From the experimental results, our proposed method reaches 6.66 times faster than the primitive method at clock frequency 100 MHz or about 6 FPS. Although the proposed methods cannot reach the standard real-time (25 FPS), they can instruct HLS developers for speed increasing and resource decreasing on an FPGA.


2020 ◽  
pp. 35-38
Author(s):  
S.I. Donchenko ◽  
I.Y. Blinov ◽  
I.B. Norets ◽  
Y.F. Smirnov ◽  
A.A. Belyaev ◽  
...  

The latest changes in the algorithm for the formation of the international atomic time scale TAI are reported in terms of estimating the weights of the clocks involved in the formation of TAI. Studies of the characteristics of the long-term instability of new-generation hydrogen masers based on processing the results of the clock frequency difference with respect to TAI are performed. It has been confirmed that at present, new-generation hydrogen masers show significantly less long-term instability in comparison with quantum frequency standards ofsimilar and other types.


1997 ◽  
Vol 473 ◽  
Author(s):  
J. A. Davis ◽  
J. D. Meindl

ABSTRACTOpportunities for Gigascale Integration (GSI) are governed by a hierarchy of physical limits. The levels of this hierarchy have been codified as: 1) fundamental, 2) material, 3) device, 4) circuit and 5) system. Many key limits at all levels of the hierarchy can be displayed in the power, P, versus delay, td, plane and the reciprocal length squared, L-2, versus response time, τ, plane. Power, P, is the average power transfer during a binary switching transition and delay, td, is the time required for the transition. Length, L, is the distance traversed by an interconnect that joins two nodes on a chip and response time, τ, characterizes the corresponding interconnect circuit. At the system level of the hierarchy, quantitative definition of both the P versus td and the L-2 versus τ displays requires an estimate of the complete stochastic wiring distribution of a chip.Based on Rent's Rule, a well known empirical relationship between the number of signal input/output terminals on a block of logic and the number of gate circuits with the block, a rigorous derivation of a new complete stochastic wire length distribution for an on-chip random logic network is described. This distribution is compared to actual data for modern microprocessors and to previously described distributions. A methodology for estimating the complete wire length distribution for future GSI products is proposed. The new distribution is then used to enhance the critical path model that determines the maximum clock frequency of a chip; to derive a preliminary power dissipation model for a random logic network; and, to define an optimal architecture of a multilevel interconnect network that minimizes overall chip size. In essence, a new complete stochastic wiring distribution provides a generic basis for maximizing the value obtained from a multilevel interconnect technology.


Author(s):  
Deepika Bansal ◽  
Bal Chand Nagar ◽  
Brahamdeo Prasad Singh ◽  
Ajay Kumar

Background & Objective: In this paper, a modified pseudo domino configuration has been proposed to improve the leakage power consumption and Power Delay Product (PDP) of dynamic logic using Carbon Nanotube MOSFETs (CN-MOSFETs). The simulations for proposed and published domino circuits are verified by using Synopsys HSPICE simulator with 32nm CN-MOSFET technology which is provided by Stanford. Methods: The simulation results of the proposed technique are validated for improvement of wide fan-in domino OR gate as a benchmark circuit at 500 MHz clock frequency. Results: The proposed configuration is suitable for cascading of the high performance wide fan-in circuits without any charge sharing. Conclusion: The performance analysis of 8-input OR gate demonstrate that the proposed circuit provides lower static and dynamic power consumption up to 62 and 40% respectively, and PDP improvement is 60% as compared to standard domino circuit.


2008 ◽  
Vol 43 (7) ◽  
pp. 141-150
Author(s):  
Mounira Bachir ◽  
Sid-Ahmed-Ali Touati ◽  
Albert Cohen

Sign in / Sign up

Export Citation Format

Share Document