Design of Synthesizable, Retimed Digital Filters Using FPGA Based Path Solvers with MCM Approach: Comparison and CAD Tool

VLSI Design ◽  
2014 ◽  
Vol 2014 ◽  
pp. 1-18 ◽  
Author(s):  
Deepa Yagain ◽  
A. Vijaya Krishna

Retiming is a transformation that can be applied to digital filter blocks to increase the clock frequency. It requires computing the critical path and the shortest path at various stages. The literature addresses retiming at several points, but little attention has been given to the path solver blocks in the retiming algorithm, which take up most of the computation time. In this paper, we address the problem of optimizing the speed of path solvers in the retiming transformation by introducing high-level synthesis of path solver algorithm architectures on FPGA, together with a computer-aided design tool. Filters have adders, multipliers, and delay elements as their building blocks. Avoiding costly multipliers is highly desirable in filter hardware implementations, and this can be achieved efficiently with the multiplierless multiple constant multiplication (MCM) technique. In the present work, retiming, which is a high-level synthesis optimization method, is combined with multiplierless filter implementations using the MCM algorithm. Retiming the multiplierless designs is seen to give better performance in terms of operating frequency. This paper also compares various retiming techniques for multiplierless digital filter design with respect to VLSI performance metrics such as area, speed, and power.
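The multiplierless idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's MCM algorithm (which shares intermediate terms across several constants); it only shows how a single constant multiplication can be replaced by shifts and additions or subtractions using canonical signed digit (CSD) recoding. The function names `csd_digits` and `shift_add_multiply` are hypothetical, chosen for this illustration.

```python
def csd_digits(c: int) -> list[int]:
    """Return the CSD digits of a non-negative integer, LSB first.

    Each digit is -1, 0, or +1, and no two adjacent digits are non-zero.
    """
    digits = []
    while c != 0:
        if c & 1:
            # c mod 4 == 1 gives +1, c mod 4 == 3 gives -1
            d = 2 - (c & 3)
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits


def shift_add_multiply(x: int, c: int) -> int:
    """Multiply x by the constant c using only shifts, adds, and subtracts."""
    acc = 0
    for shift, d in enumerate(csd_digits(c)):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return acc


# 7 = 8 - 1, so 7*x becomes (x << 3) - x: one subtractor instead of a multiplier.
print(csd_digits(7))
print(shift_add_multiply(5, 7))
```

In hardware terms, each non-zero CSD digit beyond the first costs one adder or subtractor, while the shifts are just wiring. An actual MCM solver would go further and reuse partial sums across all filter coefficients.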

2020 ◽  
Vol 29 (11) ◽  
pp. 2050179 ◽  
Author(s):  
Muhammad Rashid ◽  
Malik Imran ◽  
Atif Raza Jafri ◽  
Zahid Mehmood

This work proposes a 4-stage pipelined architecture to achieve an optimized throughput-over-area ratio for point multiplication (PM) computation in binary Huff curve (BHC) cryptography. The original mathematical formulation of BHC is revisited with the objective of reducing the required area. Consequently, a simplified formulation of BHC is obtained with a 43% reduction in hardware resources. Throughput is improved first by reducing the critical path and second by minimizing the number of clock cycles (CCs) required to compute one PM. The critical path is reduced through the placement of pipeline registers, whereas the number of required CCs is minimized through efficient scheduling of computations. These two factors, i.e., the area reduction and the throughput optimizations, maximize the throughput-over-area ratio. The proposed pipelined architecture is implemented over the [Formula: see text] field, using standard NIST curve parameters. The architecture is modeled in Verilog and synthesized using the Xilinx ISE 14.7 design tool on a Virtex 7 FPGA. The implementation results show a 17% improvement in clock frequency, a 13% reduction in the time required to compute one PM, and a 2.6% improvement in throughput/area when compared with the most recent state-of-the-art solutions.


2004 ◽  
Vol 5 (2) ◽  
pp. 86-94 ◽  
Author(s):  
Y. Song ◽  
J. S. M. Vergeest ◽  
W. F. Bronsvoort

Finding effective and efficient tools for complex freeform shape modification continues to be a challenging problem in computer graphics and computer-aided design. Although current approaches give reasonable results, their computation time and complexity often prevent their further development in more complex cases, especially when reusing an existing design. In this paper, deformable freeform feature templates are introduced for better control of existing freeform shapes. Because the templates have only a small number of intrinsic parameters, a given freeform shape can be quickly approximated by one of them. The deformable templates are further developed to track and match complex freeform shapes, resulting in extendable templates. Mappings associate the original shape with the approximated template, so further shape manipulations can be conducted effectively using high-level intrinsic shape parameters. Experiments were carried out to verify the proposed algorithms. The paper also describes how the matching and manipulation techniques can be applied in computer graphics and computer-aided design applications.


2021 ◽  
Vol 29 (2) ◽  
Author(s):  
Panadda Solod ◽  
Nattha Jindapetch ◽  
Kiattisak Sengchuai ◽  
Apidet Booranawong ◽  
Pakpoom Hoyingcharoen ◽  
...  

In this work, we propose High-Level Synthesis (HLS) optimization processes to improve the speed and resource usage of complex algorithms, especially those with nested loops. The proposed processes are divided into four steps: array sizing, to decrease resource usage on the Programmable Logic (PL) part; loop analysis, to determine which loops should be unrolled or pipelined; array partitioning, to resolve the bottlenecks of loop unrolling and loop pipelining; and HLS interface selection, to choose the best block-level and port-level interfaces for the array arguments of the RTL design. A road lane detection case study was analyzed and implemented with suitable optimization techniques on a low-cost FPGA from the Xilinx Zynq-7000 family (Zybo ZC7010-1). In the experiments, the proposed method is 6.66 times faster than the primitive method at a clock frequency of 100 MHz, or about 6 FPS. Although the proposed methods do not reach standard real-time performance (25 FPS), they can guide HLS developers in increasing speed and decreasing resource usage on an FPGA.


In this paper, an efficient multiply-accumulate (MAC) unit is proposed for implementing a residue number system (RNS) based finite impulse response (FIR) filter. The proposed MAC (PMAC) approach reduces the number of adders in the critical path. In this work, an FIR filter with the PMAC approach is implemented in structural Verilog HDL. The United Microelectronics Corporation 90 nm technology library is used for synthesis. Performance metrics such as area, power, and delay are obtained using Cadence RTL Compiler. The synthesis results show that the RNS filter with PMAC improves clock frequency and reduces delay and area when compared with a conventional MAC (CMAC). The power-delay product (PDP) is also considered when comparing the filters. The PMAC architecture improves PDP by 30.63% compared with CMAC.
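As a rough illustration of the arithmetic behind an RNS-based MAC, the sketch below accumulates products independently in each residue channel and converts back with the Chinese Remainder Theorem. It does not reproduce the PMAC architecture from the abstract; the moduli set `(7, 8, 9)` and all function names are assumptions made for this example, and results are only valid while the true sum stays below the product of the moduli (504 here).

```python
from math import prod

# Hypothetical pairwise-coprime moduli; dynamic range is 7 * 8 * 9 = 504.
MODULI = (7, 8, 9)


def to_rns(x: int) -> tuple[int, ...]:
    """Convert an integer to its residues, one per modulus."""
    return tuple(x % m for m in MODULI)


def rns_mac(acc, a, b):
    """One multiply-accumulate step, done separately in each residue channel."""
    return tuple((r + p * q) % m for r, p, q, m in zip(acc, a, b, MODULI))


def from_rns(residues) -> int:
    """Reconstruct the integer from its residues (Chinese Remainder Theorem)."""
    big_m = prod(MODULI)
    total = 0
    for r, m in zip(residues, MODULI):
        m_i = big_m // m
        total += r * m_i * pow(m_i, -1, m)  # modular inverse, Python 3.8+
    return total % big_m


def rns_fir_output(coeffs, samples) -> int:
    """Compute one FIR output sample, sum(c * s), using RNS MAC steps."""
    acc = to_rns(0)
    for c, s in zip(coeffs, samples):
        acc = rns_mac(acc, to_rns(c), to_rns(s))
    return from_rns(acc)


print(rns_fir_output([1, 2, 3], [4, 5, 6]))
```

Because the channels never exchange carries, each one needs only narrow modular adders and multipliers, which is the usual motivation for RNS arithmetic in FIR filters.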


1997 ◽  
Vol 07 (06) ◽  
pp. 517-535 ◽  
Author(s):  
J. H. Satyanarayana ◽  
B. Nowrouzian

This paper is concerned with the exploitation of genetic algorithms and their application to the development of a new optimization technique for the high-level synthesis of digit-serial digital filter data-paths. In the resulting optimization technique, the cost associated with the final digital filter data-path is minimized subject to user-specified constraints on the number of physical arithmetic functional units employed. The proposed technique is capable of obtaining globally area-optimal, time-optimal, or combined area-cum-time-optimal data-paths, where the optimality takes into account not only the cost associated with the required arithmetic functional units but also that associated with the required support cells (multiplexors and registers). This optimization is made computationally effective by encoding the digital filter data flow-graph into chromosomes which preserve the data-dependency relationships in the original digital filter signal flow-graph under the operations of crossover and mutation by the underlying genetic algorithm. The usefulness of the proposed technique is demonstrated by applying it to the constrained optimization of a benchmark elliptic wave digital filter for full bit-serial, full bit-parallel, as well as general digit-serial high-level synthesis. The results thus obtained are compared to those of the existing techniques (whenever appropriate) to confirm the validity of the technique.

