scholarly journals A NOVEL VLSI ARCHITECTURE OF HIGH SPEED 1D DISCRETE WAVELET TRANSFORM

Author(s):  
POOJA GUPTA ◽  
Saroj Kumar Lenka

This paper describes an efficient implementation for a multi-level convolution based 1-D DWT hardware architecture for use in FPGAs. The proposed architecture combines some hardware optimization techniques to develop a novel DWT architecture that has high performance and is suitable for portable and high speed devices. The first step towards the hardware implementation of the DWT algorithm was to choose the type of FIR filter block. Firstly we design the high speed linear phase FIR filter using pipelined and parallel arithmetic methods. This proposed filter employs efficiently distributed D-latches and multipliers. Furthermore this filter is used in the proposed DWT architecture. Thus, the new VLSI architecture based on combining of fast FIR filters for reducing the critical path delay and data interleaving technique for lower chip area. We synthesized the final design using Xilinx 9.1i ISE tool. We illustrate that a DWT design using a pipelined linear phase FIR filter coupled with data-interleaving gives the best combination of the performance metrics when compared to other DWT structures.

2020 ◽  
Vol 29 (09) ◽  
pp. 2050151
Author(s):  
Anirban Chakraborty ◽  
Ayan Banerjee

Dedicated hardware for “Discrete Wavelet Transform” (DWT) is at high demand for real-time imaging operations in any standalone electronic devices, as DWT is being extensively utilized for most of the transform-domain imagery applications. Various DWT algorithms exist in the literature facilitating its software implementations which are generally unsuitable for real-time imaging in any stand-alone devices due to their power intensiveness and huge computation time. In this paper, a convolutional DWT-based pipelined and tunable VLSI architecture of Daubechies 9/7 and 5/3 DWT filter is presented. Our proposed architecture, which mingles the advantages of convolutional and lifting DWT while discarding their notable disadvantages, is made area and memory efficient by exploiting “Distributed Arithmetic’ (DA) in our own ingenious way. Almost 90% reduction in the memory size than other notable architectures is reported. In our proposed architecture, both the 9/7 and 5/3 DWT filters can be realized with a selection input, “mode”. With the introduction of DA, pipelining and parallelism are easily incorporated into our proposed 1D/2D DWT architectures. The area requirement and critical path delay are reduced to almost 38.3% and 50% than that of the latest remarkable designs. The performance of the proposed VLSI architecture also excels in real-time applications.


2020 ◽  
Vol 17 (9) ◽  
pp. 4235-4238
Author(s):  
R. Rohini ◽  
N. V. Satya Narayana ◽  
Durgesh Nandan

In audio and video signal processing main element is the FIR filter. This paper presents complete information regarding the FIR filters. It also focuses on the design of FIR filters which provide low-area, energy-delay, low-power consumption, high-speed, low critical path, and low complexity. Implementation of FIR filters with different methods like memory-based VLSI architecture, filters for sampling rate conversion, linear phase FIR filters, optimal hybrid form FIR filters, Nyquist filters, hybrid multiplier less FIR filters, low complexity FIR filters, variable partition hybrid form FIR filters, area efficiency FIR filters are discussed in this paper. The objective of this paper to provide all related information regarding FIR filters at one platform.


Author(s):  
Gundugonti Kishore Kumar ◽  
Balaji Narayanam

In this paper, a modified finite impulse response (FIR) filter design has been proposed for the denoising bio-electrical signals like Electrooculography(EOG). The proposed filter architecture uses modified multiplier block, which is implemented using modified Radix-[Formula: see text] arithmetic-based representation for minimizing the multiple constant multiplication and conventional ripple carry adders are replaced with [Formula: see text] compressors. This proposed architecture is implemented by using Radix-[Formula: see text]-based multiplier and [Formula: see text] compressor architectures for achieving better improvement in the critical path delay. The Radix-[Formula: see text]-based arithmetic bit recording is used in order to reduce the design complexity of the multiplication. The proposed architecture significantly reduced the delay when compared to existing and conventional architectures.


Author(s):  
Charanjit Singh ◽  
Balwinder Singh

In this paper, a new high speed control circuit is proposed which will act as a critical path for the data which will go from input to output to improve the performance of wave pipelining circuits The wave pipelining is a method of high performance circuit designs which implements pipelining in logic without the use of intermediate registers. Wave pipelining has been widely used in the past few years with a great deal of significant features in technology and applications. It has the ability to improve speed, efficiency, economy in every aspect which it presents. Wave pipelining is being used in wide range of applications like digital filters, network routers, multipliers, fast convolvers, MODEMs, image processing, control systems, radars and many others. In previous work, the operating speed of the wave-pipelined circuit can be increased by the following three tasks: adjustment of the clock period, clock skew and equalization of path delays. The path-delay equalization task can be done theoretically, but the real challenge is to accomplish it in the presence of various different delays. So, the main objective of this paper is to solve the path delay equalization problem by inserting the control circuit in wave pipelined based circuit which will act as critical path for the data that moves from input to output. The proposed technique is evaluated for DSP applications by designing 4- tap FIR filter using Distributed arithmetic algorithm (DAA). Then comparison of this design is done with 4-tap FIR filter designs using conventional pipelining and non pipelining. The synthesis and simulation results based on Xilinx ISE Navigator 12.3 shows that wave pipelined DAA based filter is faster by a factor of 1.43 compared to non pipelined one and the conventional pipelined filter is faster than non pipelined by factor of 1.61 but at the cost of increased logic utilization by 200 %. So, the wave-pipelined DA filters designed with the proposed control circuit can operate at higher frequency than that of non-pipelined but less than that of pipelined. The gain in speed in pipelined compared to that of wavepipelined is at the cost of increased area and more dissipated power. When latency is considered, wavepipelined design filters with the proposed scheme are having the lowest latency among three schemes designed.


2013 ◽  
Vol 479-480 ◽  
pp. 508-512
Author(s):  
Chin Fa Hsieh ◽  
Tsung Han Tsai

This paper proposes high-speed VLSI architecture for implementing a forward two-dimensional discrete wavelet transform (2D DWT). The architecture is based on 2D DWT mathematical formulae. A pipelined scheme is used to increase the clock rate, which allows its critical path to take only one adder delay. The proposed design enables 100% hardware use and faster computing than other 2D DWT architecture. It is easily extended to multilevel decomposition because of its regular structure. It requires N/2 by N/2 clock cycles for k-level analysis of an N by N image. The proposed architecture was coded in VerilogHDL and verified on a real time platform which uses a CMOS image sensor, a field-programmable gate array (FPGA) and a TFT-LCD panel. In the simulation, the design worked with a clock period of 132.38MHz. It can be used as an independent IP core for various real-time applications.


2015 ◽  
Vol 2015 ◽  
pp. 1-16 ◽  
Author(s):  
Burhan Khurshid ◽  
Roohie Naaz Mir

Generalized parallel counters (GPCs) are used in constructing high speed compressor trees. Prior work has focused on utilizing the fast carry chain and mapping the logic onto Look-Up Tables (LUTs). This mapping is not optimal in the sense that the LUT fabric is not fully utilized. This results in low efficiency GPCs. In this work, we present a heuristic that efficiently maps the GPC logic onto the LUT fabric. We have used our heuristic on various GPCs and have achieved an improvement in efficiency ranging from 33% to 100% in most of the cases. Experimental results using Xilinx 5th-, 6th-, and 7th-generation FPGAs and Stratix IV and V devices from Altera show a considerable reduction in resources utilization and dynamic power dissipation, for almost the same critical path delay. We have also implemented GPC-based FIR filters on 7th-generation Xilinx FPGAs using our proposed heuristic and compared their performance against conventional implementations. Implementations based on our heuristic show improved performance. Comparisons are also made against filters based on integrated DSP blocks and inherent IP cores from Xilinx. The results show that the proposed heuristic provides performance that is comparable to the structures based on these specialized resources.


VLSI technology become one of the most significant and demandable because of the characteristics like device portability, device size, large amount of features, expenditure, consistency, rapidity and many others. Multipliers and Adders place an important role in various digital systems such as computers, process controllers and signal processors in order to achieve high speed and low power. Two input XOR/XNOR gate and 2:1 multiplexer modules are used to design the Hybrid Full adders. The XOR/XNOR gate is the key punter of power included in the Full adder cell. However this circuit increases the delay, area and critical path delay. Hence, the optimum design of the XOR/XNOR is required to reduce the power consumption of the Full adder Cell. So a 6 New Hybrid Full adder circuits are proposed based on the Novel Full-Swing XOR/XNOR gates and a New Gate Diffusion Input (GDI) design of Full adder with high-swing outputs. The speed, power consumption, power delay product and driving capability are the merits of the each proposed circuits. This circuit simulation was carried used cadence virtuoso EDA tool. The simulation results based on the 90nm CMOS process technology model.


2019 ◽  
Vol 28 (09) ◽  
pp. 1950149
Author(s):  
Bahram Rashidi ◽  
Mohammad Abedini

This paper presents efficient lightweight hardware implementations of the complete point multiplication on binary Edwards curves (BECs). The implementations are based on general and special cases of binary Edwards curves. The complete differential addition formulas have the cost of [Formula: see text] and [Formula: see text] for general and special cases of BECs, respectively, where [Formula: see text] and [Formula: see text] denote the costs of a field multiplication, a field squaring and a field multiplication by a constant, respectively. In the general case of BECs, the structure is implemented based on 3 concurrent multipliers. Also in the special case of BECs, two structures by employing 3 and 2 field multipliers are proposed for achieving the highest degree of parallelization and utilization of resources, respectively. The field multipliers are implemented based on the proposed efficient digit–digit polynomial basis multiplier. Two input operands of the multiplier proceed in digit level. This property leads to reduce hardware consumption and critical path delay. Also, in the structure, based on the change of input digit size from low digit size to high digit size the number of clock cycles and input words are different. Therefore, the multiplier can be flexible for different cryptographic considerations such as low-area and high-speed implementations. The point multiplication computation requires field inversion, therefore, we use a low-cost Extended Euclidean Algorithm (EEA) based inversion for implementation of this field operation. Implementation results of the proposed architectures based on Virtex-5 XC5VLX110 FPGA for two fields [Formula: see text] and [Formula: see text] are achieved. The results show improvements in terms of area and efficiency for the proposed structures compared to previous works.


Sign in / Sign up

Export Citation Format

Share Document