Area Optimisation for Field-Programmable Gate Arrays in SystemC Hardware Compilation

2008 ◽  
Vol 2008 ◽  
pp. 1-14
Author(s):  
Johan Ditmar ◽  
Steve McKeever ◽  
Alex Wilson

This paper discusses a pair of synthesis algorithms that optimise a SystemC design to minimise area when targeting FPGAs. Each can significantly improve the synthesis of a high-level language construct, thus allowing a designer to concentrate more on an algorithm description and less on hardware-specific implementation details. The first algorithm is a source-level transformation implementing function exlining, in which a separate block of hardware implements a function and is shared between multiple calls to the function. The second is a novel algorithm for mapping arrays to memories, which involves assigning array accesses to memory ports such that no port is ever accessed more than once in a clock cycle. This algorithm assigns accesses to read-only or write-only ports and to read-write ports concurrently, solving the assignment problem more efficiently and for a wider range of memories than existing methods. Both optimisations operate on a high-level program representation and have been implemented in a commercial SystemC compiler. Experiments show that in suitable circumstances these techniques result in significant reductions in logic utilisation for FPGAs.
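The port-assignment constraint described above (no port accessed more than once in a clock cycle) can be illustrated with a small backtracking search. This is only a sketch under stated assumptions, not the algorithm from the paper: the `Access` record, the `assign_ports` function, the port-capability strings, and the idea that each access is statically bound to one port are all hypothetical choices made for illustration.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Optional, Set


class Access(NamedTuple):
    """One array access (hypothetical record, not from the paper)."""
    name: str
    kind: str                # "r" for a read, "w" for a write
    cycles: FrozenSet[int]   # clock cycles in which the access may occur


def assign_ports(accesses: List[Access],
                 ports: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Bind every access to one memory port.

    `ports` maps a port name to its capability: "r", "w" or "rw".
    Returns a mapping access name -> port name, or None when no
    assignment keeps every port to at most one access per cycle.
    """
    # Try the accesses that are active in the most cycles first.
    order = sorted(accesses, key=lambda a: -len(a.cycles))
    busy: Dict[str, Set[int]] = {p: set() for p in ports}
    result: Dict[str, str] = {}

    def solve(i: int) -> bool:
        if i == len(order):
            return True
        acc = order[i]
        for port, caps in ports.items():
            # The port must support this kind of access and be free
            # in every cycle in which the access may occur.
            if acc.kind in caps and not (busy[port] & acc.cycles):
                busy[port] |= acc.cycles
                result[acc.name] = port
                if solve(i + 1):
                    return True
                # Undo the tentative binding and try the next port.
                busy[port] -= acc.cycles
                del result[acc.name]
        return False

    return dict(result) if solve(0) else None


if __name__ == "__main__":
    memory = {"A": "rw", "B": "r"}   # one read-write port, one read-only port
    program = [
        Access("x", "r", frozenset({0})),
        Access("y", "w", frozenset({0})),
        Access("z", "r", frozenset({1})),
    ]
    print(assign_ports(program, memory))
```

Exhaustive backtracking is exponential in the worst case; it is used here only because it makes the constraint easy to see, not because it reflects the efficiency claims in the abstract.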

2005 ◽  
Vol 14 (02) ◽  
pp. 347-366 ◽  
Author(s):  
HAIDAR M. HARMANANI ◽  
RONY SALIBA

This paper presents an evolutionary algorithm to solve the datapath allocation problem in high-level synthesis. The method performs allocation of functional units, registers, and multiplexers in addition to controller synthesis with the objective of minimizing the cost of hardware resources. The system handles multicycle functional units as well as structural pipelining. The proposed method was implemented using C++ on a Linux workstation. We tested our method on a set of high-level synthesis benchmarks, all yielding good solutions in a short time. An integration path to Field Programmable Gate Arrays (FPGAs) is provided through VHDL.


Author(s):  
B. Naresh Kumar Reddy ◽  
N. Suresh ◽  
J.V.N. Ramesh

Programming Field Programmable Gate Arrays (FPGAs) has long been the domain of engineers with VHDL or Verilog expertise. FPGAs have caught the attention of algorithm developers and communication researchers, who want to use them to instantiate systems or implement DSP algorithms. These efforts, however, are often stifled by the complexities of programming FPGAs. RTL programming in either VHDL or Verilog generally does not offer the high level of abstraction needed to represent signal flow graphs and complex signal processing algorithms. This paper describes how to program FPGAs with a graphical language rather than Verilog or VHDL, using LabVIEW and the features of the LabVIEW FPGA environment.


2019 ◽  
Vol 08 (03) ◽  
pp. 1950008 ◽  
Author(s):  
Haomiao Wang ◽  
Prabu Thiagaraj ◽  
Oliver Sinnen

Field-Programmable Gate Arrays (FPGAs) are widely used as hardware accelerators in the central signal processing design of the Square Kilometer Array (SKA). The frequency domain acceleration search (FDAS) module is an important part of the SKA1-MID pulsar search engine. To develop for yet-to-be-finalized hardware, to support cross-discipline interoperability, and to achieve fast prototyping, OpenCL is employed as a high-level FPGA synthesis approach to create the sub-modules of FDAS. The FT convolution, the harmonic summing, and some other minor sub-modules are elements of the FDAS module that have been well optimized separately before. In this paper, we explore the design space of combining well-optimized designs, dealing with the ensuing need to trade off and compromise. Pipeline computing is employed to handle multiple input arrays at high speed. The hardware target is to employ multiple high-end FPGAs to process the combined FDAS module. The results show interesting consequences: the best individual solutions are not necessarily the best solutions for the speed of a pipeline in which FPGA resources and memory bandwidth need to be shared. By proposing multiple buffering techniques for the pipeline, the combined FDAS module can achieve up to 2× speedup over implementations without pipeline computing. We perform an extensive experimental evaluation on multiple high-end FPGA cards hosted in a workstation and compare against a technology-comparable mid-range GPU.


Electronics ◽  
2019 ◽  
Vol 8 (5) ◽  
pp. 584 ◽  
Author(s):  
Muhammad Irfan ◽  
Zahid Ullah ◽  
Ray C. C. Cheung

Content-addressable memory (CAM) is a type of associative memory that returns the address of a given search input in one clock cycle. Many designs are available to emulate CAM functionality inside reconfigurable hardware, namely field-programmable gate arrays (FPGAs), using static random-access memory (SRAM) and flip-flops. FPGA-based CAMs are becoming popular due to the rapid growth of software-defined networks (SDNs), which use CAM for packet classification. Emulated CAM designs consume considerable dynamic power owing to the high amount of switching activity and computation involved in finding the address of the search key. In this paper, we present a power- and resource-efficient binary CAM architecture, Zi-CAM, which consumes less power and uses fewer resources than the available SRAM-based CAM architectures on FPGAs. Zi-CAM consists of two main blocks. The RAM block (RB) is activated when there is a sequence of repeating zeros in the input search word; otherwise, the lookup table (LUT) block (LB) is activated. Zi-CAM is implemented on a Xilinx Virtex-6 FPGA for a size of 64 × 36, improving power consumption and hardware cost by 30% and 32%, respectively, compared to the available FPGA-based CAMs.
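The basic behaviour of a binary CAM (give it a word, get back the address where that word is stored) can be modelled in a few lines. The sketch below is a purely behavioural illustration: the `BinaryCAM` class and its method names are hypothetical, the lowest-address-wins rule for duplicate entries is an assumption, and Zi-CAM's split into a RAM block and a LUT block is not modelled because the abstract does not give enough detail.

```python
from typing import List, Optional


class BinaryCAM:
    """Behavioural model of a binary CAM with `depth` entries of `width` bits."""

    def __init__(self, depth: int, width: int) -> None:
        self.depth = depth
        self.width = width
        # None marks an entry that has never been written.
        self.cells: List[Optional[int]] = [None] * depth

    def write(self, address: int, word: int) -> None:
        """Store `word` at `address`, as in an ordinary RAM write."""
        if not 0 <= address < self.depth:
            raise IndexError("address out of range")
        if not 0 <= word < (1 << self.width):
            raise ValueError("word does not fit in the CAM width")
        self.cells[address] = word

    def search(self, key: int) -> Optional[int]:
        """Return the address holding `key`, or None if it is absent.

        Hardware compares all entries in parallel; this loop only models
        the result. Assumption: the lowest matching address wins.
        """
        for address, word in enumerate(self.cells):
            if word == key:
                return address
        return None


if __name__ == "__main__":
    cam = BinaryCAM(depth=64, width=36)   # same size as quoted in the abstract
    cam.write(5, 0xABCDEF123)
    print(cam.search(0xABCDEF123))
    print(cam.search(0x1))
```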


Author(s):  
Naresh Kumar Reddy ◽  
N. Suresh

Programming Field Programmable Gate Arrays (FPGAs) has long been the domain of engineers with VHDL or Verilog expertise. FPGAs have caught the attention of algorithm developers and communication researchers, who want to use them to instantiate systems or implement DSP algorithms. These efforts, however, are often stifled by the complexities of programming FPGAs. RTL programming in either VHDL or Verilog generally does not offer the high level of abstraction needed to represent signal flow graphs and complex signal processing algorithms. This paper describes how to program FPGAs with a graphical language rather than Verilog or VHDL, using LabVIEW and the features of the LabVIEW FPGA environment.


Electronics ◽  
2021 ◽  
Vol 10 (8) ◽  
pp. 926
Author(s):  
Elyas Zamiri ◽  
Alberto Sanchez ◽  
Marina Yushkova ◽  
Maria Sofia Martínez-García ◽  
Angel de Castro

This paper aims to compare different design alternatives of hardware-in-the-loop (HIL) for emulating power converters in Field Programmable Gate Arrays (FPGAs). It proposes various numerical formats (fixed and floating-point) and different approaches (pure VHSIC Hardware Description Language (VHDL), Intellectual Properties (IPs), automated MATLAB HDL code, and High-Level Synthesis (HLS)) to design power converters. Although the proposed models are simple power electronics HIL systems, the idea can be extended to any HIL system. This study compares the design effort of different coding methods and numerical formats considering possible synthesis tools (Precision and Vivado), and it comprises an analytical discussion in terms of area and speed. The different models are synthesized as ad-hoc modules in general-purpose FPGAs, but also using the NI myRIO device as an example of a commercial tool capable of implementing HIL models. The comparison confirms that the optimum design alternative must be chosen based on the application (complexity, frequency, etc.) and designers’ constraints, such as available area, coding expertise, and design effort.


Circuit World ◽  
2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Zeynep Kaya ◽  
Erol Seke

Purpose: This paper aims to present a single-block memory-based FFT processor design with a conflict-free addressing scheme for field-programmable gate arrays (FPGAs) with dual-port block memories. This study aims for a single-block dual-port memory-based N-point radix-2 FFT design that uses the minimum number of memory locations and clock cycles. Design/methodology/approach: A new memory-based Fast Fourier Transform (FFT) design that uses a dual-port memory block is proposed. Dual-port memory allows the design to perform two memory reads and writes in a single clock cycle. This approach achieves a low operational clock count and the smallest memory simultaneously, excluding some small overhead for exceptional address changes. The methodology is to read from a memory location while writing to it, eliminating the need for excess memory and additional clock cycles. Findings: With the minimum memory size and the simplest architecture, radix-2 FFT and a single memory block are used. The number of clock pulses spent for all FFT operations does not provide much advantage for low-point FFT operations but is important for high-point FFT operations. With the developed algorithm, N memory locations are used, and the number of clock pulses spent for all FFT stages is (N/2 + 1) log2 N. Originality/value: This is an original paper, which has not simultaneously, in whole or in part, been submitted anywhere else.
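To get a feel for the clock count quoted in the abstract, the expression (N/2 + 1) log2 N can be evaluated for a few transform sizes. The helper below is a minimal sketch: the function name `fft_clock_cycles` is hypothetical, and it assumes N is a power of two, as a radix-2 FFT requires.

```python
def fft_clock_cycles(n: int) -> int:
    """Evaluate (N/2 + 1) * log2(N), the clock count quoted in the abstract."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    stages = n.bit_length() - 1       # log2(n) for a power of two
    return (n // 2 + 1) * stages


if __name__ == "__main__":
    for n in (8, 64, 1024):
        print(n, fft_clock_cycles(n))
```

For N = 8 this gives (4 + 1) × 3 = 15 clock pulses, and for N = 1024 it gives (512 + 1) × 10 = 5130.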


2017 ◽  
Vol 2017 ◽  
pp. 1-17 ◽  
Author(s):  
David Wilson ◽  
Aniruddha Shastri ◽  
Greg Stitt

Computing systems with field-programmable gate arrays (FPGAs) often achieve fault tolerance in high-energy radiation environments via triple-modular redundancy (TMR) and configuration scrubbing. Although effective, TMR suffers from a 3x area overhead, which can be prohibitive for many embedded usage scenarios. Furthermore, this overhead is often worsened because TMR often has to be applied to existing register-transfer-level (RTL) code that designers created without considering the triplicated resource requirements. Although a designer could redesign the RTL code to reduce resources, modifying RTL schedules and resource allocations is a time-consuming and error-prone process. In this paper, we present a more transparent high-level synthesis approach that uses scheduling and binding to provide attractive tradeoffs between area, performance, and redundancy, while focusing on FPGA implementation considerations, such as resource realization costs, to produce more efficient architectures. Compared to TMR applied to existing RTL, our approach shows resource savings up to 80% with average resource savings of 34% and an average clock degradation of 6%. Compared to the previous approach, our approach shows resource savings up to 74% with average resource savings of 19% and an average heuristic execution time improvement of 96x.
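The triple-modular redundancy that the abstract takes as its baseline relies on a majority voter over three replica outputs. The sketch below shows only that generic bitwise voter, not the scheduling and binding approach the paper proposes; the function name `majority_vote` and the example values are illustrative assumptions.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority of three replica outputs.

    Each result bit is 1 exactly when at least two of the inputs have
    that bit set, so a fault confined to one replica is masked.
    """
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    correct = 0b1010
    upset = correct ^ 0b1000          # one replica suffers a single bit flip
    print(bin(majority_vote(correct, correct, upset)))
```

The voter masks any fault that affects a single replica per bit position, which is why TMR is usually paired with configuration scrubbing to repair the faulty replica before a second one fails.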


2021 ◽  
Author(s):  
Gurwinder Singh ◽  
Munish Rattan ◽  
Gurjot Kaur Walia

The combination of shrinking chip sizes and a growing number of circuits per chip has significantly increased battery consumption and made energy efficiency critical, driving growth in the emerging low-power electronics sector. This paper focuses on optimizing power by eliminating cascading in block RAM, which dominates the power dissipated in systems on chip (SoCs). High-level synthesis (HLS) allows hardware designers to think at a logical level without worrying about low-level, cycle-by-cycle details. It makes it possible to explore the design space quickly and to trade off resource utilization against performance. Field Programmable Gate Arrays (FPGAs) have made significant progress in speed and capacity, making them a platform for implementing digital circuits. FPGA design relies on synthesis tools that apply various minimization and optimization strategies. These tools take the RTL representation of a design together with timing constraints and generate a corresponding netlist. Today, the Xilinx Vivado Design Suite is used as the synthesis tool for FPGA design. In some cases, Xilinx Vivado is unable to meet the designer's delay and power constraints. The primary goal of this paper is therefore to optimize power under design constraints in the Xilinx Vivado software.

