AC_ICAP: A Flexible High Speed ICAP Controller

International Journal of Reconfigurable Computing ◽

10.1155/2015/314358 ◽

2015 ◽

Vol 2015 ◽

pp. 1-15 ◽

Cited By ~ 6

Author(s):

Luis Andres Cardona ◽

Carles Ferrer

Keyword(s):

High Speed ◽

Access Port ◽

Field Programmable ◽

Programmable Gate Arrays ◽

Speed Up ◽

Run Time ◽

Ip Cores ◽

An Autonomous, Self-Authenticating, and Self-Contained Secure Boot Process for Field-Programmable Gate Arrays

Cryptography ◽

10.3390/cryptography2030015 ◽

2018 ◽

Vol 2 (3) ◽

pp. 15 ◽

Cited By ~ 5

Author(s):

Don Owen Jr. ◽

Derek Heeger ◽

Calvin Chan ◽

Wenjie Che ◽

Fareena Saqib ◽

...

Keyword(s):

Flash Memory ◽

Ring Oscillator ◽

Access Port ◽

Gate Arrays ◽

Start Up ◽

Non Volatile Memory ◽

Field Programmable ◽

Programmable Gate Arrays ◽

On Chip ◽

Internal Configuration

Secure booting within a field-programmable gate array (FPGA) environment is traditionally implemented using hardwired embedded cryptographic primitives and non-volatile memory (NVM)-based keys, whereby an encrypted bitstream is decrypted as it is loaded from an external storage medium, e.g., Flash memory. A novel technique is proposed in this paper that self-authenticates an unencrypted FPGA configuration bitstream loaded into the FPGA during the start-up. The internal configuration access port (ICAP) interface is accessed to read out configuration information of the unencrypted bitstream, which is then used as input to a secure hash function SHA-3 to generate a digest. In contrast to conventional authentication, where the digest is computed and compared with a second pre-computed value, we use the digest as a challenge to a hardware-embedded delay physical unclonable function (PUF) called HELP. The delays of the paths sensitized by the challenges are used to generate a decryption key using the HELP algorithm. The decryption key is used in the second stage of the boot process to decrypt the operating system (OS) and applications. It follows that any type of malicious tampering with the unencrypted bitstream changes the challenges and the corresponding decryption key, resulting in key regeneration failure. A ring oscillator is used as a clock to make the process autonomous (and unstoppable), and a novel on-chip time-to-digital-converter is used to measure path delays, making the proposed boot process completely self-contained, i.e., implemented entirely within the re-configurable fabric and without utilizing any vendor-specific FPGA features.
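The data flow described in this abstract (configuration readback, then a SHA-3 digest, then a PUF challenge, then a decryption key) can be illustrated with a minimal software sketch. Everything below is an assumption made for illustration: `toy_puf_response` is a hypothetical stand-in that mixes a per-device secret with the challenge using a hash, and it is not the HELP algorithm or a model of path delays. The configuration data is likewise plain bytes rather than an ICAP readback.

```python
import hashlib


def bitstream_digest(config_data: bytes) -> bytes:
    """Hash the configuration data with SHA-3 (256-bit variant chosen here)."""
    return hashlib.sha3_256(config_data).digest()


def toy_puf_response(challenge: bytes, device_secret: bytes) -> bytes:
    """Hypothetical PUF stand-in: NOT the HELP algorithm, just a hash of
    a per-device secret concatenated with the challenge."""
    return hashlib.sha3_256(device_secret + challenge).digest()


def derive_boot_key(config_data: bytes, device_secret: bytes) -> bytes:
    """Use the digest as the challenge, and the response as the key."""
    return toy_puf_response(bitstream_digest(config_data), device_secret)


# Illustrative values only; a real device has no software-visible secret.
device_secret = b"hypothetical-per-device-entropy"
good_config = bytes(range(256)) * 4

# Flip a single bit to imitate tampering with the unencrypted bitstream.
tampered = bytearray(good_config)
tampered[10] ^= 0x01

key_good = derive_boot_key(good_config, device_secret)
key_again = derive_boot_key(good_config, device_secret)
key_bad = derive_boot_key(bytes(tampered), device_secret)

print(key_good == key_again)  # same configuration regenerates the same key
print(key_good == key_bad)    # tampered configuration yields a different key
```

The point of the sketch is only that no stored reference digest is compared: a modified configuration changes the challenge, so the key needed for the second boot stage is never regenerated.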


VR-ZYCAP: A Versatile Resourse-Level ICAP Controller for ZYNQ SOC

Electronics ◽

10.3390/electronics10080899 ◽

2021 ◽

Vol 10 (8) ◽

pp. 899

Author(s):

Bushra Sultana ◽

Anees Ullah ◽

Arsalan Ali Malik ◽

Ali Zahir ◽

Pedro Reviriego ◽

...

Keyword(s):

State Of The Art ◽

Fold Increase ◽

Programmable Logic ◽

Single Chip ◽

Access Port ◽

Fine Grain ◽

Art Works ◽

Run Time ◽

Internal Configuration ◽

Run Time Reconfiguration

Hybrid architectures integrating a processor with an SRAM-based FPGA fabric—for example, Xilinx ZynQ SoC—are increasingly being used as a single-chip solution in several market segments to replace multi-chip designs. These devices not only provide advantages in terms of logic density, cost and integration, but also provide run-time in-field reconfiguration capabilities. However, the current reconfiguration capabilities provided by vendor tools are limited to the module level. Therefore, incremental run-time configuration memory changes require a lengthy compilation time for off-line bitstream generation along with storage and reconfiguration time overheads with traditional vendor methodologies. In this paper, an internal configuration access port (ICAP) controller that provides a versatile fine-grain resource-level incremental reconfiguration of the programmable logic (PL) resources in ZynQ SoC is presented. The proposed controller implemented in PL, called VR-ZyCAP, can reconfigure look-up tables (LUTs) and Flip-Flops (FF). The run-time reconfiguration of FF is achieved through a reset after reconfiguration (RAR)-featured partial bitstream to avoid the unintended state corruption of other memory elements. Along with versatility, our proposed controller improves the reconfiguration time by 30 times for FFs compared to state-of-the-art works while achieving a nearly 400-fold increase in speed for LUTs when compared to vendor-supported software approaches. In addition, it achieves competitive resource utilization when compared to existing approaches.


FPGA-Based Reliable Fault Secure Design for Protection against Single and Multiple Soft Errors

Electronics ◽

10.3390/electronics9122064 ◽

2020 ◽

Vol 9 (12) ◽

pp. 2064

Author(s):

Manar N. Shaker ◽

Ahmed Hussien ◽

Gehad I. Alkady ◽

Hassanein H. Amer ◽

Ihab Adly

Keyword(s):

Error Detection ◽

System Reliability ◽

Fault Tolerant ◽

Dynamic Partial Reconfiguration ◽

Access Port ◽

Gate Arrays ◽

Field Programmable ◽

Programmable Gate Arrays ◽

Modular Redundancy ◽

Internal Configuration

Field programmable gate arrays (FPGAs) are increasingly used in industry (e.g., biomedical, space, and automotive industries). FPGAs are subjected to single, as well as multiple event upsets (SEUs and MEUs), due to the continuous shrinking of transistor dimensions. These upsets inevitably decrease system lifetime. Fault-tolerant techniques are often used to mitigate these problems. In this research, penta and hexa modular redundancy, as well as dynamic partial reconfiguration (DPR), are used to increase system reliability. We show, depending on the relative rates of the SEUs and MEUs, that penta modular redundancy has a higher reliability than hexa modular redundancy, which is a counter-intuitive result in some cases since increasing redundancy is expected to increase reliability. Focusing on penta modular redundancy, an error detection and recovery mechanism (voter) is designed. This mechanism uses the internal configuration access port (ICAP) and its associated controller, as well as DPR to mitigate SEUs and MEUs. Then, it is implemented on Xilinx Vivado tools targeting the Kintex7 7k410tfbg676 device. Finally, we show how to render this design fault secure in the event that SEUs or MEUs affect the voter itself. This fault secure voter either produces the correct output or gives an indication that the output is incorrect.
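As a rough illustration of voting in an N-modular redundant system, the sketch below implements a plain majority voter that also reports when no strict majority exists. It is an assumption-laden software model, not the voter designed in the paper: the function name `majority_vote` and the validity flag are hypothetical, and the sketch does not model upsets in the voter itself or the ICAP-based recovery.

```python
from collections import Counter
from typing import Sequence, Tuple


def majority_vote(outputs: Sequence[int]) -> Tuple[int, bool]:
    """Return (value, valid) for the outputs of redundant modules.

    valid is True only when one value is produced by a strict majority
    of the modules (at least 3 of 5, or at least 4 of 6).
    """
    value, count = Counter(outputs).most_common(1)[0]
    return value, count > len(outputs) // 2


# Penta modular redundancy: two faulty modules are outvoted.
print(majority_vote([7, 7, 7, 3, 9]))

# Three faulty modules: no strict majority, so the result is flagged invalid.
print(majority_vote([7, 7, 3, 3, 9])[1])
```

Returning a validity flag alongside the value mirrors the idea of either producing the correct output or signalling that the output cannot be trusted.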


High Speed Homology Search Using Run-Time Reconfiguration

Lecture Notes in Computer Science - Field-Programmable Logic and Applications: Reconfigurable Computing Is Going Mainstream ◽

10.1007/3-540-46117-5_30 ◽

2002 ◽

pp. 281-291 ◽

Cited By ~ 10

Author(s):

Yoshiki Yamaguchi ◽

Yosuke Miyajima ◽

Tsutomu Maruyama ◽

Akihiko Konagaya

Keyword(s):

High Speed ◽

Homology Search ◽

Run Time ◽

Run Time Reconfiguration


A Primer for Telemetry Interfacing in Accordance with NASA Standards Using Low Cost FPGAs

Journal of Astronomical Instrumentation ◽

10.1142/s225117171640002x ◽

2016 ◽

Vol 05 (01) ◽

pp. 1640002

Author(s):

Jake McCoy ◽

Ted Schultz ◽

James Tutt ◽

Thomas Rogers ◽

Drew Miles ◽

...

Keyword(s):

High Speed ◽

Low Cost ◽

Photon Counting ◽

Sounding Rocket ◽

Photon Counting Detector ◽

Field Programmable ◽

Programmable Gate Arrays ◽

Custom Hardware ◽

Commercial Off The Shelf ◽

Main Component

Photon counting detector systems on sounding rocket payloads often require interfacing asynchronous outputs with a synchronously clocked telemetry (TM) stream. Though this can be handled with an on-board computer, there are several low cost alternatives including custom hardware, microcontrollers and field-programmable gate arrays (FPGAs). This paper outlines how a TM interface (TMIF) for detectors on a sounding rocket with asynchronous parallel digital output can be implemented using low cost FPGAs and minimal custom hardware. Low power consumption and high speed FPGAs are available as commercial off-the-shelf (COTS) products and can be used to develop the main component of the TMIF. Then, only a small amount of additional hardware is required for signal buffering and level translating. This paper also discusses how this system can be tested with a simulated TM chain in the small laboratory setting using FPGAs and COTS specialized data acquisition products.


BPR-TCAM—Block and Partial Reconfiguration based TCAM on Xilinx FPGAs

Electronics ◽

10.3390/electronics9020353 ◽

2020 ◽

Vol 9 (2) ◽

pp. 353 ◽

Cited By ~ 1

Author(s):

Anees Ullah ◽

Ali Zahir ◽

Noaman A. Khan ◽

Waleed Ahmad ◽

Alexis Ramos ◽

...

Keyword(s):

Resource Utilization ◽

High Speed ◽

State Of The Art ◽

Field Programmable Gate Arrays ◽

Partial Reconfiguration ◽

Gate Arrays ◽

Content Addressable Memories ◽

Field Programmable ◽

Programmable Gate Arrays

Ternary Content Addressable Memories (TCAMs) based on Field Programmable Gate Arrays (FPGAs) are widely used in high-speed networking applications. However, TCAMs are not present on state-of-the-art FPGAs and need to be emulated on SRAM-based memories (i.e., LUTRAMs and Block RAMs), which requires a large amount of FPGA resources. In this paper, we present an efficient methodology to implement FPGA-based TCAMs with significant resource savings compared to existing schemes. The proposed methodology exploits the fracturable nature of Look Up Tables (LUTs) and the built-in slice carry-chains for simultaneous mapping of two rules and their matching logic to a single FPGA slice. Multiple slices can be stacked together to build deeper and wider TCAMs in a modular way. The combination of all these techniques results in significant savings in resource utilization compared to existing approaches.
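For readers unfamiliar with ternary matching, the following minimal sketch shows what a TCAM lookup computes in software terms. It says nothing about the LUT and carry-chain mapping proposed in the paper; the `(value, care_mask)` rule encoding and the function name `tcam_lookup` are assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple

# A rule is (value, care_mask). A 1 bit in care_mask means the key must
# match value at that position; a 0 bit is a "don't care" position.
Rule = Tuple[int, int]


def tcam_lookup(rules: List[Rule], key: int) -> Optional[int]:
    """Return the index of the first rule matching key, or None."""
    for index, (value, care_mask) in enumerate(rules):
        if ((key ^ value) & care_mask) == 0:
            return index
    return None


rules = [
    (0b1010, 0b1111),  # exact match on 1010
    (0b1000, 0b1100),  # matches 10xx
    (0b0000, 0b0000),  # matches anything (all bits "don't care")
]

print(tcam_lookup(rules, 0b1010))  # 0
print(tcam_lookup(rules, 0b1011))  # 1
print(tcam_lookup(rules, 0b0111))  # 2
```

A hardware TCAM evaluates all rules in parallel and resolves priority afterwards; the sequential loop here only reproduces the same first-match result.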


Nonvolatile Nanoelectromechanical Memory Switches for Low-Power and High-Speed Field-Programmable Gate Arrays

IEEE Transactions on Electron Devices ◽

10.1109/ted.2014.2380992 ◽

2015 ◽

Vol 62 (2) ◽

pp. 673-679 ◽

Cited By ~ 14

Author(s):

Yong Jun Kim ◽

Woo Young Choi

Keyword(s):

Low Power ◽

High Speed ◽

Field Programmable Gate Arrays ◽

Gate Arrays ◽

Field Programmable ◽

Programmable Gate Arrays


Implementation of Embedded Floating Point Arithmetic Units on FPGA

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.550.126 ◽

2014 ◽

Vol 550 ◽

pp. 126-136

Author(s):

N. Ramya Rani

Keyword(s):

High Speed ◽

High Performance ◽

Floating Point ◽

Double Precision ◽

Embedded Computing ◽

Floating Point Arithmetic ◽

Field Programmable ◽

Programmable Gate Arrays ◽

Arithmetic Units ◽

Point Arithmetic

Floating point arithmetic plays a major role in scientific and embedded computing applications. However, the performance of field programmable gate arrays (FPGAs) in floating point applications is poor because of the complexity of floating point arithmetic. Implementing floating point units on FPGAs consumes a large amount of resources, which has led to the development of embedded floating point units in FPGAs. Embedded applications such as multimedia, communication and DSP algorithms use floating point arithmetic in processing graphics, Fourier transformation, coding, etc. In this paper, methodologies are presented for the implementation of embedded floating point units on FPGA. The work aims to achieve high computation speed and to reduce the power consumed in evaluating expressions. An application that demands high performance floating point computation can achieve better speed and density by incorporating embedded floating point units. Additionally, this paper describes a comparative study of the design of single precision and double precision pipelined floating point arithmetic units for evaluating expressions. The modules are designed using VHDL simulation in Xilinx software and implemented on VIRTEX and SPARTAN FPGAs.


Hardware Acceleration of Sparse Support Vector Machines for Edge Computing

Elektronika ir Elektrotechnika ◽

10.5755/j01.eie.26.3.25796 ◽

2020 ◽

Vol 26 (3) ◽

pp. 42-53

Author(s):

Vuk Vranjkovic ◽

Rastislav Struharik

Keyword(s):

Support Vector Machines ◽

Hardware Acceleration ◽

Edge Computing ◽

Support Vector ◽

Vector Machines ◽

Field Programmable ◽

Programmable Gate Arrays ◽

Speed Up ◽

Memory Reduction ◽

Systems With Memory

In this paper, a hardware accelerator for sparse support vector machines (SVM) is proposed. We believe that the proposed accelerator is the first accelerator of this kind. The accelerator is designed for use in field programmable gate array (FPGA) systems. Additionally, a novel algorithm for the pruning of SVM models is developed. The pruned SVM model has a smaller memory footprint and can be processed faster compared to dense SVM models. In systems with memory throughput, compute or power constraints, such as edge computing, this can be a big advantage. Experiments on several standard datasets are conducted, whose aim is to compare the efficiency of the proposed architecture and the developed algorithm to the existing solutions. The results of the experiments reveal that the proposed hardware architecture and SVM pruning algorithm have superior characteristics in comparison to the previous work in the field. A memory reduction from 3% to 85% is achieved, with a speed-up in a range from 1.17 to 7.92.


A Comparison of Filtering Approaches Using Low-Speed DACs for Hardware-in-the-Loop Implemented in FPGAs

Electronics ◽

10.3390/electronics8101116 ◽

2019 ◽

Vol 8 (10) ◽

pp. 1116 ◽

Cited By ~ 4

Author(s):

Yushkova ◽

Sanchez ◽

de Castro ◽

Martínez-García

Keyword(s):

High Speed ◽

Hardware In The Loop ◽

Low Speed ◽

Digital To Analog Converters ◽

Gate Arrays ◽

Simulation Techniques ◽

Input Signals ◽

Field Programmable ◽

Programmable Gate Arrays ◽

Input Waveform

The use of Hardware-in-the-Loop (HIL) systems implemented in Field Programmable Gate Arrays (FPGAs) is constantly increasing because of their advantages compared to traditional simulation techniques. This increase in usage has caused new challenges related to the improvement of their performance and features like the number of output channels, while the price of HIL systems is diminishing. At present, the use of low-speed Digital-to-Analog Converters (DACs) is starting to be a commercial possibility for two reasons. One is their lower price and the other is their lower pin count, which determines the number and price of the FPGAs that are necessary to handle those DACs. This paper compares four filtering approaches for providing suitable data to low-speed DACs, which help to filter high-speed input signals, removing the need for expensive high-speed DACs and therefore decreasing the total cost of HIL implementations. Results show that the selection of the appropriate filter should be based on the type of the input waveform and the relative importance of the dynamics versus the area.
