scholarly journals Moving Learning Machine towards Fast Real-Time Applications: A High-Speed FPGA-Based Implementation of the OS-ELM Training Algorithm

Electronics ◽  
2018 ◽  
Vol 7 (11) ◽  
pp. 308 ◽  
Author(s):  
Jose V.  Frances-Villora ◽  
Alfredo Rosado-Muñoz ◽  
Manuel  Bataller-Mompean ◽  
Juan  Barrios-Aviles ◽  
Juan F.  Guerrero-Martinez

Currently, there are some emerging online learning applications handling data streams in real-time. The On-line Sequential Extreme Learning Machine (OS-ELM) has been successfully used in real-time condition prediction applications because of its good generalization performance at an extreme learning speed, but the number of trainings by a second (training frequency) achieved in these continuous learning applications has to be further reduced. This paper proposes a performance-optimized implementation of the OS-ELM training algorithm when it is applied to real-time applications. In this case, the natural way of feeding the training of the neural network is one-by-one, i.e., training the neural network for each new incoming training input vector. Applying this restriction, the computational needs are drastically reduced. An FPGA-based implementation of the tailored OS-ELM algorithm is used to analyze, in a parameterized way, the level of optimization achieved. We observed that the tailored algorithm drastically reduces the number of clock cycles consumed for the training execution up to approximately the 1%. This performance enables high-speed sequential training ratios, such as 14 KHz of sequential training frequency for a 40 hidden neurons SLFN, or 180 Hz of sequential training frequency for a 500 hidden neurons SLFN. In practice, the proposed implementation computes the training almost 100 times faster, or more, than other applications in the bibliography. Besides, clock cycles follows a quadratic complexity O ( N ˜ 2 ) , with N ˜ the number of hidden neurons, and are poorly influenced by the number of input neurons. However, it shows a pronounced sensitivity to data type precision even facing small-size problems, which force to use double floating-point precision data types to avoid finite precision arithmetic effects. In addition, it has been found that distributed memory is the limiting resource and, thus, it can be stated that current FPGA devices can support OS-ELM-based on-chip learning of up to 500 hidden neurons. Concluding, the proposed hardware implementation of the OS-ELM offers great possibilities for on-chip learning in portable systems and real-time applications where frequent and fast training is required.

Author(s):  
David R. Selviah ◽  
Janti Shawash

This chapter celebrates 50 years of first and higher order neural network (HONN) implementations in terms of the physical layout and structure of electronic hardware, which offers high speed, low latency, compact, low cost, low power, mass produced systems. Low latency is essential for practical applications in real time control for which software implementations running on CPUs are too slow. The literature review chapter traces the chronological development of electronic neural networks (ENN) discussing selected papers in detail from analog electronic hardware, through probabilistic RAM, generalizing RAM, custom silicon Very Large Scale Integrated (VLSI) circuit, Neuromorphic chips, pulse stream interconnected neurons to Application Specific Integrated circuits (ASICs) and Zero Instruction Set Chips (ZISCs). Reconfigurable Field Programmable Gate Arrays (FPGAs) are given particular attention as the most recent generation incorporate Digital Signal Processing (DSP) units to provide full System on Chip (SoC) capability offering the possibility of real-time, on-line and on-chip learning.


2014 ◽  
Vol 1061-1062 ◽  
pp. 1186-1189
Author(s):  
Ming Zhe Wei ◽  
Wan Wei Tang

With the rapid development of aerial UAV (Unmanned Aerial Vehicle), the design of real-time data acquisition and transmission system for the video signal has a new applied field. It is different from traditional video acquisition and processing system, aerial video signal has the problems of screen jitter and spatial interference. The processing algorithm of aerial UAV airborne video signal is put forward in the paper, and the platform of high speed procession is constructed based on chip TMS320DM642, and get a good effect.


Micromachines ◽  
2021 ◽  
Vol 12 (10) ◽  
pp. 1196
Author(s):  
Samuel da Silva Oliveira ◽  
Bruno Motta de Carvalho ◽  
Márcio Eduardo Kreutz

Network-on-Chip is a good approach to working on intra-chip communication. Networks with irregular topologies may be better suited for specific applications because of their architectural nature. A good design space exploration can help the design of the network to obtain more optimized topologies. This paper proposes a way of optimizing networks with irregular topologies through the use of a genetic algorithm. The network proposed here has heterogeneous routers that aim to optimize the network and support applications with real-time tasks. The goal is to find networks that are optimized for average latency and percentage of real-time packets delivered within the deadline. The results show that we have been able to find networks that can deliver all the real-time packets, obtain acceptable latency values, and shrink the chip area.


Sign in / Sign up

Export Citation Format

Share Document