Optimising Hardware Accelerated Neural Networks with Quantisation and a Knowledge Distillation Evolutionary Algorithm

Electronics ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 396
Author(s):  
Robert Stewart ◽  
Andrew Nowlan ◽  
Pascal Bacchus ◽  
Quentin Ducasse ◽  
Ekaterina Komendantskaya

This paper compares the latency, accuracy, training time and hardware costs of neural networks compressed with our new multi-objective evolutionary algorithm called NEMOKD, and with quantisation. We evaluate NEMOKD on Intel’s Movidius Myriad X VPU processor, and quantisation on Xilinx’s programmable Z7020 FPGA hardware. Evolving models with NEMOKD increases inference accuracy by up to 82% at the cost of 38% increased latency, with throughput performance of 100–590 image frames-per-second (FPS). Quantisation identifies a sweet spot of 3-bit precision in the trade-off between latency, hardware requirements, training time and accuracy. Parallelising FPGA implementations of 2- and 3-bit quantised neural networks increases throughput from 6 k FPS to 373 k FPS, a 62× speedup.
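As a rough illustration of what low-bit quantisation does to a weight vector, the sketch below maps floats onto `2**bits` evenly spaced levels and re-derives the reported throughput ratio. The function name `quantise_uniform`, the sample weights, and the uniform min-max scheme are assumptions made for illustration; the abstract does not describe the quantiser actually used on the FPGA.

```python
def quantise_uniform(weights, bits):
    """Snap each weight to the nearest of 2**bits evenly spaced levels.

    Generic uniform quantiser (an assumption, not the paper's method).
    """
    if bits < 1:
        raise ValueError("bits must be at least 1")
    lo, hi = min(weights), max(weights)
    if hi == lo:
        # All weights are identical, so there is nothing to quantise.
        return list(weights)
    step = (hi - lo) / (2 ** bits - 1)
    return [lo + round((w - lo) / step) * step for w in weights]


# Hypothetical weights, chosen only to exercise the function.
weights = [-1.0, -0.4, 0.1, 0.35, 1.0]
q3 = quantise_uniform(weights, 3)  # at most 8 distinct values
q2 = quantise_uniform(weights, 2)  # at most 4 distinct values

# Throughput figures quoted in the abstract: 6 k FPS to 373 k FPS.
speedup = 373_000 / 6_000
print(q3, q2, round(speedup))
```

Fewer bits mean fewer representable levels and a coarser approximation of each weight, which is the accuracy side of the trade-off the abstract describes.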

2019 ◽  
Vol 31 (2) ◽  
Author(s):  
Sean Pennefather ◽  
Karen Bradshaw ◽  
Barry Irwin

We present the design and implementation of an indirect messaging extension for the existing NFComms framework that provides communication between a network flow processor and host CPU. This extension addresses the bulk throughput limitations of the framework and is intended to work in conjunction with existing communication mediums. Testing of the framework extensions shows an increase in throughput performance of up to 268x that of the current direct message passing framework at the cost of increased single message latency of up to 2x. This trade-off is considered acceptable as the proposed extensions are intended for bulk data transfer only while the existing message passing functionality of the framework is preserved and can be used in situations where low latency is required for small messages.


Author(s):  
Xiao Wang ◽  
Siyue Wang ◽  
Pin-Yu Chen ◽  
Yanzhi Wang ◽  
Brian Kulis ◽  
...  

Despite achieving remarkable success in various domains, recent studies have uncovered the vulnerability of deep neural networks to adversarial perturbations, creating concerns on model generalizability and new threats such as prediction-evasive misclassification or stealthy reprogramming. Among different defense proposals, stochastic network defenses such as random neuron activation pruning or random perturbation to layer inputs are shown to be promising for attack mitigation. However, one critical drawback of current defenses is that the robustness enhancement is at the cost of noticeable performance degradation on legitimate data, e.g., large drop in test accuracy. This paper is motivated by pursuing a better trade-off between adversarial robustness and test accuracy for stochastic network defenses. We propose Defense Efficiency Score (DES), a comprehensive metric that measures the gain in unsuccessful attack attempts at the cost of drop in test accuracy of any defense. To achieve a better DES, we propose hierarchical random switching (HRS), which protects neural networks through a novel randomization scheme. An HRS-protected model contains several blocks of randomly switching channels to prevent adversaries from exploiting fixed model structures and parameters for their malicious purposes. Extensive experiments show that HRS is superior in defending against state-of-the-art white-box and adaptive adversarial misclassification attacks. We also demonstrate the effectiveness of HRS in defending adversarial reprogramming, which is the first defense against adversarial programs. Moreover, in most settings the average DES of HRS is at least 5X higher than current stochastic network defenses, validating its significantly improved robustness-accuracy trade-off.


2020 ◽  
Vol 4 (02) ◽  
pp. 34-45
Author(s):  
Naufal Dzikri Afifi ◽  
Ika Arum Puspita ◽  
Mohammad Deni Akbar

Shift to The Front II Komplek Sukamukti Banjaran is a project carried out by a telecommunications company. Like every project, it has a time limit specified in the contract. Project scheduling plays an important role in predicting both the cost and the duration of a project, and every project should be completed on or before the date specified in the contract. Delays can be anticipated by shortening the completion time with the crashing method, applied here through linear programming. Without linear programming, the crashing iterations would have to be repeated. The objective function in this scheduling is to minimize the cost. This study aims to find a trade-off between the cost and the minimum time expected to complete the project. Acceleration was evaluated by adding 4 hours, 3 hours, 2 hours, and 1 hour of overtime work. The normal duration of the project is 35 days with a service fee of Rp. 52,335,690. Based on the crashing analysis, the alternative chosen is to add 1 hour of overtime, shortening the project to 34 days at a total service cost of Rp. 52,375,492. This acceleration affects the entire project, because Shift to The Front II covers 33 different locations; if all of these locations can be accelerated, the duration of the entire project will be shortened effectively.
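The figures quoted in this abstract are enough to work out what the chosen crashing alternative costs per day saved. The short sketch below does only that arithmetic; the variable names are illustrative, and it does not reproduce the study's linear programming model, whose activity-level data is not given here.

```python
# Figures taken from the abstract.
NORMAL_DAYS = 35
NORMAL_COST_RP = 52_335_690
CRASHED_DAYS = 34
CRASHED_COST_RP = 52_375_492

# Time gained and money spent by adding 1 hour of overtime.
days_saved = NORMAL_DAYS - CRASHED_DAYS
extra_cost_rp = CRASHED_COST_RP - NORMAL_COST_RP

# Cost slope: additional cost per day of schedule reduction.
cost_slope_rp_per_day = extra_cost_rp / days_saved

# Relative increase over the normal service fee.
relative_increase = extra_cost_rp / NORMAL_COST_RP

print(f"Days saved: {days_saved}")
print(f"Extra cost: Rp. {extra_cost_rp:,}")
print(f"Relative increase: {relative_increase:.4%}")
```

In a full crashing analysis, a cost slope like this would be computed per activity and the linear program would choose which activities to shorten.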


Author(s):  
K. Maystrenko ◽  
A. Budilov ◽  
D. Afanasev

Goal. Identify trends and prospects for the development of radar in terms of the use of convolutional neural networks for target detection. Materials and methods. Analysis of relevant printed materials related to the subject areas of radar and convolutional neural networks. Results. The transition to convolutional neural networks in the field of radar is considered. A review of papers on the use of convolutional neural networks in pattern recognition problems, in particular, in the radar problem, is carried out. Hardware costs for the implementation of convolutional neural networks are analyzed. Conclusion. The conclusion is made about the need to create a methodology for selecting a network topology depending on the parameters of the radar task.


2020 ◽  
Vol 12 (7) ◽  
pp. 2767 ◽  
Author(s):  
Víctor Yepes ◽  
José V. Martí ◽  
José García

The optimization of the cost and CO₂ emissions in earth-retaining walls is of relevance, since these structures are often used in civil engineering. The optimization of costs is essential for the competitiveness of the construction company, and the optimization of emissions is relevant in the environmental impact of construction. To address the optimization, black hole metaheuristics were used, along with a discretization mechanism based on min–max normalization. The stability of the algorithm was evaluated with respect to the solutions obtained; the steel and concrete values obtained in both optimizations were analyzed. Additionally, the geometric variables of the structure were compared. Finally, the results obtained were compared with another algorithm that solved the problem. The results show that there is a trade-off between the use of steel and concrete. The solutions that minimize CO₂ emissions use more concrete than those that optimize the cost. On the other hand, when comparing the geometric variables, it is seen that most remain similar in both optimizations except for the distance between buttresses. When comparing with another algorithm, the results show a good performance in optimization using the black hole algorithm.
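One plausible reading of "a discretization mechanism based on min–max normalization" is sketched below: continuous values produced by a metaheuristic are rescaled to [0, 1] and then mapped onto an index into a list of allowed discrete values. The function names, the sample positions, and the thickness catalogue are all hypothetical, and the paper's actual mechanism may differ.

```python
def min_max_normalise(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: every value is the same.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def to_discrete(values, options):
    """Choose one entry of `options` per value via its normalised position."""
    last = len(options) - 1
    return [options[round(n * last)] for n in min_max_normalise(values)]


# Hypothetical continuous positions and a hypothetical catalogue (cm).
positions = [0.2, 1.7, 3.1, 2.4]
thicknesses_cm = [25, 30, 35, 40, 45]

chosen = to_discrete(positions, thicknesses_cm)
print(chosen)  # [25, 35, 45, 40]
```

The normalisation step keeps the mapping independent of the scale of the continuous search space, so the same rule applies whatever range the metaheuristic explores.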


Biomimetics ◽  
2019 ◽  
Vol 5 (1) ◽  
pp. 1 ◽  
Author(s):  
Michelle Gutiérrez-Muñoz ◽  
Astryd González-Salazar ◽  
Marvin Coto-Jiménez

Speech signals are degraded in real-life environments, as a product of background noise or other factors. The processing of such signals for voice recognition and voice analysis systems presents important challenges. One of the conditions that make adverse quality difficult to handle in those systems is reverberation, produced by sound wave reflections that travel from the source to the microphone in multiple directions. To enhance signals in such adverse conditions, several deep learning-based methods have been proposed and proven to be effective. Recently, recurrent neural networks, especially those with long short-term memory (LSTM), have presented surprising results in tasks related to time-dependent processing of signals, such as speech. One of the most challenging aspects of LSTM networks is the high computational cost of the training procedure, which has limited extended experimentation in several cases. In this work, we present a proposal to evaluate the hybrid models of neural networks to learn different reverberation conditions without any previous information. The results show that some combinations of LSTM and perceptron layers produce good results in comparison to those from pure LSTM networks, given a fixed number of layers. The evaluation was made based on quality measurements of the signal’s spectrum, the training time of the networks, and statistical validation of results. In total, 120 artificial neural networks of eight different types were trained and compared. The results help to affirm the fact that hybrid networks represent an important solution for speech signal enhancement, given that reduction in training time is on the order of 30%, in processes that can normally take several days or weeks, depending on the amount of data. The results also present advantages in efficiency, but without a significant drop in quality.


Actuators ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 30
Author(s):  
Pornthep Preechayasomboon ◽  
Eric Rombokas

Soft robotic actuators are now being used in practical applications; however, they are often limited to open-loop control that relies on the inherent compliance of the actuator. Achieving human-like manipulation and grasping with soft robotic actuators requires at least some form of sensing, which often comes at the cost of complex fabrication and purposefully built sensor structures. In this paper, we utilize the actuating fluid itself as a sensing medium to achieve high-fidelity proprioception in a soft actuator. As our sensors are somewhat unstructured, their readings are difficult to interpret using linear models. We therefore present a proof of concept of a method for deriving the pose of the soft actuator using recurrent neural networks. We present the experimental setup and our learned state estimator to show that our method is viable for achieving proprioception and is also robust to common sensor failures.


Author(s):  
Vincent E. Castillo ◽  
John E. Bell ◽  
Diane A. Mollenkopf ◽  
Theodore P. Stank

Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1280
Author(s):  
Hyeonseok Lee ◽  
Sungchan Kim

Explaining the prediction of deep neural networks makes the networks more understandable and trusted, leading to their use in various mission-critical tasks. Recent progress in the learning capability of networks has primarily been due to the enormous number of model parameters, so that it is usually hard to interpret their operations, as opposed to classical white-box models. For this purpose, generating saliency maps is a popular approach to identify the important input features used for the model prediction. Existing explanation methods typically only use the output of the last convolution layer of the model to generate a saliency map, lacking the information included in intermediate layers. Thus, the corresponding explanations are coarse and result in limited accuracy. Although the accuracy can be improved by iteratively developing a saliency map, this is too time-consuming and is thus impractical. To address these problems, we proposed a novel approach to explain the model prediction by developing an attentive surrogate network using knowledge distillation. The surrogate network aims to generate a fine-grained saliency map corresponding to the model prediction using meaningful regional information presented over all network layers. Experiments demonstrated that the saliency maps are the result of spatially attentive features learned from the distillation. Thus, they are useful for fine-grained classification tasks. Moreover, the proposed method runs at the rate of 24.3 frames per second, which is faster than the existing methods by orders of magnitude.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Jeonghyuk Park ◽  
Yul Ri Chung ◽  
Seo Taek Kong ◽  
Yeong Won Kim ◽  
Hyunho Park ◽  
...  

There have been substantial efforts in using deep learning (DL) to diagnose cancer from digital images of pathology slides. Existing algorithms typically operate by training deep neural networks either specialized in specific cohorts or an aggregate of all cohorts when there are only a few images available for the target cohort. A trade-off between decreasing the number of models and their cancer detection performance was evident in our experiments with The Cancer Genomic Atlas dataset, with the former approach achieving higher performance at the cost of having to acquire large datasets from the cohort of interest. Constructing annotated datasets for individual cohorts is extremely time-consuming, with the acquisition cost of such datasets growing linearly with the number of cohorts. Another issue associated with developing cohort-specific models is the difficulty of maintenance: all cohort-specific models may need to be adjusted when a new DL algorithm is to be used, where training even a single model may require a non-negligible amount of computation, or when more data is added to some cohorts. In resolving the sub-optimal behavior of a universal cancer detection model trained on an aggregate of cohorts, we investigated how cohorts can be grouped to augment a dataset without increasing the number of models linearly with the number of cohorts. This study introduces several metrics which measure the morphological similarities between cohort pairs and demonstrates how the metrics can be used to control the trade-off between performance and the number of models.

