Quantization and Deployment of Deep Neural Networks on Microcontrollers

Pierre-Emmanuel Novac; Ghouthi Boukli Hacene; Alain Pegatoquet; Benoît Miramond; Vincent Gripon

doi:10.3390/s21092984

Sensors ◽

10.3390/s21092984 ◽

2021 ◽

Vol 21 (9) ◽

pp. 2984

Author(s):

Pierre-Emmanuel Novac ◽

Ghouthi Boukli Hacene ◽

Alain Pegatoquet ◽

Benoît Miramond ◽

Vincent Gripon

Keyword(s):

Neural Networks ◽

Low Power ◽

Power Efficiency ◽

Deep Neural Networks ◽

Use Cases ◽

Comparison Study ◽

Power Devices ◽

Embedded Devices ◽

Inference Engines ◽

New Framework

Embedding Artificial Intelligence onto low-power devices is a challenging task that has been partly overcome with recent advances in machine learning and hardware design. Presently, deep neural networks can be deployed on embedded targets to perform different tasks such as speech recognition, object detection or Human Activity Recognition. However, there is still room for optimization of deep neural networks on embedded devices. These optimizations mainly address power consumption, memory and real-time constraints, but also easier deployment at the edge. Moreover, there is still a need for a better understanding of what can be achieved for different use cases. This work focuses on quantization and deployment of deep neural networks onto low-power 32-bit microcontrollers. The quantization methods relevant to embedded execution on a microcontroller are first outlined. Then, a new framework for end-to-end deep neural network training, quantization and deployment is presented. This framework, called MicroAI, is designed as an alternative to existing inference engines (TensorFlow Lite for Microcontrollers and STM32Cube.AI). Our framework can indeed be easily adjusted and/or extended for specific use cases. Execution using single-precision 32-bit floating-point as well as fixed-point on 8- and 16-bit integers is supported. The proposed quantization method is evaluated with three different datasets (UCI-HAR, Spoken MNIST and GTSRB). Finally, a comparison study between MicroAI and both existing embedded inference engines is provided in terms of memory and power efficiency. On-device evaluation is done using ARM Cortex-M4F-based microcontrollers (Ambiq Apollo3 and STM32L452RE).
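To make the idea of 8- and 16-bit fixed-point execution concrete, here is a minimal Python sketch of signed fixed-point quantization. It is an illustration only and not taken from MicroAI: the power-of-two scale factor, the round-half-up rule, and the function names `fractional_bits`, `quantize` and `dequantize` are all assumptions made for this example.

```python
import math

def fractional_bits(values, width):
    """Choose how many fractional bits fit the largest magnitude in `width` signed bits.

    Assumption for this sketch: the scale is a power of two (2**frac_bits).
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return width - 1
    # Bits needed for the integer part, not counting the sign bit.
    int_bits = max(0, math.floor(math.log2(max_abs)) + 1)
    return width - 1 - int_bits

def quantize(values, width, frac_bits):
    """Convert floats to signed `width`-bit integers, saturating out-of-range values."""
    scale = 2 ** frac_bits
    lo, hi = -(2 ** (width - 1)), 2 ** (width - 1) - 1
    # Round half up, then clamp to the representable range.
    return [min(hi, max(lo, math.floor(v * scale + 0.5))) for v in values]

def dequantize(q_values, frac_bits):
    """Map the stored integers back to floats to inspect the quantization error."""
    scale = 2 ** frac_bits
    return [q / scale for q in q_values]

if __name__ == "__main__":
    weights = [0.75, -0.5, 0.1, -1.5]  # hypothetical layer weights
    for width in (8, 16):
        n = fractional_bits(weights, width)
        q = quantize(weights, width, n)
        print(width, n, q, dequantize(q, n))
```

Running the sketch with both widths shows the tradeoff the abstract points to: the 16-bit representation keeps more fractional bits and so a smaller rounding error, at twice the storage cost per value.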

A Power Efficiency Enhancements of a Multi-Bit Accelerator for Memory Prohibitive Deep Neural Networks

IEEE Open Journal of Circuits and Systems ◽

10.1109/ojcas.2020.3047225 ◽

2021 ◽

Vol 2 ◽

pp. 156-169

Author(s):

Suhas Shivapakash ◽

Hardik Jain ◽

Olaf Hellwich ◽

Friedel Gerfers

Keyword(s):

Neural Networks ◽

Power Efficiency ◽

Deep Neural Networks

A Comparison Study on Legal Document Classification Using Deep Neural Networks

2019 International Conference on Information and Communication Technology Convergence (ICTC) ◽

10.1109/ictc46691.2019.8939926 ◽

2019 ◽

Author(s):

Jihoon Lee ◽

Hyukjoon Lee

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Document Classification ◽

Comparison Study ◽

Legal Document

Implementation of efficient, low power deep neural networks on next-generation Intel client platforms

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2017.8005304 ◽

2017 ◽

Cited By ~ 2

Author(s):

Michael Deisher ◽

Andrzej Polonski

Keyword(s):

Neural Networks ◽

Low Power ◽

Deep Neural Networks ◽

Next Generation

Towards Understanding the Risks of Gradient Inversion in Federated Learning

10.21203/rs.3.rs-1147182/v2 ◽

2021 ◽

Author(s):

Ali Hatamizadeh ◽

Hongxu Yin ◽

Pavlo Molchanov ◽

Andriy Myronenko ◽

Wenqi Li ◽

...

Keyword(s):

Neural Networks ◽

Data Privacy ◽

Deep Neural Networks ◽

Differential Privacy ◽

Use Cases ◽

Training Data ◽

Model Accuracy ◽

Healthcare Applications ◽

Raw Data ◽

Collaborative Training

Federated learning (FL) allows the collaborative training of AI models without needing to share raw data. This capability makes it especially interesting for healthcare applications where patient and data privacy is of utmost concern. However, recent works on the inversion of deep neural networks from model gradients raised concerns about the security of FL in preventing the leakage of training data. In this work, we show that these attacks presented in the literature are impractical in real FL use-cases and provide a new baseline attack that works for more realistic scenarios where the clients’ training involves updating the Batch Normalization (BN) statistics. Furthermore, we present new ways to measure and visualize potential data leakage in FL. Our work is a step towards establishing reproducible methods of measuring data leakage in FL and could help determine the optimal tradeoffs between privacy-preserving techniques, such as differential privacy, and model accuracy based on quantifiable metrics.

A Low-Power Arithmetic Element for Multi-Base Logarithmic Computation on Deep Neural Networks

2018 31st IEEE International System-on-Chip Conference (SOCC) ◽

10.1109/socc.2018.8618560 ◽

2018 ◽

Cited By ~ 1

Author(s):

Jiawei Xu ◽

Yuxiang Huan ◽

Li-Rong Zheng ◽

Zhuo Zou

Keyword(s):

Neural Networks ◽

Low Power ◽

Deep Neural Networks

Deep Neural Networks: A New Framework for Modeling Biological Vision and Brain Information Processing

Annual Review of Vision Science ◽

10.1146/annurev-vision-082114-035447 ◽

2015 ◽

Vol 1 (1) ◽

pp. 417-446 ◽

Cited By ~ 295

Author(s):

Nikolaus Kriegeskorte

Keyword(s):

Neural Networks ◽

Information Processing ◽

Deep Neural Networks ◽

Biological Vision ◽

New Framework

A Low-Power Speech Recognizer and Voice Activity Detector Using Deep Neural Networks

IEEE Journal of Solid-State Circuits ◽

10.1109/jssc.2017.2752838 ◽

2018 ◽

Vol 53 (1) ◽

pp. 66-75 ◽

Cited By ~ 26

Author(s):

Michael Price ◽

James Glass ◽

Anantha P. Chandrakasan

Keyword(s):

Neural Networks ◽

Low Power ◽

Deep Neural Networks ◽

Voice Activity Detector ◽

Speech Recognizer ◽

Voice Activity

Real-time multi-task diffractive deep neural networks via hardware-software co-design

Scientific Reports ◽

10.1038/s41598-021-90221-7 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Yingjie Li ◽

Ruiyang Chen ◽

Berardi Sensale-Rodriguez ◽

Weilu Gao ◽

Cunxi Yu

Keyword(s):

Neural Networks ◽

Real Time ◽

Power Efficiency ◽

Deep Neural Networks ◽

Design Method ◽

Optical Computing ◽

Domain Specific ◽

Constrained Environments ◽

Task Architecture ◽

Hardware Efficiency

Deep neural networks (DNNs) have substantial computational requirements, which greatly limit their performance in resource-constrained environments. Recently, there have been increasing efforts on optical neural networks and optical computing-based DNN hardware, which bring significant advantages for deep learning systems in terms of their power efficiency, parallelism and computational speed. Among them, free-space diffractive deep neural networks (D2NNs), based on light diffraction, feature millions of neurons in each layer interconnected with neurons in neighboring layers. However, due to the challenge of implementing reconfigurability, deploying different DNN algorithms requires re-building and duplicating the physical diffractive systems, which significantly degrades the hardware efficiency in practical application scenarios. Thus, this work proposes a novel hardware-software co-design method that enables first-of-its-kind real-time multi-task learning in D2NNs that automatically recognizes which task is being deployed in real-time. Our experimental results demonstrate significant improvements in versatility and hardware efficiency, and also demonstrate and quantify the robustness of the proposed multi-task D2NN architecture under wide noise ranges of all system components. In addition, we propose a domain-specific regularization algorithm for training the proposed multi-task architecture, which can be used to flexibly adjust the desired performance for each task.

Performance of deep neural networks on low-power IoT devices

Proceedings of the Workshop on Benchmarking Cyber-Physical Systems and Internet of Things ◽

10.1145/3458473.3458823 ◽

2021 ◽

Author(s):

Christos Profentzas ◽

Magnus Almgren ◽

Olaf Landsiedel

Keyword(s):

Neural Networks ◽

Low Power ◽

Deep Neural Networks ◽

Iot Devices

Intelligent control of quad-rotor aircrafts with a STM32 microcontroller using deep neural networks

Industrial Robot: the international journal of robotics research and application ◽

10.1108/ir-10-2020-0239 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Xiaochun Guan ◽

Sheng Lou ◽

Han Li ◽

Tinglong Tang

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Data Communication ◽

Efficient Design ◽

Embedded Devices ◽

Content Type ◽

Design Scheme ◽

Communication Module ◽

Aerial Vehicle ◽

Optical Flow Sensor

Purpose: Deployment of deep neural networks on embedded devices is becoming increasingly popular because it can reduce latency and energy consumption for data communication. This paper aims to present a method for deploying deep neural networks on a quad-rotor aircraft to further expand its application scope.

Design/methodology/approach: In this paper, a design scheme is proposed to implement the flight mission of the quad-rotor aircraft based on multi-sensor fusion. It integrates an attitude acquisition module, a global positioning system position acquisition module, an optical flow sensor, an ultrasonic sensor and a Bluetooth communication module, among others. A 32-bit microcontroller is adopted as the main controller for the quad-rotor aircraft. To make the quad-rotor aircraft more intelligent, the study also proposes a method to deploy a pre-trained deep neural network model on the microcontroller based on the software packages of the RT-Thread internet of things operating system.

Findings: This design provides a simple and efficient design scheme to further integrate artificial intelligence (AI) algorithms into the control system design of quad-rotor aircraft.

Originality/value: This method provides an application example and a design reference for the implementation of AI algorithms on unmanned aerial vehicles or terminal robots.
