Approximate LSTM Computing for Energy-Efficient Speech Recognition

Junseo Jo; Jaeha Kung; Youngjoo Lee

doi:10.3390/electronics9122004

Electronics; doi:10.3390/electronics9122004; 2020; Vol 9 (12); pp. 2004

Author(s): Junseo Jo; Jaeha Kung; Youngjoo Lee

Keyword(s): Speech Recognition; Energy Efficient; Short Term Memory; Recognition System; Similarity Score; CMOS Process; Approximate Computing; Processing Elements; On Chip; Computing Method

This paper presents an approximate computing method for long short-term memory (LSTM) operations, aimed at energy-efficient end-to-end speech recognition. We introduce a similarity score that measures how similar the inputs of two adjacent LSTM cells are. LSTM operations whose inputs are highly similar are disabled, and the prior results are transferred directly, which reduces the computational cost of speech recognition. A pseudo-LSTM operation is additionally defined to provide approximate computation at reduced processing resolution, which further relaxes the processing overhead without degrading accuracy. To verify the proposed idea, we design an approximate LSTM accelerator in a 65 nm CMOS process. The accelerator uses a number of approximate processing elements (PEs) to support the skipped-LSTM and pseudo-LSTM operations without degrading energy efficiency. Moreover, sparsity-aware scheduling is enabled by adding a small on-chip SRAM buffer. As a result, the proposed work provides an energy-efficient yet accurate speech recognition system that consumes 2.19 times less energy than the baseline architecture.
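
The skipping idea in this abstract can be illustrated with a minimal sketch. The abstract does not define the similarity score, so the `similarity_score` function, the `threshold` value, and the `toy_cell` stand-in for an LSTM cell below are all assumptions made for illustration, not the authors' method.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def similarity_score(a: Vector, b: Vector) -> float:
    """Hypothetical score in (0, 1]; 1.0 means the two inputs are identical.

    Assumption: derived from the mean absolute difference. The paper's
    actual definition is not given in the abstract.
    """
    mean_abs_diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 / (1.0 + mean_abs_diff)


def run_with_skipping(
    inputs: List[Vector],
    cell_fn: Callable[[Vector, float], float],
    initial_state: float,
    threshold: float = 0.9,  # assumed value, chosen only for the demo
) -> Tuple[List[float], int]:
    """Run cell_fn over a sequence, reusing the prior result for similar inputs."""
    state = initial_state
    outputs: List[float] = []
    skipped = 0
    prev_input = None
    for x in inputs:
        if prev_input is not None and similarity_score(prev_input, x) >= threshold:
            # Input is close to the adjacent one: skip the cell and
            # carry the previous result forward unchanged.
            skipped += 1
        else:
            state = cell_fn(x, state)
        outputs.append(state)
        prev_input = x
    return outputs, skipped


def toy_cell(x: Vector, state: float) -> float:
    """Stand-in for an LSTM cell: any function of (input, previous state)."""
    return 0.5 * state + sum(x)


outputs, skipped = run_with_skipping(
    [[1.0, 2.0], [1.0, 2.0], [5.0, 6.0]], toy_cell, initial_state=0.0
)
print(outputs, skipped)
```

Comparing each input with its immediate predecessor follows the "adjacent cells" wording of the abstract; a design that compares against the last input actually computed would limit drift over long runs of skipped steps, at the cost of extra storage.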

Cross-Language End-to-End Speech Recognition Research Based on Transfer Learning for the Low-Resource Tujia Language

Symmetry; doi:10.3390/sym11020179; 2019; Vol 11 (2); pp. 179; Cited By ~ 4

Author(s): Chongchong Yu; Yunbing Chen; Yueqiao Li; Meng Kang; Shixuan Xu; ...

Keyword(s): Speech Recognition; Transfer Learning; Short Term Memory; Recognition System; Language Recognition; Low Resource; End To End; The Cross; Hidden Layer; Cross Language

To rescue and preserve an endangered language, this paper studies an end-to-end speech recognition model based on sample transfer learning for the low-resource Tujia language. Working at the level of the international phonetic alphabet (IPA) label layer of the Tujia language, a Chinese corpus is used as an extension of the Tujia data, which effectively addresses the problem of an insufficient Tujia corpus; a cross-language corpus and an IPA dictionary unified between Chinese and Tujia are constructed. A convolutional neural network (CNN) and a bi-directional long short-term memory (BiLSTM) network are used to extract cross-language acoustic features and to train shared hidden-layer weights for the Tujia and Chinese phonetic corpora. In addition, automatic speech recognition for the Tujia language is realized with an end-to-end method that consists of symmetric encoding and decoding. Furthermore, transfer learning is used to establish the model of the cross-language end-to-end Tujia language recognition system. The experimental results show that the recognition error rate of the proposed model is 46.19%, which is 2.11% lower than that of the model trained only on Tujia language data. Therefore, this approach is feasible and effective.

Energy-Efficient NoC-Based Systems for Real-Time Multimedia Applications using Approximate Computing

doi:10.5753/ctd.2021.15750; 2021

Author(s): Wagner I. Penny; Daniel M. Palomino; Marcelo S. Porto; Bruno Zatt

Keyword(s): Real Time; Video Processing; Energy Efficient; Processing System; Schedulability Analysis; Multimedia Applications; Timing Constraints; Approximate Computing; Processing Elements; Fractional Motion Estimation

This work presents an energy-efficient NoC-based system for real-time multimedia applications employing approximate computing. The proposed video processing system, called SApp-NoC, is efficient in both energy and quality (QoS), employing a scalable NoC architecture composed of processing elements designed to accelerate the HEVC Fractional Motion Estimation (FME). Two solutions are proposed: HSApp-NoC (Heuristic-based SApp-NoC) and MLSApp-NoC (Machine Learning-based SApp-NoC). Compared to a precise solution processing 4K videos at 120 fps, HSApp-NoC and MLSApp-NoC reduce energy consumption by about 48.19% and 31.81%, at a small quality reduction of 2.74% and 1.09%, respectively. Furthermore, a set of schedulability analyses is proposed to guarantee that timing constraints are met in typical workload scenarios.

COMPARISON OF OPTIMIZATION ALGORITHMS OF CONNECTIONIST TEMPORAL CLASSIFIER FOR SPEECH RECOGNITION SYSTEM

Informatyka Automatyka Pomiary w Gospodarce i Ochronie Środowiska; doi:10.35784/iapgos.234; 2019; Vol 9 (3); pp. 54-57

Author(s): Yedilkhan Amirgaliyev; Kuanyshbay Kuanyshbay; Aisultan Shoiynbek

Keyword(s): Neural Network; Speech Recognition; Short Term Memory; Optimization Algorithms; Recognition System; Model Data; Short Term; Term Memory; The Neural Network; Long Short Term Memory

This paper evaluates and compares the performance of three well-known optimization algorithms (Adagrad, Adam, Momentum) for faster training of the neural network used with the CTC algorithm for speech recognition. A recurrent neural network, specifically long short-term memory (LSTM), is used with CTC. LSTM is an effective and frequently used model. The data were downloaded from the VCTK corpus of Edinburgh University. The results of the optimization algorithms are evaluated by label error rate and CTC loss.
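
For readers unfamiliar with the three optimizers named above, the following minimal sketch shows their standard update rules on a toy quadratic loss. The toy loss, the hyperparameter values, and the function names are assumptions for illustration; the paper trains an LSTM with a CTC loss, which is not reproduced here.

```python
import math


def grad(w: float) -> float:
    """Gradient of the toy loss f(w) = w**2 (a stand-in for a real CTC loss)."""
    return 2.0 * w


def momentum(w: float, steps: int = 200, lr: float = 0.1, mu: float = 0.9) -> float:
    """Classical momentum: a velocity term accumulates past gradients."""
    v = 0.0
    for _ in range(steps):
        v = mu * v - lr * grad(w)
        w += v
    return w


def adagrad(w: float, steps: int = 200, lr: float = 0.5, eps: float = 1e-8) -> float:
    """Adagrad: scales each step by the root of the summed squared gradients."""
    g_sq_sum = 0.0
    for _ in range(steps):
        g = grad(w)
        g_sq_sum += g * g
        w -= lr * g / (math.sqrt(g_sq_sum) + eps)
    return w


def adam(
    w: float,
    steps: int = 200,
    lr: float = 0.1,
    b1: float = 0.9,
    b2: float = 0.999,
    eps: float = 1e-8,
) -> float:
    """Adam: bias-corrected running averages of the gradient and its square."""
    m = 0.0
    v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


for name, optimizer in [("Momentum", momentum), ("Adagrad", adagrad), ("Adam", adam)]:
    print(name, optimizer(5.0))
```

Results on a one-dimensional quadratic say little about behaviour on a real speech model; the sketch only makes the difference between the update rules concrete.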

A 1.15 μW 200 kS/s 10-b Monotonic SAR ADC Using Dual On-Chip Calibrations and Accuracy Enhancement Techniques

Sensors; doi:10.3390/s18103486; 2018; Vol 18 (10); pp. 3486

Author(s): Jae-Hun Lee; Dasom Park; Woojin Cho; Huu Phan; Cong Nguyen; ...

Keyword(s): Energy Efficient; Dynamic Range; Sampling Error; Sampling Rate; Oxide Semiconductor; CMOS Process; Calibration Technique; Accuracy Enhancement; On Chip; Comparator Offset

Herein, we present an energy-efficient successive-approximation-register (SAR) analog-to-digital converter (ADC) featuring on-chip dual calibration and various accuracy-enhancement techniques. The dual calibration technique is realized in an energy- and area-efficient manner for comparator offset calibration (COC) and digital-to-analog converter (DAC) capacitor mismatch calibration. The calibration of the common-mode (CM) dependent comparator offset is performed without separate circuit blocks by reusing the DAC to generate calibration signals. The calibration of the DAC mismatch is efficiently performed by reusing the comparator for delay-based mismatch detection. For accuracy enhancement, we propose new circuit techniques for a comparator, a sampling switch, and a DAC capacitor. An improved dynamic latched comparator is proposed with kick-back suppression and CM-dependent offset calibration. An accuracy-enhanced bootstrap sampling switch suppresses the leakage-induced error to <180 μV and the sampling error to <150 μV. The energy-efficient monotonic switching technique is effectively combined with thermometer coding, which reduces the settling error in the DAC. The ADC is realized using a 0.18 μm complementary metal–oxide–semiconductor (CMOS) process in an area of 0.28 mm². At the sampling rate fS = 9 kS/s, the proposed ADC achieves a signal-to-noise and distortion ratio (SNDR) of 55.5 dB and a spurious-free dynamic range (SFDR) of 70.6 dB. The proposed dual calibration technique improves the SFDR by 12.7 dB. Consuming 1.15 μW at fS = 200 kS/s, the ADC achieves an SNDR of 55.9 dB and an SFDR of 60.3 dB with a figure-of-merit of 11.4 fJ/conversion-step.
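
The reported figure-of-merit can be cross-checked from the other numbers in the abstract. The sketch below assumes the commonly used Walden definition, FoM = P / (2^ENOB × fS) with ENOB = (SNDR − 1.76) / 6.02; the abstract does not state which definition the authors used, and the function names are illustrative.

```python
def enob(sndr_db: float) -> float:
    """Effective number of bits from SNDR in dB (standard approximation)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w: float, fs_hz: float, sndr_db: float) -> float:
    """Walden figure-of-merit in joules per conversion step (assumed definition)."""
    return power_w / (2.0 ** enob(sndr_db) * fs_hz)


# Values taken from the abstract: 1.15 uW at 200 kS/s with 55.9 dB SNDR.
fom_joules = walden_fom(power_w=1.15e-6, fs_hz=200e3, sndr_db=55.9)
fom_fj = fom_joules * 1e15
print(f"ENOB = {enob(55.9):.2f} bits, FoM = {fom_fj:.1f} fJ/conversion-step")
```

Under this assumption the result lands close to the 11.4 fJ/conversion-step quoted above; the small gap is consistent with rounding of the published SNDR and power figures.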

EERA-ASR: An Energy-Efficient Reconfigurable Architecture for Automatic Speech Recognition With Hybrid DNN and Approximate Computing

IEEE Access; doi:10.1109/access.2018.2870273; 2018; Vol 6; pp. 52227-52237; Cited By ~ 10

Author(s): Bo Liu; Hai Qin; Yu Gong; Wei Ge; Mengwen Xia; ...

Keyword(s): Speech Recognition; Automatic Speech Recognition; Energy Efficient; Reconfigurable Architecture; Approximate Computing

The Implementation of Chinese and English Bilingual Speech Recognition System-on-Chip

International Journal of e-Education e-Business e-Management and e-Learning; doi:10.7763/ijeeee.2013.v3.273; 2013

Author(s): Shunli Ding

Keyword(s): Speech Recognition; Recognition System; System On Chip; Speech Recognition System; English Bilingual; On Chip; Chinese And English

Improving Deep Learning based Automatic Speech Recognition for Gujarati

ACM Transactions on Asian and Low-Resource Language Information Processing; doi:10.1145/3483446; 2022; Vol 21 (3); pp. 1-18

Author(s): Deepang Raval; Vyom Pathak; Muktan Patel; Brijesh Bhatt

Keyword(s): Deep Learning; Speech Recognition; Automatic Speech Recognition; Short Term Memory; Language Model; Recognition System; Processing Technique; Speech Corpus; Novel Approach; ASR System

We present a novel approach for improving the performance of an end-to-end speech recognition system for the Gujarati language. We follow a deep learning-based approach that includes convolutional neural network layers, bi-directional long short-term memory layers, dense layers, and connectionist temporal classification as the loss function. To improve the performance of the system given the limited size of the dataset, we present a prefix decoding technique based on a combined language model (a word-level and a character-level language model) and a post-processing technique based on Bidirectional Encoder Representations from Transformers. To gain key insights from our automatic speech recognition (ASR) system, we used the inferences from the system and proposed different analysis methods. These insights help us understand and improve the ASR system and provide intuition about the language used for the ASR system. We trained the model on the Microsoft Speech Corpus and observe a 5.87% decrease in word error rate (WER) with respect to the base-model WER.
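
The word error rate quoted above is conventionally computed as the word-level edit distance between the reference and the hypothesis, divided by the number of reference words. The sketch below shows that standard computation; the function name and the example sentences are illustrative and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences: one substitution ("sat" -> "sit") and one deletion ("the").
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.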

English Speech Recognition System on Chip

Tsinghua Science & Technology; doi:10.1016/s1007-0214(11)70015-3; 2011; Vol 16 (1); pp. 95-99; Cited By ~ 2

Author(s): Hong Liu; Yanmin Qian; Jia Liu

Keyword(s): Speech Recognition; Recognition System; System On Chip; Speech Recognition System; On Chip

A novel speech recognition system-on-chip

2008 International Conference on Audio, Language and Image Processing; doi:10.1109/icalip.2008.4590020; 2008

Author(s): Haijie Yang; Jing Yao; Jia Liu

Keyword(s): Speech Recognition; Recognition System; System On Chip; Speech Recognition System; On Chip

A New Time-Mode On-Chip Oscillator-Based High Linearity and Low Power Temperature Sensor

Journal of Circuits System and Computers; doi:10.1142/s0218126615501558; 2015; Vol 24 (10); pp. 1550155; Cited By ~ 1

Author(s): Di Zhu; Liter Siek

Keyword(s): Power Consumption; Temperature Coefficient; Temperature Sensor; Energy Efficient; Superior Performance; CMOS Process; Linear Temperature; High Linearity; PVT Variations; On Chip

This paper presents an energy-efficient, high-linearity temperature sensor based on the architecture of a simple on-chip oscillator. A self-calibrated block is proposed to compensate for the non-linearities of the on-chip oscillator caused by PVT variations. In this manner, the on-chip oscillator-based temperature sensor achieves superior performance over conventional inverter-chain-based types. To keep the design generally applicable, no highly linear temperature coefficient resistors are used. The entire circuit is simple and easy to scale down. According to verification in a 65 nm CMOS process, with one-point calibration, this temperature sensor achieves an inaccuracy within ±1°C over the temperature range from -55°C to 125°C, with a power consumption of only 0.6 μA under a 1.2 V supply voltage.
