Approximate LSTM Computing for Energy-Efficient Speech Recognition

Electronics ◽  
2020 ◽  
Vol 9 (12) ◽  
pp. 2004
Author(s):  
Junseo Jo ◽  
Jaeha Kung ◽  
Youngjoo Lee

This paper presents an approximate computing method for long short-term memory (LSTM) operations, aimed at energy-efficient end-to-end speech recognition. We introduce a similarity score that measures how similar the inputs of two adjacent LSTM cells are. Highly similar LSTM operations are then disabled and the prior results are transferred directly, which reduces the computational cost of speech recognition. A pseudo-LSTM operation is additionally defined to provide approximate computation at reduced processing resolution, which further relaxes the processing overhead without degrading accuracy. To verify the proposed idea, we design an approximate LSTM accelerator in a 65 nm CMOS process. The accelerator uses a number of approximate processing elements (PEs) to support the proposed skipped-LSTM and pseudo-LSTM operations without degrading energy efficiency. Moreover, sparsity-aware scheduling is enabled by adding a small on-chip SRAM buffer. As a result, the proposed work provides an energy-efficient yet accurate speech recognition system that consumes 2.19 times less energy than the baseline architecture.
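The skipping idea described in this abstract can be illustrated with a minimal sketch. The abstract does not define the similarity score, so cosine similarity is used here purely as an assumption, and `toy_cell`, `run_with_skipping`, and the `threshold` value are hypothetical names and settings, not the authors' design. The pseudo-LSTM (reduced-resolution) path is not modelled.

```python
import math

def cosine_similarity(a, b):
    # Assumed similarity measure; the paper's actual similarity score may differ.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def toy_cell(x, h):
    # Hypothetical stand-in for a full LSTM cell: maps (input, state) to a new state.
    return [math.tanh(xi + 0.5 * hi) for xi, hi in zip(x, h)]

def run_with_skipping(inputs, cell, h0, threshold=0.95):
    """Run `cell` over a sequence, reusing the prior result whenever the
    current input is highly similar to the adjacent (previous) input."""
    h = h0
    outputs = []
    skipped = 0
    prev_x = None
    for x in inputs:
        if prev_x is not None and cosine_similarity(x, prev_x) >= threshold:
            skipped += 1          # skip the cell and transfer the prior result
        else:
            h = cell(x, h)        # inputs differ enough: compute normally
        outputs.append(h)
        prev_x = x
    return outputs, skipped

frames = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]]
outputs, skipped = run_with_skipping(frames, toy_cell, h0=[0.0, 0.0])
print(f"skipped {skipped} of {len(frames)} steps")
```

In this sketch the threshold trades accuracy for saved work: a lower threshold skips more steps but reuses results for inputs that differ more.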

Symmetry ◽  
2019 ◽  
Vol 11 (2) ◽  
pp. 179 ◽  
Author(s):  
Chongchong Yu ◽  
Yunbing Chen ◽  
Yueqiao Li ◽  
Meng Kang ◽  
Shixuan Xu ◽  
...  

To rescue and preserve an endangered language, this paper studies an end-to-end speech recognition model based on sample transfer learning for the low-resource Tujia language. At the level of the International Phonetic Alphabet (IPA) labels, a Chinese corpus is used to extend the Tujia data, which effectively addresses the shortage of Tujia speech data; a cross-language corpus and an IPA dictionary unified across Chinese and Tujia are constructed. A convolutional neural network (CNN) and a bi-directional long short-term memory (BiLSTM) network are used to extract cross-language acoustic features and to train shared hidden-layer weights for the Tujia and Chinese phonetic corpora. In addition, automatic speech recognition for the Tujia language is realized with an end-to-end method that consists of symmetric encoding and decoding. Furthermore, transfer learning is used to build the cross-language end-to-end Tujia recognition system. The experimental results show that the recognition error rate of the proposed model is 46.19%, which is 2.11% lower than that of the model trained only on Tujia language data. Therefore, this approach is feasible and effective.


2021 ◽  
Author(s):  
Wagner I. Penny ◽  
Daniel M. Palomino ◽  
Marcelo S. Porto ◽  
Bruno Zatt

This work presents an energy-efficient NoC-based system for real-time multimedia applications employing approximate computing. The proposed video processing system, called SApp-NoC, is efficient in both energy and quality (QoS), employing a scalable NoC architecture composed of processing elements designed to accelerate the HEVC Fractional Motion Estimation (FME). Two solutions are proposed: HSApp-NoC (Heuristic-based SApp-NoC) and MLSApp-NoC (Machine Learning-based SApp-NoC). Compared with a precise solution processing 4K videos at 120 fps, HSApp-NoC and MLSApp-NoC reduce energy consumption by about 48.19% and 31.81%, at a small quality reduction of 2.74% and 1.09%, respectively. Furthermore, a set of schedulability analyses is proposed to guarantee that timing constraints are met in typical workload scenarios.


Author(s):  
Yedilkhan Amirgaliyev ◽  
Kuanyshbay Kuanyshbay ◽  
Aisultan Shoiynbek

This paper evaluates and compares the performance of three well-known optimization algorithms (Adagrad, Adam, and Momentum) for faster training of a neural network with the connectionist temporal classification (CTC) algorithm for speech recognition. A recurrent neural network, specifically long short-term memory (LSTM), is used with CTC. LSTM is an effective and widely used model. The data were downloaded from the VCTK corpus of the University of Edinburgh. The optimization algorithms are evaluated by label error rate and CTC loss.
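For readers unfamiliar with the three optimizers named above, the following sketch shows their textbook update rules on a one-dimensional toy loss. It is not the paper's experimental setup: the quadratic loss stands in for the CTC loss, and the learning rates, step count, and function names are assumptions chosen for illustration.

```python
import math

def loss(w):
    # Toy loss f(w) = (w - 3)^2, a stand-in for the CTC loss.
    return (w - 3.0) ** 2

def grad(w):
    # Analytic gradient of the toy loss.
    return 2.0 * (w - 3.0)

def momentum(w, steps=200, lr=0.1, mu=0.9):
    v = 0.0
    for _ in range(steps):
        v = mu * v - lr * grad(w)   # velocity accumulates past gradients
        w += v
    return w

def adagrad(w, steps=200, lr=0.5, eps=1e-8):
    g2_sum = 0.0
    for _ in range(steps):
        g = grad(w)
        g2_sum += g * g             # running sum of squared gradients
        w -= lr * g / (math.sqrt(g2_sum) + eps)
    return w

def adam(w, steps=200, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    m = 0.0
    v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g       # first-moment estimate
        v = b2 * v + (1 - b2) * g * g   # second-moment estimate
        m_hat = m / (1 - b1 ** t)       # bias correction
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

for name, optimizer in [("Momentum", momentum), ("Adagrad", adagrad), ("Adam", adam)]:
    w_final = optimizer(0.0)
    print(f"{name}: w = {w_final:.4f}, loss = {loss(w_final):.6f}")
```

A comparison like the one in the paper would track the loss (and label error rate) per epoch for each optimizer rather than only the final value.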


Sensors ◽  
2018 ◽  
Vol 18 (10) ◽  
pp. 3486
Author(s):  
Jae-Hun Lee ◽  
Dasom Park ◽  
Woojin Cho ◽  
Huu Phan ◽  
Cong Nguyen ◽  
...  

Herein, we present an energy-efficient successive-approximation-register (SAR) analog-to-digital converter (ADC) featuring on-chip dual calibration and various accuracy-enhancement techniques. The dual calibration technique is realized in an energy- and area-efficient manner for comparator offset calibration (COC) and digital-to-analog converter (DAC) capacitor mismatch calibration. The calibration of common-mode (CM) dependent comparator offset is performed without using separate circuit blocks by reusing the DAC for generating calibration signals. The calibration of the DAC mismatch is efficiently performed by reusing the comparator for delay-based mismatch detection. For accuracy enhancement, we propose new circuit techniques for a comparator, a sampling switch, and a DAC capacitor. An improved dynamic latched comparator is proposed with kick-back suppression and CM dependent offset calibration. An accuracy-enhanced bootstrap sampling switch suppresses the leakage-induced error <180 μV and the sampling error <150 μV. The energy-efficient monotonic switching technique is effectively combined with thermometer coding, which reduces the settling error in the DAC. The ADC is realized using a 0.18 μm complementary metal–oxide–semiconductor (CMOS) process in an area of 0.28 mm². At the sampling rate fS = 9 kS/s, the proposed ADC achieves a signal-to-noise and distortion ratio (SNDR) of 55.5 dB and a spurious-free dynamic range (SFDR) of 70.6 dB. The proposed dual calibration technique improves the SFDR by 12.7 dB. Consuming 1.15 μW at fS = 200 kS/s, the ADC achieves an SNDR of 55.9 dB and an SFDR of 60.3 dB with a figure-of-merit of 11.4 fJ/conversion-step.
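The reported figure-of-merit can be sanity-checked from the other numbers in the abstract. The abstract does not say which definition is used; the sketch below assumes the common Walden form, FoM = P / (2^ENOB · fS), with ENOB = (SNDR − 1.76) / 6.02. The function names are illustrative.

```python
def enob_from_sndr(sndr_db):
    # Effective number of bits from SNDR, standard sine-wave approximation.
    return (sndr_db - 1.76) / 6.02

def walden_fom(power_w, sndr_db, fs_hz):
    # Energy per conversion step in joules (assumed Walden definition).
    return power_w / (2 ** enob_from_sndr(sndr_db) * fs_hz)

# Values taken from the abstract: 1.15 uW, SNDR 55.9 dB, fS = 200 kS/s.
fom_fj = walden_fom(1.15e-6, 55.9, 200e3) * 1e15
print(f"ENOB = {enob_from_sndr(55.9):.2f} bits, FoM = {fom_fj:.2f} fJ/conversion-step")
```

Under these assumptions the result comes out at about 11.3 fJ/conversion-step, close to the 11.4 fJ/conversion-step stated in the abstract; the small gap is consistent with rounding of the published SNDR and power values.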


Author(s):  
Deepang Raval ◽  
Vyom Pathak ◽  
Muktan Patel ◽  
Brijesh Bhatt

We present a novel approach for improving the performance of an end-to-end speech recognition system for the Gujarati language. We follow a deep learning-based approach that includes a Convolutional Neural Network, Bi-directional Long Short-Term Memory layers, Dense layers, and Connectionist Temporal Classification as the loss function. To improve the performance of the system given the limited size of the dataset, we present a prefix decoding technique based on a combined language model (a word-level and a character-level language model) and a post-processing technique based on Bidirectional Encoder Representations from Transformers. To gain key insights from our Automatic Speech Recognition (ASR) system, we used the inferences from the system and proposed different analysis methods. These insights help us understand and improve the ASR system and provide intuition into the language used for the ASR system. We trained the model on the Microsoft Speech Corpus, and we observe a 5.87% decrease in Word Error Rate (WER) with respect to the base-model WER.
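Word Error Rate, the metric reported above, is conventionally the word-level edit distance between the reference and the recognized text, divided by the number of reference words. The sketch below shows that standard definition; it is not the authors' evaluation code, and the function name and example sentences are illustrative.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words (one row of the Levenshtein table).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)

# One substitution ("sat" -> "sit") and one insertion ("the") over 5 reference words.
print(word_error_rate("the cat sat on mat", "the cat sit on the mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.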


2011 ◽  
Vol 16 (1) ◽  
pp. 95-99 ◽  
Author(s):  
Hong Liu ◽  
Yanmin Qian ◽  
Jia Liu

2015 ◽  
Vol 24 (10) ◽  
pp. 1550155 ◽  
Author(s):  
Di Zhu ◽  
Liter Siek

This paper presents an energy-efficient, high-linearity temperature sensor based on a simple on-chip oscillator architecture. A self-calibration block is proposed to compensate for the non-linearities of the on-chip oscillator caused by PVT variations. As a result, the oscillator-based temperature sensor outperforms conventional inverter-chain-based types. To keep the design generally applicable, no resistors with highly linear temperature coefficients are used. The entire circuit is simple and easy to scale down. According to verification in a 65 nm CMOS process, with one-point calibration the sensor achieves an inaccuracy within ±1°C over the temperature range from -55°C to 125°C, while drawing only 0.6 μA from a 1.2 V supply.

