A 2D Convolutional Gating Mechanism for Mandarin Streaming Speech Recognition

Xintong Wang; Chuangang Zhao

doi:10.3390/info12040165

Information ◽

10.3390/info12040165 ◽

2021 ◽

Vol 12 (4) ◽

pp. 165

Author(s):

Xintong Wang ◽

Chuangang Zhao

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Performance Improvement ◽

Time Domain ◽

Contextual Information ◽

Input Feature ◽

Gating Mechanism ◽

Input Layer ◽

Significant Performance ◽

The Time Domain

Recent research shows that the recurrent neural network transducer (RNN-T) architecture has become a mainstream approach for streaming speech recognition. In this work, we investigate the VGG2 network as the input layer to the RNN-T in streaming speech recognition. Specifically, before the input feature is passed to the RNN-T, we introduce a gated-VGG2 block, which uses the first two layers of VGG16 to extract contextual information in the time domain and then uses an SENet-style gating mechanism to control what information in the channel domain is propagated to the RNN-T. The results show that the RNN-T model with the proposed gated-VGG2 block brings significant performance improvement over the existing RNN-T model, and it has lower latency and a lower character error rate than the Transformer-based model.
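The channel gating step mentioned in the abstract can be illustrated with a minimal squeeze-and-excitation style sketch. This is not the authors' gated-VGG2 block: the convolutional layers are omitted, biases are left out, and the function name `se_gate` and the weight shapes are assumptions made for illustration only.

```python
import math


def se_gate(features, w1, w2):
    """Apply squeeze-and-excitation style gating to a feature map.

    features: list of channels, each a 2D list indexed [time][freq].
    w1: reduction weights, shape [hidden][channels] (illustrative values).
    w2: expansion weights, shape [channels][hidden] (illustrative values).
    Returns the gated feature map and the per-channel gate values.
    """
    # Squeeze: global average over time and frequency for each channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in features
    ]
    # Excite: bottleneck layer with ReLU, then one sigmoid gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    gated = [
        [[g * v for v in row] for row in ch]
        for g, ch in zip(gates, features)
    ]
    return gated, gates
```

Each gate lies between 0 and 1, so a channel is attenuated rather than removed outright; in a trained model the weights would be learned jointly with the rest of the network.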

TSTNN: Two-Stage Transformer Based Neural Network for Speech Enhancement in the Time Domain

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp39728.2021.9413740 ◽

2021 ◽

Author(s):

Kai Wang ◽

Bengbeng He ◽

Wei-Ping Zhu

Keyword(s):

Neural Network ◽

Speech Enhancement ◽

Time Domain ◽

Two Stage ◽

The Time Domain

A Deep Neural Network Model for Learning Runtime Frequency Response Function Using Sensor Measurements

Volume 2: Manufacturing Processes; Manufacturing Systems; Nano/Micro/Meso Manufacturing; Quality and Reliability ◽

10.1115/msec2021-64065 ◽

2021 ◽

Author(s):

Yongzhi Qu ◽

Gregory W. Vogl ◽

Zechao Wang

Keyword(s):

Neural Network ◽

Neural Networks ◽

Fourier Transform ◽

Time Domain ◽

Frequency Response Function ◽

Frequency Component ◽

Operating Conditions ◽

Input Output ◽

The Fourier Transform ◽

The Time Domain

Abstract: The frequency response function (FRF), defined as the ratio between the Fourier transform of the time-domain output and the Fourier transform of the time-domain input, is a common tool to analyze the relationships between inputs and outputs of a mechanical system. Learning the FRF for mechanical systems can facilitate system identification and condition-based health monitoring and can improve performance metrics by providing an input-output model that describes the system dynamics. Existing FRF identification assumes there is a one-to-one mapping between each input frequency component and output frequency component. However, during dynamic operations, the FRF can present complex dependencies with frequency cross-correlations due to modulation effects, nonlinearities, and mechanical noise. Furthermore, existing FRFs assume linearity between input-output spectrums with varying mechanical loads, while in practice FRFs can depend on the operating conditions and show high nonlinearities. Outputs of existing neural networks are typically low-dimensional labels rather than real-time high-dimensional measurements. This paper proposes a vector regression method based on deep neural networks for the learning of runtime FRFs from measurement data under different operating conditions. More specifically, a neural network based on an encoder-decoder with a symmetric compression structure is proposed. The deep encoder-decoder network features simultaneous learning of the regression relationship between input and output embeddings, as well as a discriminative model for output spectrum classification under different operating conditions. The learning model is validated using experimental data from a high-pressure hydraulic test rig. The results show that the proposed model can learn the FRF between sensor measurements under different operating conditions with high accuracy and denoising capability. The learned FRF model provides an estimation for sensor measurements when a physical sensor is not feasible and can be used for operating condition recognition.
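The classical definition of the FRF given in the abstract, the ratio of the output spectrum to the input spectrum, can be sketched as follows. This shows only the conventional one-to-one bin-wise estimate that the paper takes as its baseline, not the proposed encoder-decoder model; the function names, the `eps` threshold, and the toy system are assumptions for illustration.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform, O(n^2); adequate for short signals."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def estimate_frf(x, y, eps=1e-9):
    """Return H[k] = Y[k] / X[k], or None where the input has no energy."""
    X, Y = dft(x), dft(y)
    return [Yk / Xk if abs(Xk) > eps else None for Xk, Yk in zip(X, Y)]


# Toy system: the output is the input scaled by 0.5 and delayed by one
# sample (a circular delay, so the DFT relation holds exactly).
n = 16
x = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
y = [0.5 * x[(t - 1) % n] for t in range(n)]
H = estimate_frf(x, y)
gain_at_bin_2 = abs(H[2])           # magnitude response at the excited bin
phase_at_bin_2 = cmath.phase(H[2])  # phase lag introduced by the delay
```

Bins where the input carries no energy are returned as `None` because the ratio is undefined there, which is one practical limitation of the bin-wise estimate.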

Densely Connected Neural Network with Dilated Convolutions for Real-Time Speech Enhancement in The Time Domain

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp40776.2020.9054536 ◽

2020 ◽

Cited By ~ 1

Author(s):

Ashutosh Pandey ◽

DeLiang Wang

Keyword(s):

Neural Network ◽

Real Time ◽

Speech Enhancement ◽

Time Domain ◽

The Time Domain

Binarized Weight Neural-Network Inspired Ultra-Low Power Speech Recognition Processor with Time-Domain Based Digital-Analog Mixed Approximate Computing

2020 IEEE International Symposium on Circuits and Systems (ISCAS) ◽

10.1109/iscas45731.2020.9181172 ◽

2020 ◽

Author(s):

Bo Liu ◽

Hao Cai ◽

Yu Gong ◽

Wentao Zhu ◽

Yan Li ◽

...

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Low Power ◽

Time Domain ◽

Approximate Computing ◽

Ultra Low Power

Fault Detection of Induction Motors Using Fourier and Wavelet Analysis

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2004.p0431 ◽

2004 ◽

Vol 8 (4) ◽

pp. 431-436 ◽

Cited By ~ 3

Author(s):

Hyeon Bae ◽

Youn-Tae Kim ◽

Sungshin Kim ◽

Sang-Hyuk Lee ◽

...

Keyword(s):

Neural Network ◽

Fault Detection ◽

Wavelet Analysis ◽

Time Domain ◽

Induction Motors ◽

Fuzzy Inference ◽

Classification Model ◽

Diagnosis And Prognosis ◽

Signal Features ◽

The Time Domain

The motor is the workhorse of industry. Preventive and condition-based maintenance, online monitoring, system fault detection, diagnosis, and prognosis are of increasing importance. This paper introduces fault detection for induction motors. Stator currents are measured by current meters and stored in the time domain. Because the time domain is not well suited to representing current signals, the signals are converted to the frequency domain using the Fourier transform. After conversion, signal features are extracted by signal processing such as wavelet and spectrum analysis. The features are fed into a pattern classification model such as a neural network model, a polynomial neural network, or a fuzzy inference model. This paper describes fault detection results obtained with Fourier and wavelet analysis. This combined approach is very useful and powerful for detecting signal features.

In-Line Acoustic Device Inspection of Leakage in Water Distribution Pipes Based on Wavelet and Neural Network

Journal of Sensors ◽

10.1155/2017/5789510 ◽

2017 ◽

Vol 2017 ◽

pp. 1-10 ◽

Cited By ~ 2

Author(s):

Dileep Kumar ◽

Dezhan Tu ◽

Naifu Zhu ◽

Dibo Hou ◽

Hongjian Zhang

Keyword(s):

Neural Network ◽

Time Domain ◽

Water Distribution ◽

Acoustic Sensors ◽

Mother Wavelet ◽

Long Distance ◽

Proper Position ◽

Detection Techniques ◽

Acoustic Device ◽

The Time Domain

Leak detection techniques based on permanently installed acoustic sensors have traditionally proven very effective in water distribution pipes. However, these methods require long-distance deployment and proper positioning of sensors, and they cannot be implemented on underground pipelines. An in-line inspection device consisting of acoustic sensors has therefore been developed. The device travels with the water flow through the pipes, recording all noise events and detecting small leaks. Because it records every noise event, including background noise, the noisy time-domain acoustic signal cannot reveal complete features such as the leak flow rate, so the leak signal cannot be distinguished from environmental disturbance. This paper presents a modular algorithm structure that combines the capability of the wavelet transform to analyze leakage signals with the classification capability of artificial neural networks. The study confirms that the time domain does not reveal the complete features of noisy leak signals and shows the significance of the choice of mother wavelet for extracting noise event features in water distribution pipes. The simulation results show that an appropriate mother wavelet was selected and localized to extract the features of signals containing leak noise and background noise, and that the neural network implementation improves the classification performance on the extracted features.
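The idea of using wavelet coefficients as classifier features can be sketched with a small multi-level decomposition. The Haar wavelet is used here only because it is the simplest to write out; the abstract does not say which mother wavelet the authors selected, and the function names and the choice of sub-band energies as features are assumptions for illustration.

```python
import math


def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform.

    Returns (approximation, detail) coefficients; len(signal) must be even.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    detail = [s * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return approx, detail


def band_energies(signal, levels):
    """Energy of the detail band at each level, then the final approximation."""
    energies = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt(current)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in current))
    return energies
```

A vector such as `band_energies(signal, 3)` could then serve as input to a neural network classifier; a slowly varying signal concentrates its energy in the approximation band, while rapid fluctuations show up in the first detail band.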

Sound Visualization and Convolutional Neural Network in Fault Diagnosis of Electric Motorbike

Traitement du signal ◽

10.18280/ts.380626 ◽

2021 ◽

Vol 38 (6) ◽

pp. 1819-1827

Author(s):

Jian-Da Wu ◽

Che-Yuan Hsieh ◽

Wen-Jun Luo

Keyword(s):

Neural Network ◽

Fault Diagnosis ◽

Power Systems ◽

Convolutional Neural Network ◽

Frequency Domain ◽

Time Domain ◽

Electric Motor ◽

Combustion Engine ◽

Electric Power System ◽

The Time Domain

This study proposes training a convolutional neural network (CNN) to recognize different types of figures in order to diagnose electric motorbike faults. Traditional motorbike maintenance is usually carried out by technicians who search for the problem step by step, which wastes resources and time. Due to rising environmental protection awareness, motorbike power systems are gradually shifting from combustion engines to electric motors. The sound amplitude generated by a combustion engine is large and may mask other fault sounds. The sound amplitude of an electric power system is much lower, permitting various faults to be diagnosed from the electric motor sound. With the development of computers and image processing, deep learning neural networks for image recognition have become more feasible. This study presents sound visualization of the motor system for fault diagnosis. First, sound signals of the motor in five different operating states are obtained in the laboratory and in road tests, and the time-domain graph, frequency-domain graph, and spectrogram are drawn to be used as the test database. The resulting graphs for the various states are used to train a CNN, and the signal states are then classified to achieve fault diagnosis. Experiments and identification results show that the spectrogram combined with the CNN identifies motorbike faults more effectively than the time-domain graph or the frequency-domain graph.

STREMODO, ein innovatives Verfahren zur kontinuierlichen Erfassung der Stressbelastung von Schweinen bei Haltung und Transport

Archives Animal Breeding ◽

10.5194/aab-47-173-2004 ◽

2004 ◽

Vol 47 (2) ◽

pp. 173-181 ◽

Cited By ~ 1

Author(s):

G. Manteuffel ◽

P. C. Schön

Keyword(s):

Neural Network ◽

Time Domain ◽

Automatic System ◽

Time Windows ◽

Emotional States ◽

Stress Assessment ◽

Housing Systems ◽

Distress Calls ◽

The Time Domain ◽

Short Time

Abstract. Title of the paper: STREMODO, an innovative technique for continuous stress assessment of pigs in housing and transport. Vocal utterances of animals are the results of emotional states in specific situations. Therefore, distress calls of pigs can be used as indicators of impaired welfare. An automatic system was developed that responds selectively to stress vocalisations and that registers and records their number in the time domain. It can be applied in housing systems, during transport, and in abattoirs. The patented technique is based on sequential records of the actual sound events in short time windows (92 ms) and a parsimonious coding by 12 complex parameters (LPC coefficients). A subsequent artificial neural network trained with the respective parameters from porcine stress vocalisations is able to detect stress utterances with an error rate of less than 5% even in noisy stables.
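The coding step described in the abstract, short 92 ms windows each reduced to 12 LPC coefficients, can be sketched as below. This is a generic autocorrelation-method LPC using the Levinson-Durbin recursion, not the patented STREMODO implementation; the sample rate, the non-overlapping framing, and the function names are assumptions for illustration.

```python
def split_frames(signal, sample_rate, window_ms=92):
    """Cut a signal into consecutive non-overlapping frames of window_ms."""
    frame_len = int(sample_rate * window_ms / 1000)
    return [
        signal[i:i + frame_len]
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def autocorr(frame, max_lag):
    """Autocorrelation of a frame for lags 0..max_lag."""
    n = len(frame)
    return [
        sum(frame[t] * frame[t - lag] for t in range(lag, n))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order=12):
    """Levinson-Durbin recursion.

    Returns a[1..order] such that x[t] is predicted by sum(a[k] * x[t - k]).
    """
    r = autocorr(frame, order)
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or perfectly predicted frame: stop early
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]
```

A feature sequence for a recording would then be `[lpc(f) for f in split_frames(signal, sample_rate)]`, with each 12-element vector passed to the classifier.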

TCNN: Temporal Convolutional Neural Network for Real-time Speech Enhancement in the Time Domain

ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2019.8683634 ◽

2019 ◽

Cited By ~ 12

Author(s):

Ashutosh Pandey ◽

DeLiang Wang

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Real Time ◽

Speech Enhancement ◽

Time Domain ◽

The Time Domain

Convolutional neural network classifier for the output of the time-domain $\mathcal{F}$-statistic all-sky search for continuous gravitational waves

Machine Learning: Science and Technology ◽

10.1088/2632-2153/ab86c7 ◽

2020 ◽

Vol 1 (2) ◽

pp. 025016 ◽

Cited By ~ 3

Author(s):

Filip Morawski ◽

Michał Bejger ◽

Paweł Ciecieląg

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Gravitational Waves ◽

Time Domain ◽

Neural Network Classifier ◽

The Time Domain
