Improving Amharic Speech Recognition System Using Connectionist Temporal Classification with Attention Model and Phoneme-Based Byte-Pair-Encodings

Information · 2021 · Vol. 12 (2) · pp. 62
Author(s): Eshete Derb Emiru, Shengwu Xiong, Yaxing Li, Awet Fesseha, Moussa Diallo

Out-of-vocabulary (OOV) words are among the most challenging problems in automatic speech recognition (ASR), especially for morphologically rich languages. Most end-to-end speech recognition systems operate at the word or character level of a language. Amharic is a low-resourced but morphologically rich language. This paper proposes a hybrid connectionist temporal classification (CTC) with attention end-to-end architecture and a syllabification algorithm for an Amharic automatic speech recognition (AASR) system using phoneme-based subword units. The syllabification algorithm inserts the epenthetic vowel እ[ɨ], which is not covered by our Grapheme-to-Phoneme (G2P) conversion algorithm developed from consonant–vowel (CV) representations of Amharic graphemes. The proposed end-to-end model was trained on various Amharic subword units, namely characters, phonemes, character-based subwords, and phoneme-based subwords, the latter two generated by the byte-pair-encoding (BPE) segmentation algorithm. Experimental results showed that context-dependent phoneme-based subwords yield more accurate speech recognition than characters, phonemes, or character-based subwords. Further improvement was obtained by combining the proposed phoneme-based subwords with the syllabification algorithm and the SpecAugment data augmentation technique. The word error rate (WER) reduction was 18.38% compared to a baseline using character-based acoustic modeling with word-based recurrent neural network language modeling (RNNLM). These phoneme-based subword models are also useful for improving machine translation and speech translation tasks.
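The BPE segmentation the abstract relies on can be illustrated with a minimal sketch: repeatedly count adjacent symbol pairs over a corpus of phoneme sequences and merge the most frequent pair into a new subword unit. This is a generic BPE illustration under assumed toy data, not the paper's actual implementation or Amharic phoneme inventory.

```python
from collections import Counter

def bpe_merges(corpus, num_merges):
    """Learn BPE merge operations over symbol sequences.

    corpus: list of tuples of symbols (e.g. one phoneme sequence per word).
    Returns the list of learned merge pairs, most frequent first.
    """
    words = Counter(corpus)  # map each sequence to its frequency
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pairs = Counter()
        for word, freq in words.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the winning merge to every sequence in the corpus.
        merged = Counter()
        for word, freq in words.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            merged[tuple(out)] += freq
        words = merged
    return merges
```

For example, on a toy corpus where the phoneme pair ("s", "a") is most frequent, it becomes the first learned subword; real systems (e.g. subword-nmt or SentencePiece) follow the same core loop with a fixed merge budget.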

Accent is one of the challenges for speech recognition systems: automatic speech recognition systems must yield high performance across dialects. In this work, a Neutral Kannada Automatic Speech Recognition system is implemented using the Kaldi toolkit with monophone and triphone modelling. The acoustic models are constructed using monophone, triphone1, triphone2, and triphone3 techniques; in triphone modelling, context-dependent phones are grouped. Feature extraction is performed using Mel Frequency Cepstral Coefficients (MFCCs). System performance is analysed by measuring the Word Error Rate (WER) under the different acoustic models. To assess robustness across Kannada dialects, the system is also tested on North Kannada-accented speech. The Neutral Kannada system achieves a sentence accuracy of about 90%, but performance degrades to around 77% on the North Kannada accent. This degradation stems from the mismatch between training and testing data, since the system is trained only on a neutral Kannada acoustic model and does not include a North Kannada acoustic model. An interactive Kannada voice response system is implemented to recognize continuous Kannada speech sentences.
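Both abstracts evaluate systems by word error rate. WER is the word-level edit distance (substitutions + insertions + deletions) between a reference and a hypothesis transcript, divided by the number of reference words; a minimal dynamic-programming sketch, using made-up example transcripts:

```python
def wer(reference, hypothesis):
    """Word error rate: Levenshtein distance over words / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # match/substitution
    return d[len(ref)][len(hyp)] / len(ref)
```

One substitution in a three-word reference gives a WER of 1/3; toolkits such as Kaldi compute the same alignment at scale during scoring.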

