2019 ◽  
Vol 2019 ◽  
pp. 1-8 ◽  
Author(s):  
Edvin Pakoci ◽  
Branislav Popović ◽  
Darko Pekar

Serbian belongs to a group of highly inflective, morphologically rich languages that use many different word suffixes to express grammatical, syntactic, and semantic features. This behaviour typically causes many recognition errors, especially in large-vocabulary systems: even when good acoustic matching lets the automatic speech recognition system predict the correct lemma, a wrong word ending often occurs, and it is nevertheless counted as an error. The effect is more pronounced for contexts absent from the language model training corpus. In this manuscript, an approach that incorporates different morphological categories of words into language modeling is examined, and the resulting gains in word error rate and perplexity are presented. These categories include word type, grammatical case, number, and gender, and each was assigned to the words in the system vocabulary where applicable. The additional word features produced significant improvements over the baseline system, both for n-gram-based and neural-network-based language models. The proposed approach can help eliminate many tedious errors in large-vocabulary applications such as dictation, both for Serbian and for other languages with similar characteristics.
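To illustrate the general idea of conditioning on morphological categories rather than on surface forms alone, the following is a minimal sketch of a class-based bigram model. It is not the system described in the abstract: the lexicon entries, the feature tuple (word type, case, number, gender), and the factorization P(w | h) ≈ P(class(w) | class(h)) · P(w | class(w)) are all illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical morphological lexicon mapping surface words to a
# (word type, case, number, gender) tuple; entries are illustrative only.
LEXICON = {
    "velika":  ("ADJ",  "nom", "sg", "fem"),
    "velikog": ("ADJ",  "gen", "sg", "masc"),
    "kuća":    ("NOUN", "nom", "sg", "fem"),
    "kuće":    ("NOUN", "gen", "sg", "fem"),
}

def factor(word):
    """Map a word to its morphological class; unknown words are their own class."""
    return LEXICON.get(word, (word,))

class ClassBigramLM:
    """Toy factored bigram: P(w | h) ~ P(class(w) | class(h)) * P(w | class(w))."""

    def __init__(self):
        self.class_bigrams = defaultdict(lambda: defaultdict(int))
        self.class_counts = defaultdict(int)
        self.word_given_class = defaultdict(lambda: defaultdict(int))

    def train(self, sentences):
        for sent in sentences:
            prev = ("<s>",)
            for w in sent:
                c = factor(w)
                self.class_bigrams[prev][c] += 1   # class-level transition
                self.class_counts[c] += 1          # class frequency
                self.word_given_class[c][w] += 1   # emission of word from class
                prev = c

    def prob(self, word, prev_word):
        h = ("<s>",) if prev_word == "<s>" else factor(prev_word)
        c = factor(word)
        transitions = self.class_bigrams[h]
        total = sum(transitions.values())
        if total == 0 or self.class_counts[c] == 0:
            return 0.0  # unseen history or class (no smoothing in this sketch)
        p_class = transitions[c] / total
        p_word = self.word_given_class[c][word] / self.class_counts[c]
        return p_class * p_word
```

With such a factorization, a word ending that violates morphological agreement with the history (e.g. a genitive noun after a nominative adjective) receives a low class-transition probability even if the lemma itself is frequent, which is the kind of suffix error the abstract targets.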

