Real-time integration of dynamic context information for improving automatic speech recognition

This chapter introduces the potential of automatic speech recognition (ASR) technology for meeting the challenge of inclusive education. Combined with information and communication technology (ICT), ASR enhances the learning of disabled people both inside and outside the classroom. In the classroom, deaf and hearing-impaired students can benefit from a real-time transcription of what the teacher is saying. A real-time transcription also facilitates note taking for students with visual or physical disabilities. Outside the classroom, transcriptions and other media files (audio, slides, video, etc.) are powerful educational resources for all students, disabled or able-bodied. Some of the most relevant projects and systems around the world are described and compared in this chapter to provide updated information about ASR performance and its application to enhancing the learning of disabled students.

Real-time automatic speech recognition using HMM and neural networks

10.1109/its.1990.175597 ◽ 2002

Author(s): Y. Arriola ◽ R.A. Carrasco

Keyword(s): Neural Networks ◽ Speech Recognition ◽ Real Time ◽ Automatic Speech Recognition

Real-Time Bayesian Inference: A Soft Computing Approach to Environmental Learning for On-Line Robust Automatic Speech Recognition

Advances in Intelligent and Soft Computing - Soft Computing Models in Industrial and Environmental Applications, 6th International Conference SOCO 2011 ◽ 10.1007/978-3-642-19644-7_47 ◽ 2011 ◽ pp. 445-452 ◽ Cited By ~ 1

Author(s): Md Foezur Rahman Chowdhury ◽ Sid-Ahmed Selouani ◽ Douglas O’Shaughnessy

Keyword(s): Bayesian Inference ◽ Speech Recognition ◽ Real Time ◽ Soft Computing ◽ Automatic Speech Recognition ◽ Environmental Learning ◽ On Line ◽ Computing Approach

Automatic Speech Recognition for Real Time Systems

2019 IEEE Fifth International Conference on Multimedia Big Data (BigMM) ◽ 10.1109/bigmm.2019.00-26 ◽ 2019

Author(s): Ranjodh Singh ◽ Hemant Yadav ◽ Mohit Sharma ◽ Sandeep Gosain ◽ Rajiv Ratn Shah

Keyword(s): Speech Recognition ◽ Real Time ◽ Automatic Speech Recognition ◽ Real Time Systems ◽ Time Systems

Creating accessible educational multimedia through editing automatic speech recognition captioning in real time

Interactive Technology and Smart Education ◽ 10.1108/17415650680000058 ◽ 2006 ◽ Vol 3 (2) ◽ pp. 131-141 ◽ Cited By ~ 12

Author(s): Mike Wald

Keyword(s): Speech Recognition ◽ Real Time ◽ Automatic Speech Recognition ◽ Educational Multimedia

ExKaldi-RT: A Real-Time Automatic Speech Recognition Extension Toolkit of Kaldi

10.1109/gcce53005.2021.9621992 ◽ 2021

Author(s): Yu Wang ◽ Chee Siang Leow ◽ Akio Kobayashi ◽ Takehito Utsuro ◽ Hiromitsu Nishizaki

Keyword(s): Speech Recognition ◽ Real Time ◽ Automatic Speech Recognition

An Evaluation of Expedited Transcription Methods for School-Age Children's Narrative Language: Automatic Speech Recognition and Real-Time Transcription

Journal of Speech Language and Hearing Research ◽ 10.1044/2021_jslhr-21-00096 ◽ 2021 ◽ pp. 1-16

Author(s): Carly B. Fox ◽ Megan Israelsen-Augenstein ◽ Sharad Jones ◽ Sandra Laing Gillam

Keyword(s): Speech Recognition ◽ Real Time ◽ Automatic Speech Recognition ◽ Clinical Utility ◽ Language Disorder ◽ Speech Rate ◽ School Age Children ◽ School Age ◽ Speech Transcription ◽ Narrative Language

Purpose: This study examined the accuracy and potential clinical utility of two expedited transcription methods for narrative language samples elicited from school-age children (7;5–11;10 [years;months]) with developmental language disorder. Transcription methods included real-time transcription produced by speech-language pathologists (SLPs) and trained transcribers (TTs) as well as Google Cloud Speech automatic speech recognition.

Method: The accuracy of each transcription method was evaluated against a gold-standard reference corpus. Clinical utility was examined by determining the reliability of scores calculated from the transcripts produced by each method on several language sample analysis (LSA) measures. Participants included seven certified SLPs and seven TTs. Each participant was asked to produce a set of six transcripts in real time, out of a total of 42 language samples. The same 42 samples were transcribed using Google Cloud Speech. Transcription accuracy was evaluated through word error rate. Reliability of LSA scores was determined using correlation analysis.

Results: Google Cloud Speech was significantly more accurate than real-time transcription in transcribing narrative samples and was not impacted by the speech rate of the narrator. In contrast, SLP and TT transcription accuracy decreased as a function of increasing speech rate. LSA metrics generated from Google Cloud Speech transcripts were also more reliably calculated.

Conclusions: Automatic speech recognition showed greater accuracy and clinical utility as an expedited transcription method than real-time transcription. Though there is room for improvement in the accuracy of speech recognition for the purpose of clinical transcription, it produced highly reliable scores on several commonly used LSA metrics.

Supplemental Material: https://doi.org/10.23641/asha.15167355
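The abstract above reports transcription accuracy as word error rate. As a rough illustration only, the sketch below computes the conventional definition of that metric: the word-level edit distance (substitutions, deletions, insertions) between a hypothesis transcript and a reference, divided by the number of reference words. The function name `word_error_rate`, the lowercasing, and the whitespace tokenization are assumptions made for this example; the study does not describe its own normalization or scoring tool here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption for illustration: both transcripts are lowercased and split
    on whitespace; real evaluations often apply further normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word present in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One of six reference words is dropped by the (made-up) hypothesis.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference, so it is an error ratio rather than a bounded percentage.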

Hardware–Software Codesign of Automatic Speech Recognition System for Embedded Real-Time Applications

IEEE Transactions on Industrial Electronics ◽ 10.1109/tie.2009.2022520 ◽ 2011 ◽ Vol 58 (3) ◽ pp. 850-859 ◽ Cited By ~ 29

Author(s): Octavian Cheng ◽ Waleed Abdulla ◽ Zoran Salcic

Keyword(s): Speech Recognition ◽ Real Time ◽ Automatic Speech Recognition ◽ Recognition System ◽ Speech Recognition System ◽ Automatic Speech Recognition System ◽ Real Time Applications
