Two-step correction of speech recognition errors based on n-gram and long contextual information

Author(s): Ryohei Nakatani, Tetsuya Takiguchi, Yasuo Ariki
2020, Vol 2020, pp. 1-18
Author(s): Sonia Setia, Jyoti Verma, Neelam Duhan

The continuous growth of the World Wide Web has led to the problem of long access delays. To reduce this delay, prefetching techniques predict users' browsing behavior and fetch web pages before the user explicitly requests them. Making near-accurate predictions of users' search behavior is a complex task that researchers have faced for many years, and various web mining techniques have been applied to it. However, each of these methods has its own drawbacks. In this paper, a novel approach is proposed: a hybrid prediction model that integrates usage mining and content mining techniques to tackle the individual shortcomings of both approaches. The proposed method uses N-gram parsing along with the click counts of queries to capture more contextual information and thereby improve the prediction of web pages. Evaluation of the proposed hybrid approach on AOL search logs shows, on average, a 26% increase in prediction precision and a 10% increase in hit ratio compared to other mining techniques.
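The combination the abstract describes, transition statistics from usage logs plus click counts as a relevance signal, can be sketched as follows. This is an illustrative toy model, not the paper's actual formulation: the bigram representation, the multiplicative click-count weighting, and all names here are assumptions.

```python
from collections import defaultdict

# Hypothetical sketch of a hybrid next-page predictor: n-gram (here bigram)
# transitions over page-visit sessions, reweighted by per-page click counts.

def build_bigram_model(sessions):
    """Count page-to-page transitions across user sessions (usage mining)."""
    transitions = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for prev, nxt in zip(session, session[1:]):
            transitions[prev][nxt] += 1
    return transitions

def predict_next(transitions, current_page, click_counts, top_k=3):
    """Rank candidate next pages by transition count times click weight."""
    candidates = transitions.get(current_page, {})
    scored = {page: count * (1 + click_counts.get(page, 0))
              for page, count in candidates.items()}
    return sorted(scored, key=scored.get, reverse=True)[:top_k]

sessions = [["home", "news", "sports"],
            ["home", "news", "weather"],
            ["home", "mail"]]
clicks = {"sports": 5, "weather": 1, "mail": 2}
model = build_bigram_model(sessions)
print(predict_next(model, "news", clicks))  # ['sports', 'weather']
```

The pages ranked this way would then be prefetched into a cache; a higher-order n-gram would capture longer query context at the cost of sparser counts.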


2006, Vol 32 (3), pp. 417-438
Author(s): Diane Litman, Julia Hirschberg, Marc Swerts

This article focuses on the analysis and prediction of corrections, defined as turns in which a user tries to correct a prior error made by a spoken dialogue system. We describe our labeling procedure for various correction types and statistical analyses of their features in a corpus collected from a train information spoken dialogue system. We then present the results of machine-learning experiments designed to identify user corrections of speech recognition errors. We investigate the predictive power of features automatically computable from the prosody of the turn, the speech recognition process, experimental conditions, and the dialogue history. Our best-performing features reduce classification error from baselines of 25.70–28.99% to 15.72%.
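The feature families the abstract names (prosody, recognizer output, dialogue history) could feed a correction classifier along these lines. This is a toy hand-written rule, not the machine-learned models the article evaluates; every feature name and threshold here is a hypothetical stand-in.

```python
# Illustrative sketch of flagging a user turn as a correction from
# automatically computable features. The intuition: corrections are often
# hyperarticulated (e.g. raised pitch) and tend to follow turns the system
# misrecognized or rejected. Thresholds below are invented for the example.

def extract_features(turn):
    """Features drawn from prosody, ASR output, and dialogue history."""
    return {
        "f0_max": turn["f0_max"],                # prosody: pitch peak (Hz)
        "asr_confidence": turn["asr_conf"],      # recognizer confidence
        "prev_turn_rejected": turn["prev_rej"],  # dialogue history
    }

def is_correction(turn, f0_thresh=220.0, conf_thresh=0.4):
    """Toy rule: high-pitch turns after a rejection, or very low-confidence
    turns, are flagged as likely corrections."""
    f = extract_features(turn)
    if f["prev_turn_rejected"] and f["f0_max"] > f0_thresh:
        return True
    return f["asr_confidence"] < conf_thresh

turn = {"f0_max": 250.0, "asr_conf": 0.8, "prev_rej": True}
print(is_correction(turn))  # True
```

In the article's setting, a learned model (e.g. a decision tree over such features) replaces the hand-set thresholds.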


2020, pp. 1237-1247
Author(s): Xiangdong Wang, Yang Yang, Hong Liu, Yueliang Qian, Duan Jia

In real-world applications of speech recognition, recognition errors are inevitable and manual correction is necessary. This paper presents an approach for refining Mandarin speech recognition results by exploiting user feedback. An interface incorporating character-based candidate lists and feedback-driven updating of those lists is introduced. For dynamic updating of the candidate lists, a novel method based on lattice modification and rescoring is proposed. By adding words with pronunciations similar to the candidates adjacent to the corrected character into the lattice and then rescoring the modified lattice, the proposed method improves the accuracy of the candidate lists even when the correct characters are not in the original lattice, at a much lower computational cost than speech re-recognition. Experimental results show that the proposed method reduces user inputs by 24.03% and improves the average candidate rank by 25.31%.
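The lattice-modification idea can be illustrated with a minimal sketch: after the user corrects one character, characters sharing a pronunciation are inserted as extra arcs at the affected position, and candidates are re-ranked. The lattice representation, the homophone lexicon, and the flat insertion score below are all assumptions for illustration, not the paper's actual data structures or scoring.

```python
# Toy homophone lexicon (pinyin -> characters); an assumed resource.
HOMOPHONES = {"shi4": ["是", "事", "市"], "ji4": ["记", "技", "际"]}

def modify_lattice(lattice, position, pinyin):
    """Add arcs for same-pronunciation characters at the given position,
    so correct characters absent from the original lattice become reachable."""
    extra = [(ch, 0.1) for ch in HOMOPHONES.get(pinyin, [])
             if ch not in dict(lattice[position])]
    lattice[position] = lattice[position] + extra
    return lattice

def best_candidates(lattice, position, top_k=3):
    """Rescore: rank candidate characters at a position by arc score."""
    arcs = sorted(lattice[position], key=lambda a: a[1], reverse=True)
    return [ch for ch, _ in arcs[:top_k]]

# One position of a toy character lattice: (character, posterior score).
lattice = {1: [("是", 0.6), ("时", 0.3)]}
modify_lattice(lattice, 1, "shi4")
print(best_candidates(lattice, 1))  # ['是', '时', '事']
```

Because only the affected lattice positions are touched and rescored, this stays far cheaper than re-running recognition on the audio, which is the cost advantage the paper reports.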

