A Hybrid Language Model for Handwritten Chinese Sentence Recognition

Author(s):  
Qizhen He ◽  
Shijie Chen ◽  
Mingxi Zhao ◽  
Wei Lin
2013 ◽  
Author(s):  
Kwanchiva Thangthai ◽  
Ananlada Chotimongkol ◽  
Chai Wutiwiwatchai

2020 ◽  
Vol 3 (2) ◽  
Author(s):  
Mercedes Martínez-Lorenzo

Speakers of any language, minoritised or majority, sometimes make language mistakes. Bilingual speakers may use a hybrid language, mixing languages within a sentence or even within a word, especially when the languages are formally similar, as Spanish and Galician are. For minoritised languages, language errors may contribute to a negative perception of the language. The Galician public broadcaster Televisión de Galicia (TVG) has been criticised for failing to serve as a high-quality language model and for allowing language mistakes into its content. From an exclusively linguistic viewpoint, these errors should be corrected in subtitling. Conversely, subtitling guides and target users favour a verbatim rendition of the audio, in which oral language mistakes should not be corrected. Dialectal features, even if they are not considered errors, are non-standard language. This paper addresses the question of whether “to correct or not to correct” oral errors and dialectal features in the case of minoritised languages. It presents the most relevant data from a literature review and from an analysis of subtitling guidelines and standards and of current practices at TVG. These results have yielded an original protocol for the correction or reproduction of oral errors according to speech control, target audience and broadcast genre, the effect of a mistake, and the type of language error (vocabulary vs. grammar).


Author(s):  
XIAOLONG WANG ◽  
DANIEL S. YEUNG ◽  
JAMES N. K. LIU ◽  
ROBERT LUK ◽  
XUAN WANG

Language modeling is an active research topic in many domains, including speech recognition, optical character recognition, handwriting recognition, machine translation and spelling correction. There are two main types of language model: the mathematical and the linguistic. The most widely used mathematical language model is the n-gram model, which is inferred from statistics. This model has three problems: long-distance restriction, recursive nature and partial language understanding. Language models based on linguistics, in turn, present many difficulties when applied to large-scale real texts. We present here a new hybrid language model that combines the advantages of the n-gram statistical language model with those of a linguistic language model that makes use of grammatical or semantic rules. Using suitable rules, this hybrid model can solve problems such as long-distance restriction, recursive nature and partial language understanding. The new language model has been effective in experiments and has been incorporated in Chinese sentence input products for Windows and Macintosh OS.
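
As an illustration of the statistical side only, the following is a minimal sketch of a bigram (n = 2) model estimated by maximum likelihood. It is not the hybrid model from the abstract; the toy corpus, the function names `train_bigram` and `bigram_prob`, and the boundary markers `<s>` and `</s>` are assumptions made for this example. It also hints at the long-distance restriction: the probability of "barks" after "yard" is the same whatever the subject of the sentence was.

```python
from collections import Counter

def train_bigram(sentences):
    """Count contexts and bigrams over tokenized sentences (hypothetical helper)."""
    context_counts = Counter()
    bigram_counts = Counter()
    for tokens in sentences:
        # Pad with assumed sentence-boundary markers.
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts

def bigram_prob(prev, cur, context_counts, bigram_counts):
    """Maximum-likelihood estimate P(cur | prev) = count(prev, cur) / count(prev)."""
    if context_counts[prev] == 0:
        return 0.0  # unseen context; a real system would apply smoothing
    return bigram_counts[(prev, cur)] / context_counts[prev]

# Toy corpus, invented for illustration only.
corpus = [
    ["the", "dog", "barks"],
    ["the", "dogs", "bark"],
    ["the", "dog", "in", "the", "yard", "barks"],
]
context_counts, bigram_counts = train_bigram(corpus)

p_dog_barks = bigram_prob("dog", "barks", context_counts, bigram_counts)
# The model only sees the previous word, so the subject ("dog" or "dogs")
# has no influence on the word that follows "yard".
p_yard_barks = bigram_prob("yard", "barks", context_counts, bigram_counts)
print(p_dog_barks, p_yard_barks)
```

A rule-based component of the kind the abstract describes could, for instance, check subject-verb agreement across the intervening words, which a fixed two-word window cannot do.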


2020 ◽  
Vol 63 (11) ◽  
pp. 3855-3864
Author(s):  
Wanting Huang ◽  
Lena L. N. Wong ◽  
Fei Chen ◽  
Haihong Liu ◽  
Wei Liang

Purpose: Fundamental frequency (F0) is the primary acoustic cue for lexical tone perception in tonal languages but is processed in a limited way in cochlear implant (CI) systems. The aim of this study was to evaluate the importance of F0 contours in sentence recognition in Mandarin-speaking children with CIs and to determine whether it differs from that in age-matched normal-hearing (NH) peers.

Method: Age-appropriate sentences, with F0 contours manipulated to be either natural or flattened, were randomly presented to preschool children with CIs and their age-matched peers with NH under three test conditions: in quiet, in white noise, and with competing sentences at 0 dB signal-to-noise ratio.

Results: The neutralization of F0 contours resulted in a significant reduction in sentence recognition. While this was seen only in noise conditions among NH children, it was observed throughout all test conditions among children with CIs. Moreover, the F0 contour-induced accuracy reduction ratios (i.e., the reduction in sentence recognition resulting from the neutralization of F0 contours compared to the normal F0 condition) were significantly greater in children with CIs than in NH children in all test conditions.

Conclusions: F0 contours play a major role in sentence recognition in both quiet and noise among pediatric implantees, and the contribution of the F0 contour is even more salient than that in age-matched NH children. These results also suggest that there may be differences between children with CIs and NH children in how F0 contours are processed.
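
The abstract describes the accuracy reduction ratio only in words. One plausible reading, sketched below, is the drop in recognition accuracy relative to the natural-F0 condition. The exact formula is an assumption, as are the function name `reduction_ratio` and the example scores, which are invented and are not results from the study.

```python
def reduction_ratio(acc_natural, acc_flattened):
    """Relative drop in accuracy when F0 contours are flattened.

    Assumed definition: (natural - flattened) / natural.
    Both arguments are accuracy scores on the same scale (e.g. percent correct).
    """
    if acc_natural <= 0:
        raise ValueError("natural-F0 accuracy must be positive")
    return (acc_natural - acc_flattened) / acc_natural

# Hypothetical scores in percent correct, for illustration only.
example_ratio = reduction_ratio(80.0, 60.0)
print(example_ratio)
```

Normalizing by the natural-F0 score makes the ratio comparable across groups whose baseline accuracy differs, which is presumably why a ratio rather than a raw difference is reported.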

