A Cache Language Model for Whole Document Handwriting Recognition

Author(s):  
Volkmar Frinken ◽  
Dimosthenis Karatzas ◽  
Andreas Fischer

Author(s):  
Christopher Tensmeyer ◽  
Curtis Wigington ◽  
Brian Davis ◽  
Seth Stewart ◽  
Tony Martinez ◽  
...  

Author(s):  
U.-V. MARTI ◽  
H. BUNKE

In this paper, a system for the reading of totally unconstrained handwritten text is presented. The kernel of the system is a hidden Markov model (HMM) for handwriting recognition. This HMM is enhanced by a statistical language model. Thus linguistic knowledge beyond the lexicon level is incorporated in the recognition process. Another novel feature of the system is that the HMM is applied in such a way that the difficult problem of segmenting a line of text into individual words is avoided. A number of experiments with various language models and large vocabularies have been conducted. The language models used in the system were also analytically compared based on their perplexity.
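The perplexity comparison mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bigram model, the add-alpha smoothing, and the names `train_bigram` and `perplexity` are assumptions chosen for illustration. Perplexity is computed as 2 raised to the negative average log2 probability per predicted token, so a lower value means the model predicts the text better.

```python
import math
from collections import Counter


def train_bigram(sentences, alpha=1.0):
    """Hypothetical helper: fit a bigram model with add-alpha smoothing.

    Returns a function prob(prev, word). Words outside the training
    vocabulary are not handled specially in this sketch.
    """
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        vocab.update(tokens)
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    v = len(vocab)

    def prob(prev, word):
        return (bigram_counts[(prev, word)] + alpha) / (context_counts[prev] + alpha * v)

    return prob


def perplexity(prob, sentences):
    """Perplexity = 2 ** (-(1/N) * sum of log2 P(word | prev))."""
    log_sum = 0.0
    n = 0
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            log_sum += math.log2(prob(prev, word))
            n += 1
    return 2 ** (-log_sum / n)


if __name__ == "__main__":
    train = [["the", "cat"], ["the", "dog"]]
    model = train_bigram(train)
    print("perplexity on training text:", perplexity(model, train))
    print("perplexity on reordered text:", perplexity(model, [["cat", "the"]]))
```

Two language models can be compared by evaluating both on the same held-out text; under this measure, the one with the lower perplexity fits that text better.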


Author(s):  
MARCUS LIWICKI ◽  
HORST BUNKE

This paper presents a system for the recognition of online whiteboard notes. Notes written on a whiteboard are a new modality in handwriting recognition research that has received relatively little attention in the past. For recognition, we use an offline HMM recognizer, which is supplemented with methods for processing the online data and generating offline images. The system consists of six main modules: online preprocessing, transformation of online to offline data, offline preprocessing, feature extraction, classification, and post-processing. The recognition rate of our basic recognizer in a writer-independent experiment is 59.5%. By applying state-of-the-art methods, such as optimizing the number of states and Gaussian components, and by including a language model, we achieved a statistically significant increase of the recognition rate to 64.3%. To further improve system performance, we increased the size of the training set, investigating two different strategies. First, we used another existing database of offline handwritten text. Second, we used a recently collected whiteboard database, called the IAM-OnDB. With these strategies, the recognition rate could be further increased to 68.5%.

