General Models for Handwritten Text Recognition: Feasibility and State-of-the Art. German Kurrent as an Example

2021 ◽  
Vol 7 ◽  
Author(s):  
Tobias Hodel ◽  
David Schoch ◽  
Christa Schneider ◽  
Jake Purcell
Author(s):  
Mohamed Elleuch ◽  
Monji Kherallah

In recent years, deep learning (DL) based systems have become very popular for constructing hierarchical representations from unlabeled data. Moreover, DL approaches have been shown to outperform previous state-of-the-art machine learning models in various areas, pattern recognition being one of the more important cases. This paper applies Convolutional Deep Belief Networks (CDBN) to textual image data containing Arabic handwritten script (AHS) and evaluates them on two databases that differ in dimensionality (low and high). In addition to the benefits provided by deep networks, the system is protected against over-fitting. Experimentally, the authors demonstrate that the extracted features are effective for handwritten character recognition and show very good performance, comparable to the state of the art in handwritten text recognition. Using Dropout, the proposed CDBN architectures achieved promising accuracy rates of 91.55% and 98.86% when applied to the IFN/ENIT and HACDB databases, respectively.


2021 ◽  
Vol 7 (12) ◽  
pp. 260
Author(s):  
Lazaros Tsochatzidis ◽  
Symeon Symeonidis ◽  
Alexandros Papazoglou ◽  
Ioannis Pratikakis

Offline handwritten text recognition (HTR) for historical documents aims for effective transcription by addressing challenges that originate from the low quality of the manuscripts under study as well as from several particularities related to the historical period of writing. This paper focuses on the transcription of Greek historical manuscripts, which contain several such particularities. To this end, a convolutional recurrent neural network architecture is proposed that comprises octave convolution and recurrent units which use effective gated mechanisms. The proposed architecture has been evaluated on three newly created collections of Greek historical handwritten documents, which will be made publicly available for research purposes, as well as on standard datasets such as IAM and RIMES. For evaluation, we perform a concise study which shows that, compared to state-of-the-art architectures, the proposed one deals effectively with the challenging Greek historical manuscripts.


2020 ◽  
Vol 10 (21) ◽  
pp. 7711
Author(s):  
Arthur Flor de Sousa Neto ◽  
Byron Leite Dantas Bezerra ◽  
Alejandro Héctor Toselli

The increasing portability of physical manuscripts to the digital environment makes it common for systems to offer automatic mechanisms for offline Handwritten Text Recognition (HTR). However, several scenarios and writing variations bring challenges in recognition accuracy, and, to minimize this problem, optical models can be used with language models to assist in decoding text. Thus, with the aim of improving results, dictionaries of characters and words are generated from the dataset and linguistic restrictions are created in the recognition process. This work instead proposes the use of spelling correction techniques for text post-processing to achieve better results and eliminate the linguistic dependence between the optical model and the decoding stage. In addition, an encoder–decoder neural network architecture and a training methodology are developed and presented to achieve the goal of spelling correction. To demonstrate the effectiveness of this new approach, we conducted an experiment on five datasets of text lines widely known in the field of HTR, three state-of-the-art optical models for text recognition, and eight spelling correction techniques, ranging from traditional statistical methods to current neural network approaches in the field of Natural Language Processing (NLP). Finally, our proposed spelling correction model is analyzed statistically through HTR system metrics, reaching an average sentence correction 54% higher than the state-of-the-art decoding method on the tested datasets.
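The idea of correcting recognized text as a separate post-processing step can be illustrated with a minimal sketch. This is not the encoder–decoder model proposed in the abstract; it is a simple dictionary-lookup corrector based on edit distance, and all names (`build_vocabulary`, `correct_word`, `correct_line`), the `max_distance` threshold, and the toy data are assumptions made for illustration only.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def build_vocabulary(lines):
    """Count word frequencies in the (hypothetical) training transcriptions."""
    return Counter(word for line in lines for word in line.split())


def correct_word(word, vocab, max_distance=2):
    """Return the closest vocabulary word, or the input if none is close enough."""
    if word in vocab:
        return word
    # Prefer the smallest edit distance; break ties by higher frequency.
    best = min(vocab, key=lambda w: (edit_distance(word, w), -vocab[w]),
               default=word)
    return best if edit_distance(word, best) <= max_distance else word


def correct_line(line, vocab, max_distance=2):
    """Correct a recognized text line word by word, independently of the optical model."""
    return " ".join(correct_word(w, vocab, max_distance) for w in line.split())


# Toy data standing in for dataset transcriptions and an optical model's output.
training_lines = ["the quick brown fox", "the lazy dog", "a quick dog"]
vocab = build_vocabulary(training_lines)
corrected = correct_line("tke quick brwn dog", vocab)
```

Because the corrector only sees text, it could in principle be applied to the output of any optical model, which is the decoupling the abstract argues for; a learned encoder–decoder corrector would replace `correct_line` while keeping the same interface.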


Author(s):  
Sri. Yugandhar Manchala ◽  
Jayaram Kinthali ◽  
Kowshik Kotha ◽  
Kanithi Santosh Kumar, Jagilinki Jayalaxmi ◽  

2021 ◽  
Author(s):  
Ayan Kumar Bhunia ◽  
Shuvozit Ghose ◽  
Amandeep Kumar ◽  
Pinaki Nath Chowdhury ◽  
Aneeshan Sain ◽  
...  
