A graph-based segmentation and feature extraction framework for Arabic text recognition

Author(s): A.M. Elgammal, M.A. Ismail
IEEE Access, 2021, Vol 9, pp. 18569-18584
Author(s): Najoua Rahal, Maroua Tounsi, Amir Hussain, Adel M. Alimi
Author(s): Humoud B. Al-Sadoun, Adnan Amin

This paper proposes a new structural technique for Arabic text recognition. The technique consists of five major steps: (1) preprocessing and binarization; (2) thinning; (3) binary tree construction; (4) segmentation; and (5) recognition. Its advantage is that its execution depends on neither the font nor the size of the characters, so the same technique might be used to recognize both machine-printed and hand-printed text. The algorithm was implemented on a microcomputer. Experiments were conducted to verify its accuracy and speed using about 20,000 subwords, written in different fonts, with an average length of 3 characters each. The experiments yielded a recognition rate of 93.38% at a speed of 2.7 characters per second.
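The first two steps listed above can be illustrated with a minimal Python sketch. The abstract does not say which binarization or thinning algorithms the authors used, so this example assumes a fixed global threshold and the well-known Zhang-Suen thinning scheme as stand-ins. The function names (`binarize`, `thin`) and the threshold value are hypothetical, and steps (3) to (5) are not covered.

```python
def binarize(gray, threshold=128):
    """Map a grayscale grid (0-255) to 1 for ink (dark) and 0 for background.

    The fixed threshold is an assumption; the paper's method is not stated.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def _neighbours(img, r, c):
    # Eight neighbours P2..P9, clockwise, starting from the pixel above.
    return [img[r - 1][c], img[r - 1][c + 1], img[r][c + 1], img[r + 1][c + 1],
            img[r + 1][c], img[r + 1][c - 1], img[r][c - 1], img[r - 1][c - 1]]


def thin(img):
    """Zhang-Suen thinning (a stand-in for the paper's unspecified thinning step).

    Assumes the image has a one-pixel background border.
    """
    img = [row[:] for row in img]
    rows, cols = len(img), len(img[0])
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(1, rows - 1):
                for c in range(1, cols - 1):
                    if img[r][c] != 1:
                        continue
                    n = _neighbours(img, r, c)
                    ink = sum(n)  # number of ink neighbours
                    # number of 0 -> 1 transitions around the pixel
                    trans = sum(1 for i in range(8)
                                if n[i] == 0 and n[(i + 1) % 8] == 1)
                    p2, p4, p6, p8 = n[0], n[2], n[4], n[6]
                    if step == 0:
                        cond = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        cond = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= ink <= 6 and trans == 1 and cond:
                        to_clear.append((r, c))
            # Clear all flagged pixels at once, after the scan.
            for r, c in to_clear:
                img[r][c] = 0
            if to_clear:
                changed = True
    return img


# Toy input: a 3-pixel-thick dark bar on a white background.
gray = [[0 if 1 <= r <= 3 and 1 <= c <= 7 else 255 for c in range(9)]
        for r in range(5)]
binary = binarize(gray)
skeleton = thin(binary)
for row in skeleton:
    print("".join("#" if v else "." for v in row))
```

Because a skeleton keeps a stroke's connectivity and discards its thickness, a structural method built on it can plausibly be independent of font and size, which is the advantage the abstract claims.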


2015, Vol 2015, pp. 1-7
Author(s): Mohammad S. Khorsheed

Feature extraction plays an important role in text recognition, as it aims to capture the essential characteristics of the text image. Feature extraction algorithms range widely, from features that are robust but hard to extract to features that are easy to extract but sensitive to noise. Among these feature types are statistical features, which are derived from the statistical distribution of the image pixels. This paper presents a novel method for feature extraction in which simple statistical features are extracted from a one-pixel-wide window that slides across the text line. The feature set is clustered in the feature space using vector quantization. The resulting feature vector sequence is then fed to a classification engine for training and recognition. The recognition system is applied to a data corpus of cursive Arabic text comprising more than 600 A4-size sheets typewritten in multiple computer-generated fonts. The system's performance is compared with that of a previously published system that uses a similar engine but a different feature set.
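The sliding-window idea described above can be sketched in Python as follows. The abstract does not list the statistical features or say how the codebook is trained, so the three per-column features used here (ink density, normalised ink centroid, number of ink/background transitions) and the hand-written codebook are illustrative assumptions, and the function names are hypothetical.

```python
def column_features(line_img):
    """Return one feature vector per pixel column of a binary line image (1 = ink).

    Each column plays the role of the one-pixel-wide sliding window.
    The three features are assumptions made for illustration only.
    """
    height = len(line_img)
    feats = []
    for c in range(len(line_img[0])):
        col = [line_img[r][c] for r in range(height)]
        ink = sum(col)
        # Mean row index of the ink pixels, scaled to [0, 1]; 0.0 if the column is empty.
        if ink:
            centroid = sum(r for r, v in enumerate(col) if v) / ink / max(height - 1, 1)
        else:
            centroid = 0.0
        # Count of changes between ink and background down the column.
        transitions = sum(1 for a, b in zip(col, col[1:]) if a != b)
        feats.append((ink / height, centroid, float(transitions)))
    return feats


def quantize(feats, codebook):
    """Vector quantization: map each feature vector to its nearest codeword index."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return [min(range(len(codebook)), key=lambda k: dist2(f, codebook[k]))
            for f in feats]


# Toy 4x5 text-line image and a hypothetical 3-entry codebook.
line_img = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
codebook = [(0.0, 0.0, 0.0), (0.5, 0.5, 2.0), (0.75, 0.3, 1.0)]

features = column_features(line_img)
symbols = quantize(features, codebook)
print(symbols)  # [0, 1, 2, 1, 0]
```

In a full system the codebook would be learned from training data (for example with a k-means style procedure), and the resulting symbol sequence is what a classification engine would consume.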

