Yarmouk Arabic OCR Dataset

<p><span>Optical character recognition (OCR) is the process of converting document images into editable and searchable text. OCR poses several challenges, particularly for the Arabic language, where it produces a high percentage of errors. This paper proposes a method for improving the output of Arabic optical character recognition (AOCR) systems, based on a statistical language model built from large available corpora. The method detects and corrects non-word and real-word errors according to the context of the word in the sentence. The reported results show an improvement that brings the accuracy of AOCR output up to 98%. </span></p>
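The abstract does not give the details of the language model, so the following is only a minimal sketch of the general idea, under stated assumptions: a bigram model with add-one smoothing, candidates generated by Levenshtein distance, and left context only. The function names (`train_bigram`, `edit_distance`, `correct`), the `max_dist` parameter, and the toy Latin-script corpus are all hypothetical and chosen for illustration; they are not taken from the paper. A word is treated as a non-word error if it is absent from the training vocabulary, and as a suspected real-word error if it is in the vocabulary but never followed the previous word in training.

```python
from collections import Counter

START = "<s>"  # hypothetical sentence-start marker


def train_bigram(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = [START] + sentence.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def edit_distance(a, b):
    """Levenshtein distance computed with a rolling row."""
    prev_row = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        row = [i]
        for j, cb in enumerate(b, 1):
            row.append(min(prev_row[j] + 1,                # deletion
                           row[j - 1] + 1,                 # insertion
                           prev_row[j - 1] + (ca != cb)))  # substitution
        prev_row = row
    return prev_row[-1]


def correct(sentence, unigrams, bigrams, max_dist=1):
    """Replace non-words and context-unsupported words with the best candidate."""
    vocab = [w for w in unigrams if w != START]
    out = [START]
    for word in sentence.split():
        prev = out[-1]
        known = unigrams[word] > 0
        # A known word whose bigram with the previous word was seen is kept.
        if known and bigrams[(prev, word)] > 0:
            out.append(word)
            continue
        # The observed word goes first, so it wins ties when it is known.
        candidates = [word] if known else []
        candidates += [w for w in vocab
                       if w != word and edit_distance(word, w) <= max_dist]
        if not candidates:
            out.append(word)  # nothing close enough, leave unchanged
            continue

        def score(w):
            # Add-one smoothed estimate of P(w | prev).
            return (bigrams[(prev, w)] + 1) / (unigrams[prev] + len(vocab))

        out.append(max(candidates, key=score))
    return " ".join(out[1:])


corpus = ["the cat sat on the mat",
          "the cat ate the fish",
          "a hat is on the table"]
unigrams, bigrams = train_bigram(corpus)
print(correct("the cat sat on the mxt", unigrams, bigrams))  # non-word error
print(correct("the hat sat on the mat", unigrams, bigrams))  # real-word error
```

On this toy corpus both calls print `the cat sat on the mat`. A realistic system would train on a large Arabic corpus, use right context as well, and weight candidates by OCR-specific character confusions rather than plain edit distance.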


A P2P grid architecture for distributed Arabic OCR based on the DTW algorithm

International Journal of Computers and Applications ◽

10.2316/journal.202.2009.1.202-2587 ◽

2009 ◽

Vol 31 (1) ◽

Cited By ~ 1

Author(s):

M. Khemakhem ◽

A. Belghith

Keyword(s):

Grid Architecture ◽

Arabic OCR


Arabic OCR error correction using character segment correction, language modeling, and shallow morphology

Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing - EMNLP '06 ◽

10.3115/1610075.1610132 ◽

2006 ◽

Cited By ~ 15

Author(s):

Walid Magdy ◽

Kareem Darwish

Keyword(s):

Error Correction ◽

Language Modeling ◽

Arabic OCR ◽

Character Segment


Arabic Optical Character Recognition

Applied Signal and Image Processing ◽

10.4018/978-1-60960-477-6.ch019 ◽

2011 ◽

pp. 324-346 ◽

Cited By ~ 1

Author(s):

Husni Al-Muhtaseb ◽

Rami Qahwaji

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Arabic Language ◽

Text Recognition ◽

Text Segmentation ◽

Future Trends ◽

Optical Character ◽

Arabic OCR ◽

Processing Techniques ◽

Arabic Speaking

Arabic text recognition is receiving more attention from both Arabic-speaking and non-Arabic-speaking researchers. This chapter provides a general overview of the state of the art in Arabic Optical Character Recognition (OCR) and the associated text recognition technology. It also investigates the characteristics of the Arabic language with respect to OCR and discusses related research on the different phases of text recognition, including pre-processing and text segmentation, common feature extraction techniques, classification methods, and post-processing techniques. Moreover, the chapter discusses the available databases for Arabic OCR research and lists the available commercial software. Finally, it explores the challenges related to Arabic OCR and discusses possible future trends.


Selecting most efficient Arabic OCR features extraction methods using Key Performance Indicators

CCCA12 ◽

10.1109/ccca.2012.6417861 ◽

2012 ◽

Cited By ~ 1

Author(s):

Reem Kabbani

Keyword(s):

Performance Indicators ◽

Key Performance Indicators ◽

Extraction Methods ◽

Features Extraction ◽

Arabic OCR
