Automatic CNN-Based Arabic Numeral Spotting and Handwritten Digit Recognition by Using Deep Transfer Learning in Ottoman Population Registers

Yekta Said Can; M. Erdem Kabadayı

doi:10.3390/app10165430

Applied Sciences ◽

10.3390/app10165430 ◽

2020 ◽

Vol 10 (16) ◽

pp. 5430 ◽

Cited By ~ 2

Author(s):

Yekta Said Can ◽

M. Erdem Kabadayı

Keyword(s):

Transfer Learning ◽

High Performance ◽

Arabic Numeral ◽

Color Filter ◽

Historical Documents ◽

Historical Inquiry ◽

Page Segmentation ◽

Digit Recognition ◽

Handwritten Text Recognition ◽

Handwritten Digit

Historical manuscripts and archival documents are handwritten texts that form the backbone of historical inquiry. Recent developments in the digital humanities and the need to extract information from historical documents have accelerated digitization efforts. Cutting-edge machine learning methods are applied to extract meaning from these documents: page segmentation (layout analysis), keyword, number, and symbol spotting, and handwritten text recognition algorithms have all been tested on historical documents. For most languages, these techniques are widely studied and high-performance methods have been developed. However, the properties of Arabic scripts (i.e., diacritics, varying script styles, and ligatures) create additional problems for these algorithms, and the number of studies is therefore limited. In this research, we automatically spotted and then recognized the Arabic numerals in the very first series of population registers of the Ottoman Empire, conducted in the mid-nineteenth century. These numerals are important because they record the number of households, the number of registered individuals, and the ages of individuals. Taking advantage of the structure of the studied registers, in which numerals are written in red, we applied a red color filter to separate the numerals from the rest of the document. In the first part, we used a CNN-based segmentation method to spot these numerals. In the second part, we annotated a local Arabic handwritten digit dataset from the spotted numerals by selecting the single-digit ones, and tested deep transfer learning from large open Arabic handwritten digit datasets for digit recognition. We achieved promising results for recognizing digits in these historical documents.
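The red color filter mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `red_mask`, the HSV thresholds, and the toy pixel values are all assumptions chosen for illustration, and real scans would need thresholds tuned to the ink and paper.

```python
import colorsys

def red_mask(pixels, hue_tol=0.05, min_sat=0.4, min_val=0.3):
    """Return a binary mask (1 = likely red ink) for a 2D grid of (R, G, B) tuples in 0-255.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    mask = []
    for row in pixels:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            # Red sits at both ends of the hue circle (near 0.0 and near 1.0).
            is_red_hue = h <= hue_tol or h >= 1.0 - hue_tol
            # Require enough saturation and brightness to reject paper and black ink.
            keep = is_red_hue and s >= min_sat and v >= min_val
            mask_row.append(1 if keep else 0)
        mask.append(mask_row)
    return mask

# Toy 2x3 "page": beige paper, red ink, black ink, blue ink (hypothetical values).
page = [
    [(235, 225, 200), (180, 30, 35), (40, 35, 30)],
    [(190, 40, 45), (235, 225, 200), (60, 60, 160)],
]
mask = red_mask(page)
print(mask)
```

Only the pixels flagged in the mask would be passed on to the numeral-spotting stage; working in HSV rather than RGB keeps the test largely independent of how dark or faded the ink is.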

High Performance Classifiers Combination for Handwritten Digit Recognition

Pattern Recognition and Data Mining - Lecture Notes in Computer Science ◽

10.1007/11551188_68 ◽

2005 ◽

pp. 619-626 ◽

Cited By ~ 4

Author(s):

Hubert Cecotti ◽

Szilárd Vajda ◽

Abdel Belaïd

Keyword(s):

High Performance ◽

Handwritten Digit Recognition ◽

Digit Recognition ◽

Handwritten Digit ◽

Classifiers Combination

CNN-Based Page Segmentation and Object Classification for Counting Population in Ottoman Archival Documentation

Journal of Imaging ◽

10.3390/jimaging6050032 ◽

2020 ◽

Vol 6 (5) ◽

pp. 32 ◽

Cited By ~ 1

Author(s):

Yekta Said Can ◽

M. Erdem Kabadayı

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Text Recognition ◽

Historical Documents ◽

Layout Analysis ◽

Page Segmentation ◽

Handwritten Text ◽

Handwritten Text Recognition ◽

Different Types ◽

Archival Documentation

Historical document analysis systems are gaining importance with the increasing efforts to digitize archives. Page segmentation and layout analysis are crucial steps for such systems: errors in these steps affect the outcome of handwritten text recognition and Optical Character Recognition (OCR) methods, which makes accurate segmentation and layout analysis all the more important. Degradation of documents, digitization errors, and varying layout styles complicate the segmentation of historical documents. The properties of Arabic scripts, such as connected letters, ligatures, diacritics, and different writing styles, make Arabic-script historical documents even more challenging to process. In this study, we developed an automatic system that uses a CNN-based architecture to count registered individuals and assign them to populated places. To evaluate the performance of our system, we created a labeled dataset of registers obtained from the first wave of population registers of the Ottoman Empire, held between the 1840s and 1860s. We achieved promising results for classifying different types of objects, counting the individuals, and assigning them to populated places.
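The counting and assignment step can be pictured with a small sketch. The abstract does not describe how the system links individuals to places, so everything here is an assumption: the labels `"place"` and `"individual"`, the function `count_by_place`, and the rule that each individual belongs to the most recent populated-place heading in reading order are hypothetical.

```python
def count_by_place(detections):
    """Count individuals per populated place.

    `detections` is a list of (label, name) pairs in reading order, as a
    classifier might emit them. Assumed rule: an individual is assigned to
    the most recently seen place heading.
    """
    counts = {}
    current_place = None
    for label, name in detections:
        if label == "place":
            current_place = name
            counts.setdefault(current_place, 0)
        elif label == "individual" and current_place is not None:
            counts[current_place] += 1
    return counts

# Hypothetical classifier output for one stretch of a register.
detections = [
    ("place", "Village A"),
    ("individual", None),
    ("individual", None),
    ("place", "Village B"),
    ("individual", None),
]
counts = count_by_place(detections)
print(counts)
```

Individuals detected before any place heading are ignored in this sketch; a real pipeline would need a policy for such cases, for example carrying over the last place from the previous page.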

Ensemble deep transfer learning model for Arabic (Indian) handwritten digit recognition

Neural Computing and Applications ◽

10.1007/s00521-021-06423-7 ◽

2021 ◽

Author(s):

Rami S. Alkhawaldeh ◽

Moatsum Alawida ◽

Nawaf Farhan Funkur Alshdaifat ◽

Wafa’ Za’al Alma’aitah ◽

Ammar Almasri

Keyword(s):

Transfer Learning ◽

Learning Model ◽

Handwritten Digit Recognition ◽

Digit Recognition ◽

Handwritten Digit

Handwritten Text Recognition Using Machine Learning

International Journal of Advanced Research in Science, Communication and Technology ◽

10.48175/ijarsct-2006 ◽

2021 ◽

pp. 106-108

Author(s):

Yojana Swapneel Samant

Keyword(s):

Machine Learning ◽

Object Identification ◽

Advanced Technology ◽

Text Recognition ◽

Digit Recognition ◽

Digital Format ◽

Handwritten Text ◽

Handwritten Text Recognition ◽

Handwritten Digit

Over the years, interest in machines has grown enormously, and the field has advanced to a very large extent. Everything from identifying and classifying objects in pictures to editing captured images or video can now be performed by machines and advanced systems. One part of this advanced technology is machine learning, including deep learning, which comes with its own set of modules, algorithms, and techniques. One idea that emerged from this domain is handwritten digit recognition, an area in which research and development centers on numerals (digits) and in which many possibilities, permutations, and combinations are explored to obtain accurate results. It can also be viewed as the ability of computers to interpret input from number plates, photographs, documents, or handwriting and to convert it into digital format as output on a screen.

Performance Analysis On Bangla Handwritten Digit Recognition Using CNN And Transfer Learning

International Journal of Advanced Networking Applications ◽

10.35444/ijana.2021.13101 ◽

2021 ◽

Vol 13 (01) ◽

pp. 4809-4815

Author(s):

Afsana Hossain ◽

Md. Sabbir Hasan ◽

Md. Mujtaba Asif ◽

Amit Kumar Das

Keyword(s):

Performance Analysis ◽

Transfer Learning ◽

Handwritten Digit Recognition ◽

Digit Recognition ◽

Handwritten Digit

Arabic Handwritten Digit Recognition using Convolutional Neural Network

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.f7745.038620 ◽

2020 ◽

Vol 8 (6) ◽

pp. 1187-1190

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Handwriting Recognition ◽

Data Entry ◽

Arabic Numeral ◽

Experimental Result ◽

Digit Recognition ◽

Average Accuracy ◽

Handwritten Digit ◽

Arab League

Arabic is among the most widely used languages in the world, especially in the countries of the Arab League. These countries often use Arabic numerals in banking and business applications, postal codes, and data entry applications. This research focuses on handwriting recognition of Arabic numerals, which vary without limit in style and shape across writers. The proposed deep learning method is a Convolutional Neural Network: the LeNet-5 architecture is used for training on and recognizing handwritten images of Arabic numerals, using 70,000 images derived from the MADbase dataset. Experiments on 10,000 images from the database, comparing different numbers of training epochs, yield an average accuracy of 97.67%.
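To make the LeNet-5 architecture concrete, the following sketch walks through the layer shapes and parameter counts of the commonly used simplified variant (plain 2x2 pooling and fully connected feature maps). The abstract does not state the input size or the exact variant used, so the 32x32 single-channel input, the layer names, and the helper functions are assumptions.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

def lenet5_summary(input_size=32, num_classes=10):
    """Return (layer name, output shape, parameter count) for a simplified LeNet-5."""
    layers = []
    size, channels = input_size, 1

    # Two conv + pool stages: 6 and then 16 feature maps, 5x5 kernels, no padding.
    for conv_name, pool_name, filters in (("C1", "S2", 6), ("C3", "S4", 16)):
        size = conv_out(size, 5)
        params = (5 * 5 * channels + 1) * filters  # weights + one bias per filter
        channels = filters
        layers.append((conv_name, (size, size, channels), params))
        size = conv_out(size, 2, stride=2)  # 2x2 pooling, no trainable parameters here
        layers.append((pool_name, (size, size, channels), 0))

    # Flatten, then three dense layers ending in one unit per digit class.
    inputs = size * size * channels
    for name, units in (("F5", 120), ("F6", 84), ("output", num_classes)):
        layers.append((name, (units,), (inputs + 1) * units))
        inputs = units
    return layers

summary = lenet5_summary()
total_params = sum(params for _, _, params in summary)
for name, shape, params in summary:
    print(name, shape, params)
print("total parameters:", total_params)
```

Under these assumptions the network has roughly 62 thousand trainable parameters, which is small by current standards and partly explains why LeNet-5 remains a common baseline for digit datasets.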
