A Deep Learning Technology based OCR Framework for Recognition Handwritten Expression and Text

CONVERTER ◽

10.17762/converter.259 ◽

2021 ◽

pp. 01-10

Author(s):

Tuanji Gong ◽

Xuanxia Yao

Keyword(s):

Deep Learning ◽

Online Education ◽

Character Recognition ◽

Optical Character Recognition ◽

Mathematical Expression ◽

Text Recognition ◽

Learning Technology ◽

Expression Recognition ◽

Features Selection ◽

Handwritten Text Recognition

Optical character recognition (OCR) based on deep learning has recently made great advances and is broadly applied across industries. However, it still faces many challenging problems in handwritten text recognition and mathematical expression recognition, such as handwritten Chinese recognition, mixtures of printed and handwritten Chinese characters, mathematical expressions (MEs), and chemical equations. In traditional OCR, feature selection played a vital role in recognition accuracy, but hand-crafted features are costly and time-consuming to design. In this paper, we introduce a deep learning based framework to detect and recognize handwritten and printed text and mathematical expressions. The framework consists of three components. The first component is the DCN (Detection & Classification Network), which is based on the SSD model and detects and classifies mathematical expressions and text. The second component consists of text recognition and ME recognition models. The final component merges the multiple outputs of the second stage into a whole text. Experimental results show that our framework achieves a relative 10% improvement on images containing a mixture of printed or handwritten text and MEs. The framework has been deployed to recognize papers and homework on an online education platform.
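The abstract does not say how the final component merges the second-stage outputs. The sketch below is one plausible illustration and not the authors' method: it assumes each detected region carries a bounding box, a type label and a recognized string, groups regions into lines by vertical centre, and reads each line left to right. The `Region` class, the `merge_regions` function, the `line_tol` parameter and the `$...$` wrapping of MEs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Region:
    x: float      # left edge of the bounding box
    y: float      # top edge of the bounding box
    h: float      # height of the bounding box
    kind: str     # "text" or "me" (mathematical expression); assumed labels
    content: str  # string produced by the matching recognizer


def merge_regions(regions, line_tol=0.5):
    """Group regions into lines by vertical centre, then read left to right."""
    lines = []  # each entry: [reference centre, reference height, members]
    for r in sorted(regions, key=lambda r: r.y + r.h / 2):
        centre = r.y + r.h / 2
        # A region joins the current line if its centre is close enough
        # to the centre of the first region on that line.
        if lines and abs(centre - lines[-1][0]) <= line_tol * lines[-1][1]:
            lines[-1][2].append(r)
        else:
            lines.append([centre, r.h, [r]])
    merged = []
    for _, _, members in lines:
        members.sort(key=lambda r: r.x)
        # Assumed convention: wrap MEs in dollar signs, leave text as is.
        merged.append(" ".join(
            f"${m.content}$" if m.kind == "me" else m.content
            for m in members))
    return "\n".join(merged)
```

Grouping by vertical centre rather than by top edge is a design choice made here so that tall expressions such as fractions still land on the same line as the surrounding text.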


CNN-Based Page Segmentation and Object Classification for Counting Population in Ottoman Archival Documentation

Journal of Imaging ◽

10.3390/jimaging6050032 ◽

2020 ◽

Vol 6 (5) ◽

pp. 32 ◽

Cited By ~ 1

Author(s):

Yekta Said Can ◽

M. Erdem Kabadayı

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Text Recognition ◽

Historical Documents ◽

Layout Analysis ◽

Page Segmentation ◽

Handwritten Text ◽

Handwritten Text Recognition ◽

Different Types ◽

Archival Documentation

Historical document analysis systems gain importance with the increasing efforts in the digitalization of archives. Page segmentation and layout analysis are crucial steps for such systems. Errors in these steps affect the outcome of handwritten text recognition and Optical Character Recognition (OCR) methods, which increases the importance of page segmentation and layout analysis. Degradation of documents, digitization errors, and varying layout styles complicate the segmentation of historical documents. The properties of Arabic scripts, such as connected letters, ligatures, diacritics, and different writing styles, make it even more challenging to process Arabic script historical documents. In this study, we developed an automatic system for counting registered individuals and assigning them to populated places by using a CNN-based architecture. To evaluate the performance of our system, we created a labeled dataset of registers obtained from the first wave of population registers of the Ottoman Empire held between the 1840s and 1860s. We achieved promising results for classifying different types of objects, counting the individuals, and assigning them to populated places.
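The counting and assignment step can be illustrated with a small sketch. It is not the authors' implementation: it assumes the CNN stage has already produced a list of classified objects in reading order, and that each individual belongs to the most recent populated-place heading before it. The function name `count_by_place` and the class labels `"place"` and `"individual"` are hypothetical.

```python
def count_by_place(objects):
    """Count individuals under the most recent populated-place heading.

    `objects` is a list of (kind, label) pairs in reading order, where
    kind is "place" or "individual" (assumed labels).
    """
    counts = {}
    current = None
    for kind, label in objects:
        if kind == "place":
            current = label
            counts.setdefault(current, 0)
        elif kind == "individual" and current is not None:
            counts[current] += 1
        # Individuals seen before any place heading are skipped here.
    return counts
```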


Research on Deep Learning Techniques in Breaking Text-Based Captchas and Designing Image-Based Captcha

International Journal of Advanced Research in Science, Communication and Technology ◽

10.48175/ijarsct-900 ◽

2021 ◽

pp. 266-269

Author(s):

Janarthanan A ◽

Pandiyarajan C ◽

Sabarinathan M ◽

Sudhan M ◽

Kala R

Keyword(s):

Deep Learning ◽

Image Classification ◽

Character Recognition ◽

Optical Character Recognition ◽

Experimental Results ◽

Text Recognition ◽

Image Resizing ◽

Optical Character ◽

Learning Techniques ◽

Text Images

Optical character recognition (OCR) is the process of recognizing text in images (here, one word per image). The input images are taken from the dataset. The collected text images are first pre-processed by resizing. Image resizing is necessary when the total number of pixels must be increased or decreased, whereas remapping can occur when zooming, which increases the number of pixels so that the zoomed image shows clear content. Next comes segmentation, in which each character in the word is segmented. Feature values are then extracted from the image to form the test features. In the classification step, the text in the image is classified. Image classification is also performed to identify which images contain text, using a classifier. The experimental results show the accuracy.
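The abstract does not specify how resizing and character segmentation are carried out. As a minimal sketch under stated assumptions, the code below uses nearest-neighbour resizing and splits a binarized word image at blank columns (a vertical projection). Both techniques are swapped in for illustration, and the function names `resize_nearest` and `segment_columns` are hypothetical.

```python
def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize of a 2-D list of pixel values."""
    h, w = len(img), len(img[0])
    return [[img[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]


def segment_columns(img):
    """Split a binary word image (1 = ink) at blank columns.

    Returns (start, end) column ranges, end exclusive, one per
    connected run of inked columns.
    """
    w = len(img[0])
    ink = [any(row[c] for row in img) for c in range(w)]
    spans, start = [], None
    for c, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = c                  # a run of inked columns begins
        elif not has_ink and start is not None:
            spans.append((start, c))   # the run ends at a blank column
            start = None
    if start is not None:
        spans.append((start, w))       # run reaches the right edge
    return spans
```

Splitting at blank columns only works when characters do not touch, so it would not suit cursive or connected scripts.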


A Simplified Research for Mathematical Expression Recognition and Its Conversion to Speech

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b1008.0882s819 ◽

2019 ◽

Vol 8 (2S8) ◽

pp. 1033-1038

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Visually Impaired ◽

Recognition Rate ◽

Mathematical Expression ◽

Expression Recognition ◽

Advanced Mathematics ◽

Data Set ◽

Visually Impaired People ◽

Impaired People

The number of visually impaired people appearing for various examinations is increasing every year, and there are several blind aspirants who are willing to enrich their knowledge through higher studies. Mathematics is one of the key languages (subjects) for those who wish to pursue higher studies in the science stream. Many advanced Braille techniques and OCR-to-speech conversion software packages are available to help the visually impaired community pursue their education, but the number of visually impaired students admitted to higher education is still low. This is because most of the data is on paper in the form of books and documents. So, there is a great need to convert information from the physical domain into the digital domain, which would help visually impaired people to read advanced mathematics text independently. Optical Character Recognition (OCR) systems for mathematics have received considerable attention in recent years due to the tremendous need for the digitization of printed documents. Existing literature reveals that most works concentrate on recognizing handwritten mathematical symbols and some revolve around complex algorithms. This paper proposes a simple yet efficient approach to developing an OCR system for mathematics and its conversion to speech. For mathematical symbol recognition, the Skin and Bone algorithm is proposed, which proved its efficiency on a variety of data sets. The proposed methodology has been tested on 50 equations comprising various symbols such as integral, differential, square, and square root, and currently achieves a recognition rate of 92%.


One-Word Answer Correction using Deep Learning Models and OCR

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b3849.079220 ◽

2020 ◽

Vol 9 (2) ◽

pp. 679-682

Keyword(s):

Deep Learning ◽

Character Recognition ◽

Optical Character Recognition ◽

Educational Organizations ◽

Learning Models ◽

Image Captioning ◽

New Approach ◽

Handwritten Text ◽

Handwritten Text Recognition ◽

Text Features

Examinations/Assessments are a way to assess a student's understanding of a particular subject. Even today, many educational organizations prefer to conduct exams offline (pen and paper), and evaluating them is a time-consuming process. There is no effective model to evaluate offline descriptive answers automatically. The traditional method involves staff assessing the content manually. In place of this process, a new approach using image captioning with deep learning algorithms can be implemented. Handwritten Text Recognition (HTR) can be used to evaluate descriptive answers. One-word answers captured as images are pre-processed to extract the text features using deep learning models and pytesseract. This paper presents a comparison between the CNN-RNN hybrid model and Optical Character Recognition (OCR) to predict a score for one-word answers.
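The abstract does not describe how a score is derived from the recognized text. One possible scoring step, sketched here as an assumption and not as the paper's method, compares the recognized string with the answer key using fuzzy matching from Python's standard `difflib` module, so that small recognition errors are tolerated. The function `score_answer` and its `threshold` parameter are hypothetical, and the recognized string is assumed to come from an earlier OCR or HTR step.

```python
from difflib import SequenceMatcher


def score_answer(recognized, expected, threshold=0.8):
    """Return 1 if the recognized word is close enough to the key, else 0."""
    def norm(s):
        # Lower-case and drop whitespace and punctuation left by recognition.
        return "".join(ch for ch in s.lower() if ch.isalnum())

    a, b = norm(recognized), norm(expected)
    if not a or not b:
        return 0  # nothing recognized, or no answer key
    similarity = SequenceMatcher(None, a, b).ratio()
    return 1 if similarity >= threshold else 0
```

The threshold trades off leniency toward recognition errors against the risk of accepting a genuinely wrong answer, so it would need tuning on real data.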


Label Transcript is Done – Now what do we do with that Data?

Biodiversity Information Science and Standards ◽

10.3897/biss.2.27055 ◽

2018 ◽

Vol 2 ◽

pp. e27055

Author(s):

Robert Cubey ◽

Elspeth Haston ◽

Sally King

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Linked Data ◽

Data Stream ◽

Text Recognition ◽

Botanic Garden ◽

Optical Character ◽

Natural History Collection ◽

Handwritten Text ◽

Handwritten Text Recognition

The transcription of natural history collection labels is occurring via a variety of different methods – in-house curators, commercial operations, citizen scientists, visiting researchers, linked data, optical character recognition (OCR), handwritten text recognition (HTR), etc. – but what can a collections data manager do with this flood of data? There is a whole raft of questions around this incoming data stream: who values it, who needs it, where is it stored, where is it displayed, who has access to it, etc. This talk plans to address these topics with reference to the Royal Botanic Garden Edinburgh herbarium dataset.


The Comparison of Deep Learning Driven Optical Character Recognition for Hard Disk Head Slider Serial Number

2020 International Conference on Power, Energy and Innovations (ICPEI) ◽

10.1109/icpei49860.2020.9431431 ◽

2020 ◽

Author(s):

Palakorn Imsamer ◽

Vorachat Boonyaphon ◽

Somporn Tiacharoen

Keyword(s):

Deep Learning ◽

Character Recognition ◽

Optical Character Recognition ◽

Hard Disk ◽

Head Slider ◽

Optical Character ◽

Serial Number


Offline Handwritten Text Recognition Using Deep Learning: A Review

Journal of Physics Conference Series ◽

10.1088/1742-6596/1848/1/012015 ◽

2021 ◽

Vol 1848 (1) ◽

pp. 012015

Author(s):

Yintong Wang ◽

Wenjie Xiao ◽

Shuo Li

Keyword(s):

Deep Learning ◽

Text Recognition ◽

Handwritten Text ◽

Handwritten Text Recognition


Handwritten Text Recognition using Deep Learning with TensorFlow

International Journal of Engineering Research and ◽

10.17577/ijertv9is050534 ◽

2020 ◽

Vol V9 (05) ◽

Author(s):

Sri. Yugandhar Manchala ◽

Jayaram Kinthali ◽

Kowshik Kotha ◽

Kanithi Santosh Kumar ◽

Jagilinki Jayalaxmi

Keyword(s):

Deep Learning ◽

Text Recognition ◽

Handwritten Text ◽

Handwritten Text Recognition


SCENE TEXT RECOGNITION BY USING EE-MSER AND OPTICAL CHARACTER RECOGNITION FOR NATURAL IMAGES

International Journal of Advance Engineering and Research Development ◽

10.21090/ijaerd.021219 ◽

2015 ◽

Vol 2 (12) ◽

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Natural Images ◽

Text Recognition ◽

Optical Character ◽

Scene Text ◽

Scene Text Recognition


Deep Learning for Optical Character Recognition and Its Application to VAT Invoice Recognition

Lecture Notes in Electrical Engineering - Communications, Signal Processing, and Systems ◽

10.1007/978-981-13-6508-9_12 ◽

2019 ◽

pp. 87-95 ◽

Cited By ~ 1

Author(s):

Yu Wang ◽

Guan Gui ◽

Nan Zhao ◽

Yue Yin ◽

Hao Huang ◽

...

Keyword(s):

Deep Learning ◽

Character Recognition ◽

Optical Character Recognition ◽

Optical Character
