scholarly journals Benchmark Pashto Handwritten Character Dataset and Pashto Object Character Recognition (OCR) Using Deep Neural Network with Rule Activation Function

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Imran Uddin ◽  
Dzati A. Ramli ◽  
Abdullah Khan ◽  
Javed Iqbal Bangash ◽  
Nosheen Fayyaz ◽  
...  

In the area of machine learning, different techniques are used to train machines and perform different tasks like computer vision, data analysis, natural language processing, and speech recognition. Computer vision is one of the main branches where machine learning and deep learning techniques are being applied. Optical character recognition (OCR) is the ability of a machine to recognize the character of a language. Pashto is one of the most ancient and historical languages of the world, spoken in Afghanistan and Pakistan. OCR application has been developed for various cursive languages like Urdu, Chinese, and Japanese, but very little work is done for the recognition of the Pashto language. When it comes to handwritten character recognition, it becomes more difficult for OCR to recognize the characters as every handwritten character’s shape is influenced by the writer’s hand motion dynamics. The reason for the lack of research in Pashto handwritten character data as compared to other languages is because there is no benchmark dataset available for experimental purposes. This study focuses on the creation of such a dataset, and then for the evaluation purpose, a machine is trained to correctly recognize unseen Pashto handwritten characters. To achieve this objective, a dataset of 43000 images was created. Three Feed Forward Neural Network models with backpropagation algorithm using different Rectified Linear Unit (ReLU) layer configurations (Model 1 with 1-ReLU Layer, Model 2 with 2-ReLU layers, and Model 3 with 3-ReLU Layers) were trained and tested with this dataset. The simulation shows that Model 1 achieved accuracy up to 87.6% on unseen data while Model 2 achieved an accuracy of 81.60% and 3% accuracy, respectively. Similarly, loss (cross-entropy) was the lowest for Model 1 with 0.15 and 3.17 for training and testing, followed by Model 2 with 0.7 and 4.2 for training and testing, while Model 3 was the last with loss values of 6.4 and 3.69. The precision, recall, and f-measure values of Model 1 were better than those of both Model 2 and Model 3. Based on results, Model 1 (with 1 ReLU activation layer) is found to be the most efficient as compared to the other two models in terms of accuracy to recognize Pashto handwritten characters.

Author(s):  
Karthikeyan P. ◽  
Karunakaran Velswamy ◽  
Pon Harshavardhanan ◽  
Rajagopal R. ◽  
JeyaKrishnan V. ◽  
...  

Machine learning is the part of artificial intelligence that makes machines learn without being expressly programmed. Machine learning application built the modern world. Machine learning techniques are mainly classified into three techniques: supervised, unsupervised, and semi-supervised. Machine learning is an interdisciplinary field, which can be joined in different areas including science, business, and research. Supervised techniques are applied in agriculture, email spam, malware filtering, online fraud detection, optical character recognition, natural language processing, and face detection. Unsupervised techniques are applied in market segmentation and sentiment analysis and anomaly detection. Deep learning is being utilized in sound, image, video, time series, and text. This chapter covers applications of various machine learning techniques, social media, agriculture, and task scheduling in a distributed system.


2020 ◽  
pp. 1-22 ◽  
Author(s):  
D. Sykes ◽  
A. Grivas ◽  
C. Grover ◽  
R. Tobin ◽  
C. Sudlow ◽  
...  

Abstract Using natural language processing, it is possible to extract structured information from raw text in the electronic health record (EHR) at reasonably high accuracy. However, the accurate distinction between negated and non-negated mentions of clinical terms remains a challenge. EHR text includes cases where diseases are stated not to be present or only hypothesised, meaning a disease can be mentioned in a report when it is not being reported as present. This makes tasks such as document classification and summarisation more difficult. We have developed the rule-based EdIE-R-Neg, part of an existing text mining pipeline called EdIE-R (Edinburgh Information Extraction for Radiology reports), developed to process brain imaging reports, (https://www.ltg.ed.ac.uk/software/edie-r/) and two machine learning approaches; one using a bidirectional long short-term memory network and another using a feedforward neural network. These were developed on data from the Edinburgh Stroke Study (ESS) and tested on data from routine reports from NHS Tayside (Tayside). Both datasets consist of written reports from medical scans. These models are compared with two existing rule-based models: pyConText (Harkema et al. 2009. Journal of Biomedical Informatics42(5), 839–851), a python implementation of a generalisation of NegEx, and NegBio (Peng et al. 2017. NegBio: A high-performance tool for negation and uncertainty detection in radiology reports. arXiv e-prints, p. arXiv:1712.05898), which identifies negation scopes through patterns applied to a syntactic representation of the sentence. On both the test set of the dataset from which our models were developed, as well as the largely similar Tayside test set, the neural network models and our custom-built rule-based system outperformed the existing methods. EdIE-R-Neg scored highest on F1 score, particularly on the test set of the Tayside dataset, from which no development data were used in these experiments, showing the power of custom-built rule-based systems for negation detection on datasets of this size. The performance gap of the machine learning models to EdIE-R-Neg on the Tayside test set was reduced through adding development Tayside data into the ESS training set, demonstrating the adaptability of the neural network models.


Author(s):  
Yaseen Khather Yaseen ◽  
Alaa Khudhair Abbas ◽  
Ahmed M. Sana

Today, images are a part of communication between people. However, images are being used to share information by hiding and embedding messages within it, and images that are received through social media or emails can contain harmful content that users are not able to see and therefore not aware of. This paper presents a model for detecting spam on images. The model is a combination of optical character recognition, natural language processing, and the machine learning algorithm. Optical character recognition extracts the text from images, and natural language processing uses linguistics capabilities to detect and classify the language, to distinguish between normal text and slang language. The features for selected images are then extracted using the bag-of-words model, and the machine learning algorithm is run to detect any kind of spam that may be on it. Finally, the model can predict whether or not the image contains any harmful content. The results show that the proposed method using a combination of the machine learning algorithm, optical character recognition, and natural language processing provides high detection accuracy compared to using machine learning alone.


2019 ◽  
Author(s):  
J. Christopher D. Terry ◽  
Helen E. Roy ◽  
Tom A. August

AbstractThe accurate identification of species in images submitted by citizen scientists is currently a bottleneck for many data uses. Machine learning tools offer the potential to provide rapid, objective and scalable species identification for the benefit of many aspects of ecological science. Currently, most approaches only make use of image pixel data for classification. However, an experienced naturalist would also use a wide variety of contextual information such as the location and date of recording.Here, we examine the automated identification of ladybird (Coccinellidae) records from the British Isles submitted to the UK Ladybird Survey, a volunteer-led mass participation recording scheme. Each image is associated with metadata; a date, location and recorder ID, which can be cross-referenced with other data sources to determine local weather at the time of recording, habitat types and the experience of the observer. We built multi-input neural network models that synthesise metadata and images to identify records to species level.We show that machine learning models can effectively harness contextual information to improve the interpretation of images. Against an image-only baseline of 48.2%, we observe a 9.1 percentage-point improvement in top-1 accuracy with a multi-input model compared to only a 3.6% increase when using an ensemble of image and metadata models. This suggests that contextual data is being used to interpret an image, beyond just providing a prior expectation. We show that our neural network models appear to be utilising similar pieces of evidence as human naturalists to make identifications.Metadata is a key tool for human naturalists. We show it can also be harnessed by computer vision systems. Contextualisation offers considerable extra information, particularly for challenging species, even within small and relatively homogeneous areas such as the British Isles. Although complex relationships between disparate sources of information can be profitably interpreted by simple neural network architectures, there is likely considerable room for further progress. Contextualising images has the potential to lead to a step change in the accuracy of automated identification tools, with considerable benefits for large scale verification of submitted records.


Author(s):  
Abhishek Das ◽  
Mihir Narayan Mohanty

In this chapter, the authors have reviewed on optical character recognition. The study belongs to both typed characters and handwritten character recognition. Online and offline character recognition are two modes of data acquisition in the field of OCR and are also studied. As deep learning is the emerging machine learning method in the field of image processing, the authors have described the method and its application of earlier works. From the study of the recurrent neural network (RNN), a special class of deep neural network is proposed for the recognition purpose. Further, convolutional neural network (CNN) is combined with RNN to check its performance. For this piece of work, Odia numerals and characters are taken as input and well recognized. The efficacy of the proposed method is explained in the result section.


Author(s):  
Karthikeyan P. ◽  
Karunakaran Velswamy ◽  
Pon Harshavardhanan ◽  
Rajagopal R. ◽  
JeyaKrishnan V. ◽  
...  

Machine learning is the part of artificial intelligence that makes machines learn without being expressly programmed. Machine learning application built the modern world. Machine learning techniques are mainly classified into three techniques: supervised, unsupervised, and semi-supervised. Machine learning is an interdisciplinary field, which can be joined in different areas including science, business, and research. Supervised techniques are applied in agriculture, email spam, malware filtering, online fraud detection, optical character recognition, natural language processing, and face detection. Unsupervised techniques are applied in market segmentation and sentiment analysis and anomaly detection. Deep learning is being utilized in sound, image, video, time series, and text. This chapter covers applications of various machine learning techniques, social media, agriculture, and task scheduling in a distributed system.


Information ◽  
2020 ◽  
Vol 11 (1) ◽  
pp. 41
Author(s):  
Melinda Loubser ◽  
Martin J. Puttkammer

In this paper, the viability of neural network implementations of core technologies (the focus of this paper is on text technologies) for 10 resource-scarce South African languages is evaluated. Neural networks are increasingly being used in place of other machine learning methods for many natural language processing tasks with good results. However, in the South African context, where most languages are resource-scarce, very little research has been done on neural network implementations of core language technologies. In this paper, we address this gap by evaluating neural network implementations of four core technologies for ten South African languages. The technologies we address are part of speech tagging, named entity recognition, compound analysis and lemmatization. Neural architectures that performed well on similar tasks in other settings were implemented for each task and the performance was assessed in comparison with currently used machine learning implementations of each technology. The neural network models evaluated perform better than the baselines for compound analysis, are viable and comparable to the baseline on most languages for POS tagging and NER, and are viable, but not on par with the baseline, for Afrikaans lemmatization.


2020 ◽  
Vol 8 (6) ◽  
pp. 5815-5819

In day to day human life, handwritten documents are a general purpose for communication and restoring their information. In the field of computer science, character recognition using Deep Learning has more attention. DL has a massive set of pattern recognition tools that can apply to speech recognition, image processing, natural language processing and has a remarkable capability to find out a solution for complex machine learning problems. DL can focus on the specific feature of an image to character recognition for enhancing efficiency and accuracy. In this paper, we have presented a methods for handwritten character recognition using deep learning.


Education is fundamental for human progress. A student is evaluated by the mark he/she scores. The evaluation of student’s work is a central aspect of the teaching profession that can affect students in significant ways. Though teachers use multiple criteria for assessing student work, it is not known if emotions are a factor in their grading decisions. Also, there are several mistakes that occur on the department's side like totaling error, marking mistakes. So, we are developing software to automate the evaluation of answers using Natural Language Processing and Machine Learning. There are two modules, in the first module, we use Optical Character Recognition to extract a handwritten font from the uploaded file and the second module evaluates the answer based on various factors and the mark is awarded. For every answer being entered, evaluation is done based on the usage of word, their importance and grammatical meaning of the sentence. With this approach we can save the cost of checking the answers manually and reduce the workload of the teachers by automating the manual checking process. The evaluation time is also reduced by using this software.


Sign in / Sign up

Export Citation Format

Share Document