Pengenalan Karakter Tulisan Tangan Dengan K-Support Vector Nearest Neighbor (Handwritten Character Recognition with K-Support Vector Nearest Neighbor)

Author(s):  
Aditya Surya Wijaya ◽  
Nurul Chamidah ◽  
Mayanda Mega Santoni

Handwritten characters are difficult for machines to recognize because every person has their own writing style. This research recognizes handwritten numeric and alphabetic character patterns using the K-Nearest Neighbor (KNN) algorithm. The recognition process consists of preprocessing the handwritten image, segmentation to obtain separate single characters, feature extraction, and classification. Feature extraction is done with the Zone method, and the resulting feature data is split into training data and testing data. The training features are reduced with K-Support Vector Nearest Neighbor (K-SVNN), and the handwritten patterns in the testing data are recognized with K-Nearest Neighbor (KNN). Testing results show that reducing the training data with K-SVNN improves handwritten character recognition accuracy.
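The pipeline above (zone features, then K-NN voting) can be sketched as follows. The zone grid size, the value of k, and the toy "characters" are illustrative assumptions, not the paper's actual configuration:

```python
# Sketch of Zone-method feature extraction plus plain K-NN classification.
# Zone count, k, and the toy data below are illustrative assumptions.

def zone_features(image, zones=2):
    """Split a binary image (list of 0/1 rows) into zones x zones cells
    and return the black-pixel density of each cell as a feature vector."""
    h, w = len(image), len(image[0])
    zh, zw = h // zones, w // zones
    feats = []
    for zi in range(zones):
        for zj in range(zones):
            cell = [image[r][c]
                    for r in range(zi * zh, (zi + 1) * zh)
                    for c in range(zj * zw, (zj + 1) * zw)]
            feats.append(sum(cell) / len(cell))
    return feats

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label); Euclidean K-NN majority vote."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(f, query)), lab) for f, lab in train)
    votes = [lab for _, lab in dists[:k]]
    return max(set(votes), key=votes.count)

# Toy 4x4 "characters": a filled left half vs a filled right half.
left = [[1, 1, 0, 0]] * 4
right = [[0, 0, 1, 1]] * 4
train = [(zone_features(left), "L"), (zone_features(right), "R"),
         (zone_features(left), "L")]
print(knn_predict(train, zone_features(right), k=1))  # -> R
```

The K-SVNN reduction step would prune `train` before classification; only the zone-density and voting stages are shown here.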

Machine learning is empowering many aspects of day-to-day life, from filtering content on social networks to suggesting products we may be looking for. This technology takes objects such as images as input to find new observations or show items based on user interest. The major discussion here is machine learning techniques, specifically supervised learning, where the computer learns from input/training data and predicts results based on experience. We also discuss the machine learning algorithms Naïve Bayes classifier, K-Nearest Neighbor, Random Forest, Decision Trees, Boosted Trees, and Support Vector Machine, and apply these classifiers to the Malgenome and Drebin Android malware datasets. Android is an operating system that is gaining popularity these days, and with the rise in demand for these devices comes a rise in Android malware. The traditional techniques used to detect malware were unable to detect unknown applications. We have run these datasets through different machine learning classifiers and recorded the results. The experimental results provide a comparative analysis based on performance, accuracy, and cost.
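The comparative evaluation described above boils down to running several classifiers over the same labeled test set and tabulating a metric per classifier. A minimal sketch, where the two rule-based "classifiers" and the toy feature vectors are stand-ins for real malware models and Drebin/Malgenome features:

```python
# Minimal comparative-evaluation harness: score several classifiers on one
# labeled test set. The rules and features below are toy stand-ins.

def accuracy(classifier, samples):
    hits = sum(1 for features, label in samples if classifier(features) == label)
    return hits / len(samples)

# Toy feature vectors: (num_permissions, uses_sms); label 1 = malware.
test_set = [((12, 1), 1), ((3, 0), 0), ((15, 1), 1), ((2, 0), 0), ((9, 0), 1)]

classifiers = {
    "many-permissions rule": lambda f: 1 if f[0] > 8 else 0,
    "sms rule": lambda f: f[1],
}

# Report classifiers from best to worst accuracy.
for name, clf in sorted(classifiers.items(),
                        key=lambda kv: -accuracy(kv[1], test_set)):
    print(f"{name}: {accuracy(clf, test_set):.2f}")
```

In practice each entry in `classifiers` would be a trained model's predict function, and further metrics (cost, runtime) would be tabulated the same way.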


2021 ◽  
Vol 87 (6) ◽  
pp. 445-455
Author(s):  
Yi Ma ◽  
Zezhong Zheng ◽  
Yutang Ma ◽  
Mingcang Zhu ◽  
Ran Huang ◽  
...  

Many manifold learning algorithms conduct an eigenvector analysis on a data-similarity matrix of size N×N, where N is the number of data points. Thus, the memory complexity of the analysis is no less than O(N²). We present in this article an incremental manifold learning approach to handle large hyperspectral data sets for land-use identification. In our method, the number of dimensions for the high-dimensional hyperspectral-image data set is obtained from the training data set. A local curvature variation algorithm is utilized to sample a subset of data points as landmarks. Then a manifold skeleton is identified based on the landmarks. Our method is validated on three AVIRIS hyperspectral data sets, outperforming the comparison algorithms with a k-nearest-neighbor classifier and achieving the second-best performance with a support vector machine.
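The landmark-sampling idea can be illustrated with a simple proxy. The abstract does not define the local curvature variation criterion, so the sketch below approximates it by a point's distance from the centroid of its k nearest neighbors (larger value = higher local variation); all names and parameters are assumptions:

```python
# Hedged sketch of landmark selection for an incremental manifold method.
# The curvature-variation criterion is approximated by a simple proxy:
# distance of a point from the centroid of its k nearest neighbors.

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def curvature_proxy(points, i, k=2):
    """Distance from point i to the centroid of its k nearest neighbors."""
    neighbors = sorted((dist2(points[i], p), j)
                       for j, p in enumerate(points) if j != i)[:k]
    centroid = [sum(points[j][d] for _, j in neighbors) / k
                for d in range(len(points[i]))]
    return dist2(points[i], centroid)

def pick_landmarks(points, m, k=2):
    """Return indices of the m points with the highest variation score."""
    ranked = sorted(range(len(points)),
                    key=lambda i: -curvature_proxy(points, i, k))
    return ranked[:m]

# Points on a line plus one off-manifold outlier; the outlier scores highest.
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (1.5, 2.0)]
print(pick_landmarks(pts, m=1))  # -> [4]
```

A full implementation would then build the manifold skeleton from the landmarks and embed the remaining points incrementally, keeping memory well below the O(N²) similarity matrix.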


Diagnostics ◽  
2019 ◽  
Vol 9 (3) ◽  
pp. 104 ◽  
Author(s):  
Ahmed ◽  
Yigit ◽  
Isik ◽  
Alpkocak

Leukemia is a fatal cancer and has two main types: acute and chronic. Each type has two subtypes: lymphoid and myeloid. Hence, in total, there are four subtypes of leukemia. This study proposes a new approach for diagnosing all subtypes of leukemia from microscopic blood cell images using convolutional neural networks (CNN), which require a large training data set. Therefore, we also investigated the effects of data augmentation, synthetically increasing the number of training samples. We used two publicly available leukemia data sources: ALL-IDB and ASH Image Bank. Next, we applied seven different image transformation techniques as data augmentation. We designed a CNN architecture capable of recognizing all subtypes of leukemia. In addition, we explored other well-known machine learning algorithms such as naive Bayes, support vector machine, k-nearest neighbor, and decision tree. To evaluate our approach, we set up a series of experiments and used 5-fold cross-validation. The experimental results showed that our CNN model achieves 88.25% and 81.74% accuracy in leukemia-versus-healthy and multiclass classification of all subtypes, respectively. Finally, we also showed that the CNN model performs better than the other well-known machine learning algorithms.
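The data-augmentation step above can be sketched as generating extra training samples from each image via geometric transforms. The abstract does not list the seven techniques, so the flips and rotations below are illustrative assumptions:

```python
# Sketch of image data augmentation: derive several variants from one
# training image. The specific transforms are illustrative assumptions.

def hflip(img):
    """Mirror left-right."""
    return [row[::-1] for row in img]

def vflip(img):
    """Mirror top-bottom."""
    return img[::-1]

def rot90(img):
    """Rotate 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def augment(img):
    """Original plus flips and 90/180/270-degree rotations."""
    out = [img, hflip(img), vflip(img)]
    r = img
    for _ in range(3):
        r = rot90(r)
        out.append(r)
    return out

sample = [[1, 0],
          [0, 0]]
print(len(augment(sample)))  # -> 6 samples from one original
```

Real pipelines would also apply photometric transforms (blur, brightness shifts, noise) and feed the enlarged set to the CNN.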


Author(s):  
SHITALA PRASAD ◽  
GYANENDRA K. VERMA ◽  
BHUPESH KUMAR SINGH ◽  
PIYUSH KUMAR

This paper proposes a novel approach to feature extraction based on the segmentation and morphological alteration of handwritten multi-lingual characters. We explored multi-resolution and multi-directional transforms such as the wavelet, curvelet, and ridgelet transforms to extract classifying features from handwritten multi-lingual images. We discuss the pros and cons of each multi-resolution algorithm and conclude that curvelet-based feature extraction is the most promising for multi-lingual character recognition. We also applied morphological operations such as thinning and thickening, and then performed feature-level fusion to create a robust feature vector for classification. Classification is performed with K-nearest neighbor (K-NN) and support vector machine (SVM) classifiers, and their relative performance is compared. We experimented with our in-house dataset, compiled in our lab from more than 50 writers.
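The feature-level fusion step above amounts to concatenating the per-transform feature vectors into one vector before classification. A minimal sketch, with toy numbers standing in for real curvelet/wavelet coefficients:

```python
# Sketch of feature-level fusion: concatenate feature vectors from
# several transforms into one classification vector. Values are toy data.

def fuse(*feature_sets):
    """Concatenate per-transform feature vectors into one fused vector."""
    fused = []
    for feats in feature_sets:
        fused.extend(feats)
    return fused

curvelet_feats = [0.8, 0.1]        # stand-in for curvelet energies
wavelet_feats = [0.3, 0.5, 0.2]    # stand-in for wavelet energies
vector = fuse(curvelet_feats, wavelet_feats)
print(vector)  # -> [0.8, 0.1, 0.3, 0.5, 0.2]
```

The fused vector would then be fed to the K-NN or SVM classifier; per-feature normalization before fusion is usually needed so one transform's scale does not dominate.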


2021 ◽  
Vol 12 (1) ◽  
pp. 13
Author(s):  
Rachmad Jibril Al Kautsar ◽  
Fitri Utaminingrum ◽  
Agung Setia Budi

The number of Indonesian citizens who use motorized vehicles increases every year. Every motorcyclist in Indonesia must wear a helmet when riding a motorcycle. Even though the rules require motorbike riders to wear helmets, many motorists still disobey them. To overcome this, police officers have carried out various operations (such as traffic operations and warnings). This is not effective because of the limited number of police officers available and the probability that officers make mistakes when detecting violations, for example due to fatigue. This study proposes a system that detects motorcyclists who do not wear helmets through a surveillance camera. For this purpose, the Circular Hough Transform (CHT), Histogram of Oriented Gradients (HOG), and K-Nearest Neighbor (KNN) are used. Testing with images taken from surveillance cameras, divided into 200 training data and 40 testing data, obtained an accuracy rate of 82.5%.
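The HOG stage of the pipeline above can be illustrated with a stripped-down descriptor: compute gradient orientations over a grayscale patch and accumulate magnitude-weighted histogram bins. The bin count and unsigned-orientation choice are assumptions; a full HOG also uses cells, blocks, and normalization:

```python
# Hedged sketch of a HOG-style descriptor: one orientation histogram over
# a whole patch. Bin count and unsigned angles are assumptions.
import math

def hog_histogram(img, bins=9):
    """Magnitude-weighted orientation histogram of a grayscale patch."""
    h, w = len(img), len(img[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]   # central differences
            gy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(gx, gy)
            ang = math.degrees(math.atan2(gy, gx)) % 180  # unsigned
            hist[int(ang // (180 / bins)) % bins] += mag
    return hist

# Vertical edge: strong horizontal gradient, so all mass lands in the 0° bin.
patch = [[0, 0, 9, 9]] * 4
print(hog_histogram(patch))
```

In the full system, CHT would first locate candidate head regions (circles), HOG would describe them, and KNN would decide helmet versus no helmet.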


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Yi Li ◽  
Chance M. Nowak ◽  
Uyen Pham ◽  
Khai Nguyen ◽  
Leonidas Bleris

Herein, we implement and assess machine learning architectures to ascertain models that differentiate healthy from apoptotic cells using exclusively forward (FSC) and side (SSC) scatter flow cytometry information. To generate training data, colorectal cancer HCT116 cells were subjected to miR-34a treatment and then classified using a conventional Annexin V/propidium iodide (PI)-staining assay. The apoptotic cells were defined as Annexin V-positive cells, which include early and late apoptotic cells, necrotic cells, as well as other dying or dead cells. In addition to the fluorescent signal, we collected cell size and granularity information from the FSC and SSC parameters. Both parameters are subdivided into area, height, and width, thus providing a total of six numerical features that informed and trained our models. A collection of logistic regression, random forest, k-nearest neighbor, multilayer perceptron, and support vector machine models was trained and tested for classification performance in predicting cell states using only the six aforementioned numerical features. Out of 1046 candidate models, a multilayer perceptron was chosen, with 0.91 live precision, 0.93 live recall, 0.92 live f value, and 0.97 live area under the ROC curve when applied to standardized data. We discuss and highlight differences in classifier performance and compare the results to the standard practice of forward and side scatter gating, typically performed to select cells based on size and/or complexity. We demonstrate that our model, a ready-to-use module for any flow cytometry-based analysis, can provide automated, reliable, and stain-free classification of healthy and apoptotic cells using exclusively size and granularity information.
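The per-class metrics reported above (precision, recall, f value for the "live" class) follow directly from confusion counts. A short sketch with toy counts, not the study's data:

```python
# Sketch of the per-class metrics reported above, computed from confusion
# counts (tp/fp/fn). The counts below are toy numbers, not the study's data.

def precision_recall_f(tp, fp, fn):
    """Precision, recall, and F1 ("f value") for one class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_value = 2 * precision * recall / (precision + recall)
    return precision, recall, f_value

# Toy confusion counts for the "live" class.
p, r, f = precision_recall_f(tp=90, fp=10, fn=10)
print(round(p, 2), round(r, 2), round(f, 2))  # -> 0.9 0.9 0.9
```

Model selection among the 1046 candidates would compare these metrics (plus area under the ROC curve) on held-out data.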


Respati ◽  
2018 ◽  
Vol 13 (2) ◽  
Author(s):  
Eri Sasmita Susanto ◽  
Kusrini Kusrini ◽  
Hanif Al Fatta

ABSTRACT
This research focuses on a feasibility test of graduation prediction for students of Universitas AMIKOM Yogyakarta. The authors chose the K-Nearest Neighbors (K-NN) algorithm because K-NN can process numerical data and does not require a complicated iterative parameter-estimation scheme, which means it can be applied to large datasets. The input of the system is sample data in the form of student data from 2014-2015. Testing in this research used two sets: testing data and training data. The criteria used in this study are the grade-point index for semesters 1-4, credit (SKS) achievement, and graduation status. The output of the system is a graduation prediction divided into two classes: on-time and not-on-time graduation. The test results show that k = 14 with k-fold = 5 produces the best performance in predicting students' graduation with the K-Nearest Neighbor method using the 4-semester grade-point index, with accuracy = 98.46%, precision = 99.53%, and recall = 97.64%.
Keywords: K-Nearest Neighbors Algorithm, Graduation Prediction, Testing Data, Training Data
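The k-fold = 5 evaluation scheme used above can be sketched as index splitting: each fold serves once as testing data while the remainder is training data. The even-split handling below is an illustrative simplification:

```python
# Sketch of k-fold cross-validation index splitting (here k-fold = 5):
# each fold is the test set once, the rest is the training set.

def kfold_indices(n, folds=5):
    """Yield (train_indices, test_indices) pairs over n samples."""
    size = n // folds
    idx = list(range(n))
    for f in range(folds):
        test = idx[f * size:(f + 1) * size]
        train = idx[:f * size] + idx[(f + 1) * size:]
        yield train, test

splits = list(kfold_indices(10, folds=5))
print(len(splits))    # -> 5 folds
print(splits[0][1])   # -> [0, 1]  (first test fold)
```

The reported accuracy/precision/recall would be averaged over the five folds; in practice the indices are shuffled (often stratified by class) before splitting.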


2019 ◽  
Vol 11 (2) ◽  
pp. 307
Author(s):  
Asrianda Asrianda ◽  
Risawandi Risawandi ◽  
Gunarwan Gunarwan

K-Nearest Neighbor is a method that classifies data based on the closest distance. K-NN is a supervised learning algorithm whose learning process is based on the value of the target variable associated with the values of the predictor variables. In the K-NN algorithm, all data must have a label, so when new data is given, it is compared with the existing data and the most similar data is taken by looking at its label. Filling in and processing many questionnaires to determine the results of lecturer performance evaluation certainly requires a lot of time and effort. Therefore, the K-NN Manhattan-distance method is applied. In this study, the testing data is taken from one of the training data points and has the classification result "Very Good". After applying the K-NN Manhattan-distance method, taking k as the closest/smallest neighbor, the following results are obtained: a distance of 5.4, the classification result "Very Good", and a similarity value of 74.03%. Based on these results, the classification produced by the K-NN Manhattan-distance method matches the pre-existing classification.
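The Manhattan-distance K-NN step above can be sketched directly. The questionnaire scores and labels below are toy values, and the abstract's distance-to-similarity conversion (5.4 → 74.03%) is not specified, so it is omitted:

```python
# Sketch of K-NN with Manhattan (city-block) distance. The feature values
# and labels are toy stand-ins for questionnaire scores.

def manhattan(a, b):
    """City-block distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_manhattan(train, query, k=1):
    """train: list of (features, label); majority label of k nearest."""
    ranked = sorted(train, key=lambda t: manhattan(t[0], query))
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)

train = [([4, 4, 5], "Very Good"),
         ([2, 3, 2], "Good"),
         ([1, 1, 2], "Fair")]
query = [4, 5, 5]
print(knn_manhattan(train, query, k=1))  # -> Very Good
```

Swapping `manhattan` for a Euclidean distance function turns this back into the standard K-NN used elsewhere in these studies.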


2021 ◽  
Vol 6 (2) ◽  
pp. 111-119
Author(s):  
Daurat Sinaga ◽  
Feri Agustina ◽  
Noor Ageng Setiyanto ◽  
Suprayogi Suprayogi ◽  
Cahaya Jatmoko

Indonesia is one of the countries with great faunal wealth. Various types of fauna are scattered throughout Indonesia. One of them is birds. Birds are often bred as pets because of their characteristic voices and body features. This study uses the Gray Level Co-occurrence Matrix (GLCM) with the k-Nearest Neighbor (K-NN) algorithm. The data used in this study were 66 images, divided into 55 training data and 11 testing data. The feature values used in this study are based on GLCM feature extraction: contrast, correlation, energy, homogeneity, and entropy, which are then classified using the k-Nearest Neighbor (K-NN) algorithm with Euclidean distance. The classification results using k-Nearest Neighbor (K-NN) show that the highest accuracy, 54.54%, is obtained at K = 1 and an angle of 0°.
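The GLCM features named above can be sketched for the 0° case: count co-occurrences of gray levels one pixel to the right, normalize, and derive statistics. Entropy and correlation are omitted for brevity, and the two-level toy image is an illustrative assumption:

```python
# Hedged sketch of GLCM texture features at 0 degrees (offset: one pixel
# to the right). Toy two-level image; entropy/correlation omitted.

def glcm_0deg(img, levels=2):
    """Normalized co-occurrence matrix for horizontally adjacent pixels."""
    m = [[0] * levels for _ in range(levels)]
    for row in img:
        for a, b in zip(row, row[1:]):
            m[a][b] += 1
    total = sum(sum(r) for r in m)
    return [[v / total for v in r] for r in m]

def glcm_features(m):
    """Contrast, energy, and homogeneity of a normalized GLCM."""
    contrast = sum(p * (i - j) ** 2
                   for i, r in enumerate(m) for j, p in enumerate(r))
    energy = sum(p * p for r in m for p in r)
    homogeneity = sum(p / (1 + abs(i - j))
                      for i, r in enumerate(m) for j, p in enumerate(r))
    return contrast, energy, homogeneity

img = [[0, 0, 1, 1],
       [0, 0, 1, 1]]
print(glcm_features(glcm_0deg(img)))
```

Other angles (45°, 90°, 135°) just change the pixel offset; the resulting feature vector is what the K-NN classifier with Euclidean distance consumes.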

