Pengenalan Karakter Tulisan Tangan Dengan K-Support Vector Nearest Neighbor (Handwritten Character Recognition with K-Support Vector Nearest Neighbor)

Author(s):  
Aditya Surya Wijaya ◽  
Nurul Chamidah ◽  
Mayanda Mega Santoni

Handwritten characters are difficult for machines to recognize because every person has their own writing style. This research recognizes handwritten numeric and alphabetic character patterns using the K-Nearest Neighbor (KNN) algorithm. The recognition process consists of preprocessing the handwritten image, segmentation to obtain separate single characters, feature extraction, and classification. Feature extraction is done with the Zone method, and the resulting feature data is split into training data and testing data. The training features are reduced with K-Support Vector Nearest Neighbor (K-SVNN), and the handwritten patterns in the testing data are recognized with K-Nearest Neighbor (KNN). Testing results show that reducing the training data with K-SVNN improves handwritten character recognition accuracy.
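The pipeline above (zone features, then K-NN voting) can be sketched as follows. The zone grid size, the value of k, and the toy "characters" are illustrative assumptions, not the paper's actual configuration:

```python
# Sketch of Zone-method feature extraction plus plain K-NN classification.
# Zone count, k, and the toy data below are illustrative assumptions.

def zone_features(image, zones=2):
    """Split a binary image (list of 0/1 rows) into zones x zones cells
    and return the black-pixel density of each cell as a feature vector."""
    h, w = len(image), len(image[0])
    zh, zw = h // zones, w // zones
    feats = []
    for zi in range(zones):
        for zj in range(zones):
            cell = [image[r][c]
                    for r in range(zi * zh, (zi + 1) * zh)
                    for c in range(zj * zw, (zj + 1) * zw)]
            feats.append(sum(cell) / len(cell))
    return feats

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label); Euclidean K-NN majority vote."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(f, query)), lab) for f, lab in train)
    votes = [lab for _, lab in dists[:k]]
    return max(set(votes), key=votes.count)

# Toy 4x4 "characters": a filled left half vs a filled right half.
left = [[1, 1, 0, 0]] * 4
right = [[0, 0, 1, 1]] * 4
train = [(zone_features(left), "L"), (zone_features(right), "R"),
         (zone_features(left), "L")]
print(knn_predict(train, zone_features(right), k=1))  # -> R
```

The K-SVNN reduction step would prune `train` before classification; only the zone-density and voting stages are shown here.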

Machine learning is empowering many aspects of day-to-day life, from filtering content on social networks to suggesting products we may be looking for. This technology takes objects such as images as input to find new observations or show items based on user interest. The major discussion here is machine learning techniques, specifically supervised learning, where the computer learns from input/training data and predicts results based on experience. We also discuss the machine learning algorithms Naïve Bayes classifier, K-Nearest Neighbor, Random Forest, Decision Trees, Boosted Trees, and Support Vector Machine, and apply these classifiers to the Malgenome and Drebin Android malware datasets. Android is an operating system that is gaining popularity these days, and with the rise in demand for these devices comes a rise in Android malware. The traditional techniques used to detect malware were unable to detect unknown applications. We have run these datasets through different machine learning classifiers and recorded the results. The experimental results provide a comparative analysis based on performance, accuracy, and cost.
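The comparative evaluation described above boils down to running several classifiers over the same labeled test set and tabulating a metric per classifier. A minimal sketch, where the two rule-based "classifiers" and the toy feature vectors are stand-ins for real malware models and Drebin/Malgenome features:

```python
# Minimal comparative-evaluation harness: score several classifiers on one
# labeled test set. The rules and features below are toy stand-ins.

def accuracy(classifier, samples):
    hits = sum(1 for features, label in samples if classifier(features) == label)
    return hits / len(samples)

# Toy feature vectors: (num_permissions, uses_sms); label 1 = malware.
test_set = [((12, 1), 1), ((3, 0), 0), ((15, 1), 1), ((2, 0), 0), ((9, 0), 1)]

classifiers = {
    "many-permissions rule": lambda f: 1 if f[0] > 8 else 0,
    "sms rule": lambda f: f[1],
}

# Report classifiers from best to worst accuracy.
for name, clf in sorted(classifiers.items(),
                        key=lambda kv: -accuracy(kv[1], test_set)):
    print(f"{name}: {accuracy(clf, test_set):.2f}")
```

In practice each entry in `classifiers` would be a trained model's predict function, and further metrics (cost, runtime) would be tabulated the same way.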


2021 ◽  
Vol 87 (6) ◽  
pp. 445-455
Author(s):  
Yi Ma ◽  
Zezhong Zheng ◽  
Yutang Ma ◽  
Mingcang Zhu ◽  
Ran Huang ◽  
...  

Many manifold learning algorithms conduct an eigenvector analysis on a data-similarity matrix of size N×N, where N is the number of data points. Thus, the memory complexity of the analysis is no less than O(N²). We present in this article an incremental manifold learning approach to handle large hyperspectral data sets for land-use identification. In our method, the number of dimensions for the high-dimensional hyperspectral-image data set is obtained from the training data set. A local curvature variation algorithm is utilized to sample a subset of data points as landmarks. Then a manifold skeleton is identified based on the landmarks. Our method is validated on three AVIRIS hyperspectral data sets, outperforming the comparison algorithms with a k-nearest-neighbor classifier and achieving the second-best performance with a support vector machine.
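The landmark-sampling idea can be illustrated with a simple proxy. The abstract does not define the local curvature variation criterion, so the sketch below approximates it by a point's distance from the centroid of its k nearest neighbors (larger value = higher local variation); all names and parameters are assumptions:

```python
# Hedged sketch of landmark selection for an incremental manifold method.
# The curvature-variation criterion is approximated by a simple proxy:
# distance of a point from the centroid of its k nearest neighbors.

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def curvature_proxy(points, i, k=2):
    """Distance from point i to the centroid of its k nearest neighbors."""
    neighbors = sorted((dist2(points[i], p), j)
                       for j, p in enumerate(points) if j != i)[:k]
    centroid = [sum(points[j][d] for _, j in neighbors) / k
                for d in range(len(points[i]))]
    return dist2(points[i], centroid)

def pick_landmarks(points, m, k=2):
    """Return indices of the m points with the highest variation score."""
    ranked = sorted(range(len(points)),
                    key=lambda i: -curvature_proxy(points, i, k))
    return ranked[:m]

# Points on a line plus one off-manifold outlier; the outlier scores highest.
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (1.5, 2.0)]
print(pick_landmarks(pts, m=1))  # -> [4]
```

A full implementation would then build the manifold skeleton from the landmarks and embed the remaining points incrementally, keeping memory well below the O(N²) similarity matrix.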


Diagnostics ◽  
2019 ◽  
Vol 9 (3) ◽  
pp. 104 ◽  
Author(s):  
Ahmed ◽  
Yigit ◽  
Isik ◽  
Alpkocak

Leukemia is a fatal cancer and has two main types: acute and chronic. Each type has two subtypes: lymphoid and myeloid. Hence, in total, there are four subtypes of leukemia. This study proposes a new approach for diagnosing all subtypes of leukemia from microscopic blood cell images using convolutional neural networks (CNN), which require a large training data set. Therefore, we also investigated the effects of data augmentation, synthetically increasing the number of training samples. We used two publicly available leukemia data sources: ALL-IDB and ASH Image Bank. Next, we applied seven different image transformation techniques as data augmentation. We designed a CNN architecture capable of recognizing all subtypes of leukemia. In addition, we explored other well-known machine learning algorithms such as naive Bayes, support vector machine, k-nearest neighbor, and decision tree. To evaluate our approach, we set up a series of experiments and used 5-fold cross-validation. The experimental results showed that our CNN model achieves 88.25% and 81.74% accuracy in leukemia-versus-healthy and multiclass classification of all subtypes, respectively. Finally, we also showed that the CNN model performs better than the other well-known machine learning algorithms.
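The data-augmentation step above can be sketched as generating extra training samples from each image via geometric transforms. The abstract does not list the seven techniques, so the flips and rotations below are illustrative assumptions:

```python
# Sketch of image data augmentation: derive several variants from one
# training image. The specific transforms are illustrative assumptions.

def hflip(img):
    """Mirror left-right."""
    return [row[::-1] for row in img]

def vflip(img):
    """Mirror top-bottom."""
    return img[::-1]

def rot90(img):
    """Rotate 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def augment(img):
    """Original plus flips and 90/180/270-degree rotations."""
    out = [img, hflip(img), vflip(img)]
    r = img
    for _ in range(3):
        r = rot90(r)
        out.append(r)
    return out

sample = [[1, 0],
          [0, 0]]
print(len(augment(sample)))  # -> 6 samples from one original
```

Real pipelines would also apply photometric transforms (blur, brightness shifts, noise) and feed the enlarged set to the CNN.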


Author(s):  
SHITALA PRASAD ◽  
GYANENDRA K. VERMA ◽  
BHUPESH KUMAR SINGH ◽  
PIYUSH KUMAR

This paper proposes a novel approach to feature extraction based on the segmentation and morphological alteration of handwritten multi-lingual characters. We explored multi-resolution and multi-directional transforms such as the wavelet, curvelet, and ridgelet transforms to extract classifying features from handwritten multi-lingual images. We discuss the pros and cons of each multi-resolution algorithm and conclude that curvelet-based feature extraction is the most promising for multi-lingual character recognition. We also applied morphological operations such as thinning and thickening, and then performed feature-level fusion to create a robust feature vector for classification. Classification is performed with K-nearest neighbor (K-NN) and support vector machine (SVM) classifiers, and their relative performance is compared. We experimented with our in-house dataset, compiled in our lab from more than 50 writers.
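The feature-level fusion step above amounts to concatenating the per-transform feature vectors into one vector before classification. A minimal sketch, with toy numbers standing in for real curvelet/wavelet coefficients:

```python
# Sketch of feature-level fusion: concatenate feature vectors from
# several transforms into one classification vector. Values are toy data.

def fuse(*feature_sets):
    """Concatenate per-transform feature vectors into one fused vector."""
    fused = []
    for feats in feature_sets:
        fused.extend(feats)
    return fused

curvelet_feats = [0.8, 0.1]        # stand-in for curvelet energies
wavelet_feats = [0.3, 0.5, 0.2]    # stand-in for wavelet energies
vector = fuse(curvelet_feats, wavelet_feats)
print(vector)  # -> [0.8, 0.1, 0.3, 0.5, 0.2]
```

The fused vector would then be fed to the K-NN or SVM classifier; per-feature normalization before fusion is usually needed so one transform's scale does not dominate.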


2021 ◽  
Vol 12 (1) ◽  
pp. 13
Author(s):  
Rachmad Jibril Al Kautsar ◽  
Fitri Utaminingrum ◽  
Agung Setia Budi

The number of Indonesian citizens who use motorized vehicles increases every year. Every motorcyclist in Indonesia must wear a helmet when riding a motorcycle. Even though the rules require motorbike riders to wear helmets, many motorists still disobey them. To overcome this, police officers have carried out various operations (such as traffic operations and warnings). This is not effective because of the limited number of police officers available and the probability that officers make mistakes when detecting violations, for example due to fatigue. This study proposes a system that detects motorcyclists who do not wear helmets through a surveillance camera. For this purpose, the Circular Hough Transform (CHT), Histogram of Oriented Gradients (HOG), and K-Nearest Neighbor (KNN) are used. Testing with images taken from surveillance cameras, divided into 200 training data and 40 testing data, obtained an accuracy rate of 82.5%.
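The HOG stage of the pipeline above can be illustrated with a stripped-down descriptor: compute gradient orientations over a grayscale patch and accumulate magnitude-weighted histogram bins. The bin count and unsigned-orientation choice are assumptions; a full HOG also uses cells, blocks, and normalization:

```python
# Hedged sketch of a HOG-style descriptor: one orientation histogram over
# a whole patch. Bin count and unsigned angles are assumptions.
import math

def hog_histogram(img, bins=9):
    """Magnitude-weighted orientation histogram of a grayscale patch."""
    h, w = len(img), len(img[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]   # central differences
            gy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(gx, gy)
            ang = math.degrees(math.atan2(gy, gx)) % 180  # unsigned
            hist[int(ang // (180 / bins)) % bins] += mag
    return hist

# Vertical edge: strong horizontal gradient, so all mass lands in the 0° bin.
patch = [[0, 0, 9, 9]] * 4
print(hog_histogram(patch))
```

In the full system, CHT would first locate candidate head regions (circles), HOG would describe them, and KNN would decide helmet versus no helmet.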


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Yi Li ◽  
Chance M. Nowak ◽  
Uyen Pham ◽  
Khai Nguyen ◽  
Leonidas Bleris

Herein, we implement and assess machine learning architectures to ascertain models that differentiate healthy from apoptotic cells using exclusively forward (FSC) and side (SSC) scatter flow cytometry information. To generate training data, colorectal cancer HCT116 cells were subjected to miR-34a treatment and then classified using a conventional Annexin V/propidium iodide (PI)-staining assay. The apoptotic cells were defined as Annexin V-positive cells, which include early and late apoptotic cells, necrotic cells, as well as other dying or dead cells. In addition to the fluorescent signal, we collected cell size and granularity information from the FSC and SSC parameters. Both parameters are subdivided into area, height, and width, thus providing a total of six numerical features that informed and trained our models. A collection of logistic regression, random forest, k-nearest neighbor, multilayer perceptron, and support vector machine models was trained and tested for classification performance in predicting cell states using only the six aforementioned numerical features. Out of 1046 candidate models, a multilayer perceptron was chosen, with 0.91 live precision, 0.93 live recall, 0.92 live f value, and 0.97 live area under the ROC curve when applied to standardized data. We discuss and highlight differences in classifier performance and compare the results to the standard practice of forward and side scatter gating, typically performed to select cells based on size and/or complexity. We demonstrate that our model, a ready-to-use module for any flow cytometry-based analysis, can provide automated, reliable, and stain-free classification of healthy and apoptotic cells using exclusively size and granularity information.
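The per-class metrics reported above (precision, recall, f value for the "live" class) follow directly from confusion counts. A short sketch with toy counts, not the study's data:

```python
# Sketch of the per-class metrics reported above, computed from confusion
# counts (tp/fp/fn). The counts below are toy numbers, not the study's data.

def precision_recall_f(tp, fp, fn):
    """Precision, recall, and F1 ("f value") for one class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_value = 2 * precision * recall / (precision + recall)
    return precision, recall, f_value

# Toy confusion counts for the "live" class.
p, r, f = precision_recall_f(tp=90, fp=10, fn=10)
print(round(p, 2), round(r, 2), round(f, 2))  # -> 0.9 0.9 0.9
```

Model selection among the 1046 candidates would compare these metrics (plus area under the ROC curve) on held-out data.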


Respati ◽  
2018 ◽  
Vol 13 (2) ◽  
Author(s):  
Eri Sasmita Susanto ◽  
Kusrini Kusrini ◽  
Hanif Al Fatta

ABSTRACT
This research focuses on a feasibility test of graduation prediction for students of Universitas AMIKOM Yogyakarta. The authors chose the K-Nearest Neighbors (K-NN) algorithm because K-NN can process numerical data and does not require a complicated iterative parameter-estimation scheme, which means it can be applied to large datasets. The input of the system is sample data in the form of student data from 2014-2015. Testing in this research used two sets: testing data and training data. The criteria used in this study are the grade-point index for semesters 1-4, credit (SKS) achievement, and graduation status. The output of the system is a graduation prediction divided into two classes: on-time and not-on-time graduation. The test results show that k = 14 with k-fold = 5 produces the best performance in predicting students' graduation with the K-Nearest Neighbor method using the 4-semester grade-point index, with accuracy = 98.46%, precision = 99.53%, and recall = 97.64%.
Keywords: K-Nearest Neighbors Algorithm, Graduation Prediction, Testing Data, Training Data
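The k-fold = 5 evaluation scheme used above can be sketched as index splitting: each fold serves once as testing data while the remainder is training data. The even-split handling below is an illustrative simplification:

```python
# Sketch of k-fold cross-validation index splitting (here k-fold = 5):
# each fold is the test set once, the rest is the training set.

def kfold_indices(n, folds=5):
    """Yield (train_indices, test_indices) pairs over n samples."""
    size = n // folds
    idx = list(range(n))
    for f in range(folds):
        test = idx[f * size:(f + 1) * size]
        train = idx[:f * size] + idx[(f + 1) * size:]
        yield train, test

splits = list(kfold_indices(10, folds=5))
print(len(splits))    # -> 5 folds
print(splits[0][1])   # -> [0, 1]  (first test fold)
```

The reported accuracy/precision/recall would be averaged over the five folds; in practice the indices are shuffled (often stratified by class) before splitting.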


2019 ◽  
Vol 11 (2) ◽  
pp. 307
Author(s):  
Asrianda Asrianda ◽  
Risawandi Risawandi ◽  
Gunarwan Gunarwan

K-Nearest Neighbor is a method that classifies data based on the closest distance. K-NN is a supervised learning algorithm whose learning process is based on the value of the target variable associated with the values of the predictor variables. In the K-NN algorithm, all data must have a label, so when new data is given, it is compared with the existing data and the most similar data is taken by looking at its label. Filling in and processing many questionnaires to determine the results of lecturer performance evaluation certainly requires a lot of time and effort. Therefore, the K-NN Manhattan-distance method is applied. In this study, the testing data is taken from one of the training data points and has the classification result "Very Good". After applying the K-NN Manhattan-distance method, taking k as the closest/smallest neighbor, the following results are obtained: a distance of 5.4, the classification result "Very Good", and a similarity value of 74.03%. Based on these results, the classification produced by the K-NN Manhattan-distance method matches the pre-existing classification.
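The Manhattan-distance K-NN step above can be sketched directly. The questionnaire scores and labels below are toy values, and the abstract's distance-to-similarity conversion (5.4 → 74.03%) is not specified, so it is omitted:

```python
# Sketch of K-NN with Manhattan (city-block) distance. The feature values
# and labels are toy stand-ins for questionnaire scores.

def manhattan(a, b):
    """City-block distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_manhattan(train, query, k=1):
    """train: list of (features, label); majority label of k nearest."""
    ranked = sorted(train, key=lambda t: manhattan(t[0], query))
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)

train = [([4, 4, 5], "Very Good"),
         ([2, 3, 2], "Good"),
         ([1, 1, 2], "Fair")]
query = [4, 5, 5]
print(knn_manhattan(train, query, k=1))  # -> Very Good
```

Swapping `manhattan` for a Euclidean distance function turns this back into the standard K-NN used elsewhere in these studies.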


2021 ◽  
Vol 6 (2) ◽  
pp. 111-119
Author(s):  
Daurat Sinaga ◽  
Feri Agustina ◽  
Noor Ageng Setiyanto ◽  
Suprayogi Suprayogi ◽  
Cahaya Jatmoko

Indonesia is one of the countries with great faunal wealth. Various types of fauna are scattered throughout Indonesia. One of them is birds. Birds are often bred as pets because of their characteristic voices and body features. This study uses the Gray Level Co-occurrence Matrix (GLCM) with the k-Nearest Neighbor (K-NN) algorithm. The data used in this study were 66 images, divided into 55 training data and 11 testing data. The feature values used in this study are based on GLCM feature extraction: contrast, correlation, energy, homogeneity, and entropy, which are then classified using the k-Nearest Neighbor (K-NN) algorithm with Euclidean distance. The classification results using k-Nearest Neighbor (K-NN) show that the highest accuracy, 54.54%, is obtained at K = 1 and an angle of 0°.
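The GLCM features named above can be sketched for the 0° case: count co-occurrences of gray levels one pixel to the right, normalize, and derive statistics. Entropy and correlation are omitted for brevity, and the two-level toy image is an illustrative assumption:

```python
# Hedged sketch of GLCM texture features at 0 degrees (offset: one pixel
# to the right). Toy two-level image; entropy/correlation omitted.

def glcm_0deg(img, levels=2):
    """Normalized co-occurrence matrix for horizontally adjacent pixels."""
    m = [[0] * levels for _ in range(levels)]
    for row in img:
        for a, b in zip(row, row[1:]):
            m[a][b] += 1
    total = sum(sum(r) for r in m)
    return [[v / total for v in r] for r in m]

def glcm_features(m):
    """Contrast, energy, and homogeneity of a normalized GLCM."""
    contrast = sum(p * (i - j) ** 2
                   for i, r in enumerate(m) for j, p in enumerate(r))
    energy = sum(p * p for r in m for p in r)
    homogeneity = sum(p / (1 + abs(i - j))
                      for i, r in enumerate(m) for j, p in enumerate(r))
    return contrast, energy, homogeneity

img = [[0, 0, 1, 1],
       [0, 0, 1, 1]]
print(glcm_features(glcm_0deg(img)))
```

Other angles (45°, 90°, 135°) just change the pixel offset; the resulting feature vector is what the K-NN classifier with Euclidean distance consumes.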

