A New K-Nearest Neighbors Classifier for Big Data Based on Efficient Data Pruning

Mathematics ◽  
2020 ◽  
Vol 8 (2) ◽  
pp. 286 ◽  
Author(s):  
Hamid Saadatfar ◽  
Samiyeh Khosravi ◽  
Javad Hassannataj Joloudari ◽  
Amir Mosavi ◽  
Shahaboddin Shamshirband

The K-nearest neighbors (KNN) machine learning algorithm is a well-known non-parametric classification method. However, like other traditional data mining methods, applying it to big data comes with computational challenges. KNN determines the class of a new sample from the classes of its nearest neighbors, but identifying those neighbors in a large dataset imposes a computational cost so high that the method is no longer practical on a single machine. One of the techniques proposed to make classification methods applicable to large datasets is pruning. LC-KNN is an improved KNN method that first clusters the data into smaller partitions using K-means clustering and then, for each new sample, applies KNN on the partition whose center is nearest. However, because the clusters have different shapes and densities, selecting the appropriate cluster is a challenge. This paper proposes an approach that improves the pruning phase of LC-KNN by taking these factors into account. The proposed approach helps to choose a more appropriate cluster in which to search for neighbors, thereby increasing classification accuracy. Its performance is evaluated on several real datasets. The experimental results show the effectiveness of the proposed approach, with higher classification accuracy and lower time cost than other recent relevant methods.
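The baseline LC-KNN idea described in this abstract (cluster with K-means, then run KNN only inside the partition whose center is nearest) can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' implementation: the function names (`assign`, `kmeans`, `lc_knn_predict`), the Euclidean distance, the random initialization, and the fixed iteration count are all assumptions, and the shape- and density-aware cluster selection proposed in the paper is not reproduced.

```python
import math
import random
from collections import Counter


def assign(points, centers):
    """Group point indices by their nearest center (Euclidean distance)."""
    groups = [[] for _ in centers]
    for i, p in enumerate(points):
        nearest = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
        groups[nearest].append(i)
    return groups


def kmeans(points, n_clusters, n_iter=20, seed=0):
    """Plain K-means; returns (centers, groups of point indices)."""
    rng = random.Random(seed)
    centers = rng.sample(points, n_clusters)  # hypothetical init: random data points
    for _ in range(n_iter):
        groups = assign(points, centers)
        centers = [
            # Mean of the members, or the old center if the cluster is empty.
            tuple(sum(col) / len(members) for col in zip(*(points[i] for i in members)))
            if members else centers[c]
            for c, members in enumerate(groups)
        ]
    return centers, assign(points, centers)


def lc_knn_predict(query, points, labels, centers, groups, k=3):
    """Pruning step: search for neighbors only in the nearest-center partition."""
    nearest = min(range(len(centers)), key=lambda c: math.dist(query, centers[c]))
    members = groups[nearest]
    neighbors = sorted(members, key=lambda i: math.dist(query, points[i]))[:k]
    # Majority vote among the k nearest members of that partition.
    return Counter(labels[i] for i in neighbors).most_common(1)[0][0]
```

Because the neighbor search touches one partition rather than the whole dataset, the cost per query drops, but a query near a cluster boundary may miss true neighbors that sit in an adjacent partition. That is the cluster-selection problem the abstract refers to.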

2018 ◽  
Vol 14 (9) ◽  
pp. 1213-1225 ◽  
Author(s):  
Vo Ngoc Phu ◽  
Vo Thi Ngoc Tran

Author(s):  
Ahmed.T. Sahlol ◽  
Aboul Ella Hassanien

Many obstacles remain to achieving high recognition accuracy in Arabic handwritten optical character recognition systems: each character has a different shape, and many characters resemble one another. In this chapter, several bio-inspired optimization algorithms for feature selection, including the Bat Algorithm, Grey Wolf Optimization, the Whale Optimization Algorithm, Particle Swarm Optimization, and the Genetic Algorithm, are presented and applied to Arabic handwritten character recognition to assess their ability and accuracy in recognizing Arabic characters. The experiments were performed on a benchmark dataset, CENPARMI, using k-nearest neighbors, Linear Discriminant Analysis, and random forests. The results show that the features selected by the optimization algorithms outperform the whole feature set in terms of both classification accuracy and processing time.


2020 ◽  
Vol 28 (5) ◽  
pp. 874-886 ◽  
Author(s):  
Jesus Maillo ◽  
Salvador Garcia ◽  
Julian Luengo ◽  
Francisco Herrera ◽  
Isaac Triguero

2019 ◽  
Vol 16 (10) ◽  
pp. 4425-4430 ◽  
Author(s):  
Devendra Prasad ◽  
Sandip Kumar Goyal ◽  
Avinash Sharma ◽  
Amit Bindal ◽  
Virendra Singh Kushwah

Machine learning is a growing area of computer science. This article focuses on prediction analysis using the K-Nearest Neighbors (KNN) machine learning algorithm. Data in the dataset are processed, analyzed, and predicted using this algorithm. Various machine learning algorithms are introduced, and their pros and cons are discussed. The KNN algorithm is studied in detail and implemented on the specified data with certain parameters. The research work elucidates prediction analysis and explains the prediction of restaurant quality.


TEM Journal ◽  
2021 ◽  
pp. 1385-1389
Author(s):  
Phong Thanh Nguyen

Machine learning is a subset of, and a technology developed within, the field of Artificial Intelligence (AI). One of the most widely used machine learning algorithms is the K-Nearest Neighbors (KNN) approach, a supervised learning algorithm. This paper applied the KNN algorithm to predict the construction price index based on Vietnam's socio-economic variables. The data used to build the prediction model cover the period from 2016 to 2019 and comprise seven socio-economic variables that affect the construction price index (i.e., industrial production, construction investment capital, Vietnam’s stock price index, consumer price index, foreign exchange rate, total exports, and imports). The results showed that the construction price index prediction model based on the KNN regression method has lower error than the traditional method.
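For readers unfamiliar with KNN regression, the general idea is to predict a numeric target from the targets of the k closest training samples. The sketch below assumes an unweighted mean and Euclidean distance; the abstract does not state which variant, distance, or value of k the paper used, and the function name `knn_regress` is hypothetical.

```python
import math


def knn_regress(query, features, targets, k=3):
    """Predict a numeric target as the mean target of the k nearest samples."""
    # Rank training samples by Euclidean distance to the query.
    order = sorted(range(len(features)), key=lambda i: math.dist(query, features[i]))
    nearest = order[:k]
    return sum(targets[i] for i in nearest) / len(nearest)
```

In practice the input variables would usually be rescaled first, since features measured on very different scales (for example an exchange rate and a stock index) would otherwise dominate the distance unevenly.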


2019 ◽  
Vol 12 (4) ◽  
pp. 72
Author(s):  
Sara Alomari ◽  
Salha Abdullah

Concept maps have been used as an effective learning method to help learners identify relationships between pieces of information, especially when teaching materials cover many topics or concepts. However, making a concept map manually is a long and tedious task. It is time-consuming and demands intensive effort in reading the full content and reasoning about the relationships among concepts. Because of this inefficiency, many studies have been carried out to develop intelligent algorithms using several data mining techniques. In this research, the authors aim to improve the Text Analysis-Association Rules Mining (TA-ARM) algorithm by using the weighted K-nearest neighbors (KNN) algorithm instead of the traditional KNN. The weighted KNN is expected to optimize the classification accuracy, which will eventually enhance the quality of the generated concept map.
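The difference between traditional and weighted KNN can be illustrated with a small sketch. The abstract does not say which weighting scheme the authors use; inverse-distance weighting is assumed here as one common choice, and the names `weighted_knn_predict` and `eps` are hypothetical.

```python
import math
from collections import defaultdict


def weighted_knn_predict(query, points, labels, k=3, eps=1e-9):
    """Classify by a distance-weighted vote among the k nearest samples."""
    order = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))[:k]
    votes = defaultdict(float)
    for i in order:
        # Closer neighbors get larger votes; eps avoids division by zero.
        votes[labels[i]] += 1.0 / (math.dist(query, points[i]) + eps)
    return max(votes, key=votes.get)
```

With unweighted voting, two distant neighbors of one class outvote a single very close neighbor of another class; with the weighting above, the close neighbor can win.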

