Application of Genetic Algorithm and K-Nearest Neighbour Method in Real World Medical Fraud Detection Problem

Author(s):  
Hongxing He ◽  
Simon Hawkins ◽  
Warwick Graco ◽  
Xin Yao ◽  
...  

In the k-Nearest Neighbour (kNN) algorithm, the classification of a new sample is determined by the class of its k nearest neighbours. The performance of the kNN algorithm is influenced by three main factors: (1) the distance metric used to locate the nearest neighbours; (2) the decision rule used to derive a classification from the k nearest neighbours; and (3) the number of neighbours used to classify the new sample. Using k = 1, 3, or 5 nearest neighbours, this study uses a Genetic Algorithm (GA) to find the optimal non-Euclidean distance metric in the kNN algorithm and examines two alternative methods (Majority Rule and Bayes Rule) to derive a classification from the k nearest neighbours. This modified algorithm was evaluated on two real-world medical fraud problems. The General Practitioner (GP) database is a 2-class problem in which GPs are classified as either practising appropriately or inappropriately. The 'Doctor-Shoppers' database is a 5-class problem in which patients are classified according to the likelihood that they are 'doctor-shoppers'. Doctor-shoppers are patients who consult many physicians in order to obtain multiple prescriptions of drugs of addiction in excess of their own therapeutic need. In both applications, classification accuracy was improved by optimising the distance metric in the kNN algorithm. The agreement rate improved from around 70% (using Euclidean distance) to 78% (using an optimised distance metric) on the GP dataset, and from about 55% to 82% on the Doctor-Shoppers dataset. Differences in either the decision rule or the number of nearest neighbours had little or no impact on the classification performance of the kNN algorithm. The excellent performance of the kNN algorithm when the distance metric is optimised using a genetic algorithm paves the way for its application to the real-world fraud detection problems faced by the Health Insurance Commission (HIC).
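To illustrate why the choice of distance metric matters more than the other two factors, here is a minimal sketch of kNN with the Majority Rule. It assumes the optimised metric takes the form of a per-feature weighted Euclidean distance, which the abstract does not state; the function names, toy data and weights are all hypothetical.

```python
import math
from collections import Counter

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance (assumed form); all weights = 1 is plain Euclidean."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def knn_classify(train, query, weights, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    `train` is a list of (feature_vector, label) pairs.
    """
    neighbours = sorted(train, key=lambda s: weighted_distance(s[0], query, weights))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
train = [
    ((0.0, 9.0), "appropriate"),
    ((0.2, 1.0), "appropriate"),
    ((0.1, 5.0), "appropriate"),
    ((1.0, 8.5), "inappropriate"),
    ((0.9, 1.5), "inappropriate"),
    ((1.1, 5.5), "inappropriate"),
]
query = (0.15, 8.6)

euclidean = knn_classify(train, query, weights=(1.0, 1.0), k=3)
reweighted = knn_classify(train, query, weights=(1.0, 0.0), k=3)
print(euclidean, reweighted)  # inappropriate appropriate
```

With equal weights the noisy second feature dominates the neighbour search and the query is misclassified; zeroing its weight recovers the expected class. A GA would search for such weights rather than have them set by hand.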

2019 ◽  
Vol 11 (1) ◽  
pp. 16-19
Author(s):  
Felix Indra Kurniadi ◽  
Vinnia Kemala Putri

White blood cells protect the human body from viruses, bacteria and other harmful substances. In this research, Local Binary Pattern was proposed for feature extraction, with Euclidean, Chebyshev and Minkowski distances used for classification.
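The three distances named here are closely related, which a short sketch makes visible. The feature vectors below are hypothetical stand-ins, not Local Binary Pattern histograms from the study.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p (p = 1 is Manhattan, p = 2 is Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def euclidean(a, b):
    """Euclidean distance, the p = 2 case of the Minkowski distance."""
    return minkowski(a, b, 2)

def chebyshev(a, b):
    """Chebyshev distance: the largest coordinate difference (the p -> infinity limit)."""
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical feature vectors.
u, v = (0, 0), (3, 4)
print(euclidean(u, v))     # 5.0
print(chebyshev(u, v))     # 4
print(minkowski(u, v, 1))  # 7.0
```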


1996 ◽  
Vol 11 (3) ◽  
pp. 245-252
Author(s):  
W. Z. Liu

The basic nearest neighbour algorithm works by storing the training instances and classifying a new case by predicting that it has the same class as its nearest stored instance. To measure the distance between instances, some distance metric needs to be used. In situations when all attributes have numeric values, the conventional nearest neighbour method treats examples as points in feature space and uses Euclidean distance as the distance metric. In tasks with only nominal attributes, the simple "overlap" metric is usually used. To handle classification tasks that have mixed types of attributes, the two different metrics are simply combined. Work by researchers in the machine learning field has shown that this approach performs poorly. This paper attempts to study a more recently developed distance metric and show that this metric is capable of measuring the importance of different attributes. With the use of discretisation for numeric-valued attributes, this method provides an integrated way of dealing with problem domains with mixtures of attribute types. Through detailed analyses, this paper tries to provide further insights into the understanding of nearest neighbour classification techniques and promote further use of this type of classification algorithm.
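The conventional baseline described above, Euclidean distance on numeric attributes combined with the overlap metric on nominal ones, can be sketched as follows. This is the simple combination the abstract says performs poorly, not the more recent metric the paper studies. Range normalisation of the numeric attributes and all names and data are assumptions made for illustration.

```python
import math

def overlap(x, y):
    """Overlap metric for nominal values: 0 if equal, 1 otherwise."""
    return 0.0 if x == y else 1.0

def mixed_distance(a, b, numeric_ranges):
    """Combine range-normalised numeric differences with overlap terms.

    `numeric_ranges` maps an attribute index to its (min, max) range;
    indices missing from the mapping are treated as nominal.
    """
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in numeric_ranges:
            lo, hi = numeric_ranges[i]
            d = (x - y) / (hi - lo) if hi > lo else 0.0
        else:
            d = overlap(x, y)
        total += d * d
    return math.sqrt(total)

def nearest_neighbour(train, query, numeric_ranges):
    """Return the label of the stored instance closest to `query`."""
    return min(train, key=lambda s: mixed_distance(s[0], query, numeric_ranges))[1]

# Hypothetical instances: (age, colour) -> class.
train = [((30, "red"), "A"), ((60, "blue"), "B")]
ranges = {0: (20, 70)}
predicted = nearest_neighbour(train, (35, "blue"), ranges)
print(predicted)  # B
```

Here a single nominal mismatch contributes a full unit of distance and outweighs the numeric closeness, which hints at why treating every attribute as equally important can mislead the classifier.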


2014 ◽  
Vol 2014 ◽  
pp. 1-11 ◽  
Author(s):  
Kirsi Varpa ◽  
Kati Iltanen ◽  
Martti Juhola

Genetic algorithms have been utilized in many complex optimization and simulation tasks because of their powerful search method. In this research we studied whether the classification performance of the attribute weighted methods based on the nearest neighbour search can be improved when using the genetic algorithm in the evolution of attribute weighting. The attribute weights in the starting population were based on the weights set by the application area experts and machine learning methods instead of random weight setting. The genetic algorithm improved the total classification accuracy and the median true positive rate of the attribute weighted k-nearest neighbour method using neighbour’s class-based attribute weighting. With other methods, the changes after genetic algorithm were moderate.
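A minimal sketch of the idea, assuming a weighted Euclidean 1-nearest-neighbour classifier scored by leave-one-out accuracy: the starting population is built from given seed weights rather than random ones, and an elitist GA then evolves them. The operators, parameters, toy data and names are illustrative assumptions, not the authors' configuration.

```python
import math
import random

def weighted_distance(a, b, w):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(wi * (x - y) ** 2 for wi, x, y in zip(w, a, b)))

def loo_accuracy(data, w):
    """Leave-one-out accuracy of a weighted 1-NN classifier."""
    hits = 0
    for i, (x, label) in enumerate(data):
        others = data[:i] + data[i + 1:]
        nearest = min(others, key=lambda s: weighted_distance(s[0], x, w))
        hits += nearest[1] == label
    return hits / len(data)

def evolve_weights(data, seeds, generations=15, pop_size=10, sigma=0.2, seed=0):
    """Evolve attribute weights, starting from `seeds` instead of random weights."""
    rng = random.Random(seed)
    n = len(seeds[0])
    # Initial population: the seed weights plus mutated copies of them.
    population = [list(s) for s in seeds]
    while len(population) < pop_size:
        parent = rng.choice(seeds)
        population.append([max(0.0, g + rng.gauss(0, sigma)) for g in parent])
    for _ in range(generations):
        ranked = sorted(population, key=lambda w: loo_accuracy(data, w), reverse=True)
        parents = ranked[: pop_size // 2]
        children = [ranked[0]]  # elitism: the best individual always survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n) if n > 1 else 0  # one-point crossover
            child = a[:cut] + b[cut:]
            children.append([max(0.0, g + rng.gauss(0, sigma)) for g in child])
        population = children
    return max(population, key=lambda w: loo_accuracy(data, w))

# Hypothetical toy data: feature 0 is informative, feature 1 is noise.
data = [
    ((0.0, 9.0), 0), ((0.2, 1.0), 0), ((0.1, 5.0), 0),
    ((1.0, 8.5), 1), ((0.9, 1.5), 1), ((1.1, 5.5), 1),
]
expert_seeds = [[1.0, 1.0]]
best = evolve_weights(data, expert_seeds)
print(best, loo_accuracy(data, best))
```

Because the best individual is carried over unchanged each generation, the returned weights can never score worse than the seed weights on the fitness measure.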


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Silvia Zaoli ◽  
Piero Mazzarisi ◽  
Fabrizio Lillo

Betweenness centrality quantifies the importance of a vertex for the information flow in a network. The standard betweenness centrality applies to static single-layer networks, but many real-world networks are both dynamic and made of several layers. We propose a definition of betweenness centrality for temporal multiplexes. This definition accounts for the topological and temporal structure and for the duration of paths in the determination of the shortest paths. We propose an algorithm to compute the new metric using a mapping to a static graph. We apply the metric to a dataset of ∼20k European flights and compare the results with those obtained with static or single-layer metrics. The differences in the airport rankings highlight the importance of considering the temporal multiplex structure and an appropriate distance metric.
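For reference, the standard static, single-layer betweenness centrality that the proposed metric generalises can be computed as sketched below, using Brandes-style dependency accumulation on an unweighted directed graph. This is not the temporal multiplex definition or the mapping proposed in the paper; the graph and names are hypothetical.

```python
from collections import deque

def betweenness(graph):
    """Unnormalised shortest-path betweenness for an unweighted directed graph.

    `graph` maps every vertex to a list of its successors.
    """
    bc = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in graph}
        sigma[s] = 1
        preds = {v: [] for v in graph}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in reverse BFS order.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc

# Hypothetical three-vertex path a -> b -> c: only b lies between other vertices.
scores = betweenness({"a": ["b"], "b": ["c"], "c": []})
print(scores)  # {'a': 0.0, 'b': 1.0, 'c': 0.0}
```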


2015 ◽  
Vol 21 (S4) ◽  
pp. 218-223 ◽  
Author(s):  
D. Dowsett

Two techniques for use with SIMION [1] are presented, boundary matching and genetic optimization. The first allows systems which were previously difficult or impossible to simulate in SIMION to be simulated with great accuracy. The second allows any system to be rapidly and robustly optimized using a parallelized genetic algorithm. Each method will be described along with examples of real world applications.


Author(s):  
Brian Murphy ◽  
Geraldine B. Boylan ◽  
Gordon Lightbody ◽  
William P. Marnane
