Heart disease prediction system based on hidden naïve bayes classifier

Author(s):  
M. A. Jabbar ◽  
Shirina Samreen
2015 ◽  
Vol 50 (4) ◽  
pp. 293-296 ◽  
Author(s):  
D Chaki ◽  
A Das ◽  
MI Zaber

The classification of heart disease patients is of great importance in cardiovascular disease diagnosis. Numerous data mining techniques have been used so far by the researchers to aid health care professionals in the diagnosis of heart disease. For this task, many algorithms have been proposed in the previous few years. In this paper, we have studied different supervised machine learning techniques for classification of heart disease data and have performed a procedural comparison of these. We have used the C4.5 decision tree classifier, a naïve Bayes classifier, and a Support Vector Machine (SVM) classifier over a large set of heart disease data. The data used in this study is the Cleveland Clinic Foundation Heart Disease Data Set available at UCI Machine Learning Repository. We have found that SVM outperformed both naïve Bayes and C4.5 classifier, giving the best accuracy rate of correctly classifying highest number of instances. We have also found naïve Bayes classifier achieved a competitive performance though the assumption of normality of the data is strongly violated.Bangladesh J. Sci. Ind. Res. 50(4), 293-296, 2015


2013 ◽  
Vol 2013 ◽  
pp. 1-9 ◽  
Author(s):  
Anunchai Assawamakin ◽  
Supakit Prueksaaroon ◽  
Supasak Kulawonganunchai ◽  
Philip James Shaw ◽  
Vara Varavithya ◽  
...  

Identification of suitable biomarkers for accurate prediction of phenotypic outcomes is a goal for personalized medicine. However, current machine learning approaches are either too complex or perform poorly. Here, a novel two-step machine-learning framework is presented to address this need. First, a Naïve Bayes estimator is used to rank features from which the top-ranked will most likely contain the most informative features for prediction of the underlying biological classes. The top-ranked features are then used in a Hidden Naïve Bayes classifier to construct a classification prediction model from these filtered attributes. In order to obtain the minimum set of the most informative biomarkers, the bottom-ranked features are successively removed from the Naïve Bayes-filtered feature list one at a time, and the classification accuracy of the Hidden Naïve Bayes classifier is checked for each pruned feature set. The performance of the proposed two-step Bayes classification framework was tested on different types of -omicsdatasets including gene expression microarray, single nucleotide polymorphism microarray (SNParray), and surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) proteomic data. The proposed two-step Bayes classification framework was equal to and, in some cases, outperformed other classification methods in terms of prediction accuracy, minimum number of classification markers, and computational time.


2020 ◽  
Vol 8 (5) ◽  
pp. 4105-4110

In the current scenario, the researchers are focusing towards health care project for the prediction of the disease and its type. In addition to the prediction, there exists a need to find the influencing parameter that directly related to the disease prediction. The analysis of the parameters needed to the prediction of the disease still remains a challenging issue. With this view, we focus on predicting the heart disease by applying the dataset with boosting the parameters of the dataset. The heart disease data set extracted from UCI Machine Learning Repository is used for implementation. The anaconda Navigator IDE along with Spyder is used for implementing the Python code. Our contribution is folded is folded in three ways. First, the data preprocessing is done and the attribute relationship is identified by the correlation values. Second, the data set is fitted to random boost regressor and the important features are identified. Third, the dataset is feature scaled reduced and then fitted to random forest classifier, decision tree classifier, Naïve bayes classifier, logistic regression classifier, kernel support vector machine and KNN classifier. Fourth, the dataset is reduced with principal component analysis with five components and then fitted to the above mentioned classifiers. Fifth, the performance of the classifiers is analyzed with the metrics like accuracy, recall, fscore and precision. Experimental results shows that, the Naïve bayes classifier is more effective with the precision, Recall and Fscore of 0.89 without random boost, 0.88 with random boosting and 0.90 with principal component analysis. Experimental results show, the Naïve bayes classifier is more effective with the accuracy of 89% without random boost, 90% with random boosting and 91% with principal component analysis.


2020 ◽  
Vol 4 (2) ◽  
pp. 437 ◽  
Author(s):  
Dito Putro Utomo ◽  
Mesran Mesran

Heart disease is a disease with a high mortality rate, there are 12 million deaths each year worldwide. This is what causes the need for early diagnosis to find out the heart disease. But the process of diagnosis is quite challenging because of the complex relationship between the attributes of heart disease. So it is important to know the main attributes that are used as a decision making process or the classification process in heart disease. In this study the dataset used has 57 types of attributes in it. So that reduction is needed to shorten the diagnostic process, the reduction process can be carried out using the Principal Component Analysis (PCA) method. The PCA method itself can be combined with data mining calcification techniques to measure the accuracy of the dataset. This study compares the accuracy rate using the C5.0 algorithm and the Naïve Bayes Classifier (NBC) algorithm, the results obtained both after and before the reduction are Naïve Bayes Classifier (NBC) algorithms that have better performance than the C5.0 algorithm


Sign in / Sign up

Export Citation Format

Share Document