Molecular Subtypes Recognition of Breast Cancer in Dynamic Contrast-Enhanced Breast Magnetic Resonance Imaging Phenotypes from Radiomics Data

Computational and Mathematical Methods in Medicine ◽

10.1155/2019/6978650 ◽

2019 ◽

Vol 2019 ◽

pp. 1-14 ◽

Cited By ~ 1

Author(s):

Wei Li ◽

Kun Yu ◽

Chaolu Feng ◽

Dazhe Zhao

Keyword(s):

Breast Cancer ◽

Decision Tree ◽

Molecular Subtypes ◽

Recursive Feature Elimination ◽

Gradient Boosting ◽

Main Role ◽

Feature Subset ◽

Lesion Segmentation ◽

Decision Tree Classifier ◽

Tree Classifier

Background and Objective. Breast cancer is a major cause of mortality among women if it is not treated at an early stage. Recognizing molecular markers directly from DCE-MRI to distinguish the four molecular subtypes without invasive biopsy helps guide treatment planning for breast cancer, because it offers a fast route to treatment decisions early on, when patients have the best opportunity. Methods. This study presents a radiomics approach to recognizing molecular subtypes from breast cancer image phenotypes. An improved region growing algorithm with a dynamic threshold and no user interaction is proposed for cancer lesion segmentation; it yields the precise border of the lesion rather than an area that includes background. The lesions are extracted automatically based on radiologists' annotations, which guarantees that the lesion is segmented correctly. Various features, including texture, morphology, dynamic kinetics, and statistical features, are extracted from the lesion data of a large patient cohort and used to validate the relationship between image phenotypes and molecular subtypes. A new multimodel-based recursive feature elimination algorithm is applied to the radiomics data generated by the feature extraction process. This method obtains a feature subset with stable performance across different classification models, and the gradient boosting decision tree model gives the best results in both classification performance and imbalance performance on molecular subtypes. Results. In the experiments, the multimodel-based recursive feature elimination algorithm selects 69 optimal features from the 143 original features, and the gradient boosting decision tree classifier achieves an accuracy of 0.87, precision of 0.88, recall of 0.87, and F1-score of 0.87. The dataset of 637 patients used in this paper is seriously imbalanced across molecular subtypes, and the robust features generated by the multimodel-based recursive feature elimination algorithm allow the gradient boosting decision tree classifier to perform well. The recognition precision for the four molecular subtypes, luminal A, luminal B, HER-2, and basal-like, is 0.91, 0.89, 0.83, and 0.87, respectively. Conclusions. The improved lesion segmentation method gives a more precise lesion edge; it saves time by extracting the lesion region of interest automatically without a per-case threshold setting, and it prevents manual segmentation errors and bias between radiologists. The multimodel-based recursive feature elimination algorithm is able to find robust, optimal features that distinguish the four molecular subtypes from image phenotypes. The gradient boosting decision tree classifier plays a larger role in recognition than the other models used in this paper.
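
The abstract does not describe how the improved region growing algorithm sets its dynamic threshold, so the following is only a generic sketch of seeded region growing, not the authors' method. As a stand-in assumption, a pixel is accepted when its intensity lies within a fixed `tolerance` of the running mean of the region, which is recomputed as the region grows. The function name, the toy image, and the tolerance value are all hypothetical.

```python
from collections import deque


def region_grow(image, seed, tolerance):
    """Grow a 4-connected region from `seed`.

    A neighbor is accepted when its intensity is within `tolerance` of the
    current region mean (assumed criterion, not the paper's rule).
    """
    rows, cols = len(image), len(image[0])
    region = {seed}
    total = image[seed[0]][seed[1]]  # sum of intensities inside the region
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                mean = total / len(region)  # reference value moves with the region
                if abs(image[nr][nc] - mean) <= tolerance:
                    region.add((nr, nc))
                    total += image[nr][nc]
                    queue.append((nr, nc))
    return region


# Toy 4x4 "scan": a bright 2x2 blob on a dark background (made-up values).
image = [
    [10, 12, 11, 10],
    [11, 90, 95, 12],
    [10, 92, 94, 11],
    [12, 11, 10, 10],
]
lesion = region_grow(image, seed=(1, 1), tolerance=15)
print(sorted(lesion))
```

In this toy case the seed would come from an annotation, which mirrors the abstract's statement that extraction is based on radiologists' annotations.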


Breast Cancer Detection using Gradient Boost Ensemble Decision Tree Classifier

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.b3664.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 2169-2173

Keyword(s):

Breast Cancer ◽

Decision Tree ◽

Cancer Detection ◽

Gradient Boosting ◽

Decision Tree Classifier ◽

Feature Extracting ◽

Light Gradient ◽

Tree Classifier ◽

Classification Time ◽

Made In

Detecting abnormalities in the human body is a major challenge for many field experts, and detecting breast cancer is one such challenge. The prime motive behind this paper is to detect breast cancer from breast images in an advanced and appropriate way. In this study, a combination of existing techniques is applied to the extracted breast images to obtain better results in detecting breast cancer. Features extracted from the images are passed to a Light gradient boosting ensemble decision tree classifier to identify benign and malignant features of an image. As a result, normal and abnormal breast cancer images are detected by combining the above applications. In addition, classification accuracy is improved and classification time is reduced compared with existing detection techniques.


Comparative Study on Heart Disease Prediction Using Feature Selection Techniques on Classification Algorithms

Applied Computational Intelligence and Soft Computing ◽

10.1155/2021/5581806 ◽

2021 ◽

Vol 2021 ◽

pp. 1-17

Author(s):

Kaushalya Dissanayake ◽

Md Gapar Md Johar

Keyword(s):

Feature Selection ◽

Heart Disease ◽

Decision Tree ◽

Recursive Feature Elimination ◽

Support Vector ◽

Classification Algorithms ◽

Feature Subset ◽

Decision Tree Classifier ◽

Tree Classifier ◽

Feature Selection Techniques

Heart disease is recognized as one of the leading causes of death worldwide. Biomedical instruments and various hospital systems hold massive quantities of clinical data. Therefore, understanding the data related to heart disease is very important for improving prediction accuracy. This article presents an experimental evaluation of the performance of models created using classification algorithms and relevant features selected by various feature selection approaches. For the exploratory analysis, ten feature selection techniques, i.e., ANOVA, Chi-square, mutual information, ReliefF, forward feature selection, backward feature selection, exhaustive feature selection, recursive feature elimination, Lasso regression, and Ridge regression, and six classification approaches, i.e., decision tree, random forest, support vector machine, K-nearest neighbor, logistic regression, and Gaussian naive Bayes, have been applied to the Cleveland heart disease dataset. The feature subset selected by the backward feature selection technique achieved the highest classification accuracy of 88.52%, precision of 91.30%, sensitivity of 80.76%, and f-measure of 85.71% with the decision tree classifier.
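
The four reported metrics all derive from a single binary confusion matrix, and a short sketch makes the relationships explicit. The abstract gives no confusion matrix, so the counts below are hypothetical: they were chosen only because they produce values close to the reported figures, and they should not be read as the study's actual test-set counts.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, sensitivity (recall), and F-measure."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    # F-measure is the harmonic mean of precision and sensitivity.
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, precision, sensitivity, f_measure


# Hypothetical counts (an assumption, not taken from the paper).
acc, prec, sens, f1 = binary_metrics(tp=21, fp=2, fn=5, tn=33)
print(f"accuracy={acc:.4f} precision={prec:.4f} "
      f"sensitivity={sens:.4f} f-measure={f1:.4f}")
```

Because precision depends only on predicted positives and sensitivity only on actual positives, a classifier can score well above 90% on one while falling near 80% on the other, as in the reported results.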


PERFORMANCE ANALYSIS OF BREAST CANCER CLASSIFICATION USING DECISION TREE CLASSIFIERS

International Journal of Current Pharmaceutical Research ◽

10.22159/ijcpr.2017v9i2.17383 ◽

2017 ◽

Vol 9 (2) ◽

pp. 19 ◽

Cited By ~ 6

Author(s):

P. Hamsagayathri ◽

P. Sampath

Keyword(s):

Breast Cancer ◽

Decision Tree ◽

Ductal Carcinoma ◽

Research Work ◽

The United States ◽

Breast Cancer Dataset ◽

Decision Tree Classifier ◽

Cancer Dataset ◽

Term Survival ◽

Tree Classifier

Breast cancer is one of the most dangerous cancers among women over 35 years of age worldwide. The breast is made up of lobules that secrete milk and thin milk ducts that carry milk from the lobules to the nipple. Breast cancer mostly occurs either in the lobules or in the milk ducts. The most common type of breast cancer is ductal carcinoma, which starts in the ducts and spreads across the lobules and surrounding tissues. According to the medical survey, about 125.0 new cases of breast cancer per 100,000 women are diagnosed each year in the United States, and 21.5 per 100,000 women die of the disease. Also, 246,660 new cases of cancer in women were estimated for the year 2016. Early diagnosis of breast cancer is a key factor in the long-term survival of cancer patients. Classification plays an important role in breast cancer detection and is used by researchers to analyse and classify medical data. In this research work, a priority-based decision tree classifier algorithm has been implemented for the Wisconsin breast cancer dataset. This paper analyzes different decision tree classifier algorithms on the Wisconsin original, diagnostic, and prognostic datasets using WEKA software. The performance of the classifiers is evaluated against parameters such as accuracy, Kappa statistic, entropy, RMSE, TP rate, FP rate, precision, recall, F-measure, ROC, specificity, and sensitivity.
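
Of the listed evaluation parameters, the Kappa statistic is the least self-explanatory: it measures agreement between predicted and true labels after discounting the agreement expected by chance. The sketch below computes Cohen's kappa from a confusion matrix. The matrix values are made up for illustration and are not results from the paper.

```python
def cohen_kappa(matrix):
    """Cohen's kappa for a square confusion matrix (rows: true, cols: predicted)."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / (total * total)
    return (observed - expected) / (1 - expected)


# Illustrative benign/malignant confusion matrix (hypothetical counts).
confusion = [
    [45, 5],
    [10, 40],
]
kappa = cohen_kappa(confusion)
print(round(kappa, 3))
```

Here accuracy is 0.85, but half of that agreement would be expected by chance given the marginals, so kappa is lower than accuracy.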


Ensemble Decision Tree Classifier For Breast Cancer Data

International Journal of Information Technology Convergence and Services ◽

10.5121/ijitcs.2012.2103 ◽

2012 ◽

Vol 2 (1) ◽

pp. 17-24 ◽

Cited By ~ 37

Author(s):

D Lavanya

Keyword(s):

Breast Cancer ◽

Decision Tree ◽

Breast Cancer Data ◽

Decision Tree Classifier ◽

Cancer Data ◽

Tree Classifier


A Comparative Study to Evaluate the Performance of Classification Algorithms in Mammogram Analysis

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i3.6.14960 ◽

2018 ◽

Vol 7 (3.6) ◽

pp. 154

Author(s):

S K. Sajan ◽

M Germanus Alex

Keyword(s):

Breast Cancer ◽

Neural Network ◽

Decision Tree ◽

Automated System ◽

Support Vector ◽

Classification Algorithms ◽

Neural Network Classifier ◽

Decision Tree Classifier ◽

Tree Classifier ◽

Mammogram Image

Breast cancer is a major threat to humans regardless of geographical boundaries. Awareness of breast cancer has increased during the last decade, and many preventive measures have been put into practice to detect breast cancer before symptoms are felt. Mammography is a screening methodology currently in practice. In this paper, the mammogram image is analyzed using an automated system. The automated system is designed to classify a mammogram image as normal or malignant. This process involves image enhancement and image segmentation at the preprocessing level. The histogram equalization technique is used to transform low-contrast regions of the mammogram into regions with higher contrast, and the Fuzzy C Means (FCM) algorithm is used to segment the mammogram image into regions suitable for further analysis. After enhancement and segmentation at the preprocessing level, classification is performed using three classification algorithms: a decision tree classifier, a Neural Network classifier, and a Support Vector Machine (SVM). The performance of the classification algorithms is evaluated on criteria such as speed, flexibility, robustness, scalability, interpretability, and time complexity, as well as on accuracy, sensitivity, and specificity. The classification results are compared with those of other classification algorithms. It is found that the neural network classifier produces better results than the other classifiers. The average diagnostic accuracy of the Neural Network classifier is around 91%. It is also found that the decision tree approach is more flexible and easier to use than the other approaches.
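
The histogram equalization step mentioned above can be sketched in a few lines. This is the textbook global form of the technique on a tiny made-up grayscale patch; the abstract does not say which variant or parameters the authors used, so the function name and the sample values are assumptions.

```python
def equalize_histogram(image, levels=256):
    """Global histogram equalization for a 2D list of integer gray levels."""
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in image]
    # Map each level so the occupied range spreads over 0..levels-1.
    lut = [max(0, round((c - cdf_min) / (n - cdf_min) * (levels - 1))) for c in cdf]
    return [[lut[p] for p in row] for row in image]


# Low-contrast 3x3 patch: all values sit between 100 and 130.
patch = [
    [100, 100, 110],
    [110, 120, 120],
    [120, 130, 130],
]
enhanced = equalize_histogram(patch)
print(enhanced)
```

After equalization, the narrow 100 to 130 band is stretched across the full 0 to 255 range, which is the contrast gain the abstract refers to.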


Swindling Shonky Anatomization of Credit Card Transactions using Machine Learning

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d7621.118419 ◽

2019 ◽

Vol 8 (4) ◽

pp. 1477-1483

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

Decision Tree ◽

Credit Card ◽

Naive Bayes ◽

Gradient Boosting ◽

Decision Tree Classifier ◽

Tree Classifier ◽

Feature Importance

With fast-moving technological advancement, internet usage has increased rapidly in all fields. Money transactions for applications such as online shopping, banking, bill settlement in any industry, online ticket booking for travel and hotels, fee payment for educational organizations, payment for hospital treatment, supermarket payments, and a variety of other applications use online credit card transactions. This leads to fraudulent use of other people's accounts and transactions, which results in loss of service and profit for the institution. Against this background, this paper focuses on predicting fraudulent credit card transactions. The Credit Card Transaction dataset from the Kaggle machine learning repository is used for the prediction analysis. The analysis of fraudulent credit card transactions is carried out in four steps. First, the relationships between the variables of the dataset are identified and represented graphically. Second, the feature importance of the dataset is identified using Random Forest, Ada boost, Logistic Regression, Decision Tree, Extra Tree, Gradient Boosting, and Naive Bayes classifiers. Third, the extracted feature importance of the credit card transaction dataset is fitted to the Random Forest, Ada boost, Logistic Regression, Decision Tree, Extra Tree, Gradient Boosting, and Naive Bayes classifiers. Fourth, performance is analyzed using metrics such as Accuracy, FScore, AUC Score, Precision, and Recall. The implementation is done in Python using the Spyder integrated development environment under Anaconda Navigator. Experimental results show that the Decision Tree classifier achieved effective prediction, with a precision of 1.0, recall of 1.0, FScore of 1.0, AUC Score of 89.09, and Accuracy of 99.92%.


Performance Evaluation of a Proposed Machine Learning Model for Chronic Disease Datasets Using an Integrated Attribute Evaluator and an Improved Decision Tree Classifier

Applied Sciences ◽

10.3390/app10228137 ◽

2020 ◽

Vol 10 (22) ◽

pp. 8137

Author(s):

Sushruta Mishra ◽

Pradeep Kumar Mallick ◽

Hrudaya Kumar Tripathy ◽

Akash Kumar Bhoi ◽

Alfonso González-Briones

Keyword(s):

Breast Cancer ◽

Feature Selection ◽

Heart Disease ◽

Chronic Disease ◽

Decision Tree ◽

Classification Performance ◽

Decision Tree Classifier ◽

Accuracy Rate ◽

Filter Methods ◽

Tree Classifier

There is a consistent rise in chronic diseases worldwide. These diseases decrease immunity and the quality of daily life. The treatment of these disorders is a challenging task for medical professionals. Dimensionality reduction techniques make it possible to handle big data samples, providing decision support in relation to chronic diseases. These datasets contain a series of symptoms that are used in disease prediction. The presence of redundant and irrelevant symptoms in the datasets should be identified and removed using feature selection techniques to improve classification accuracy. Therefore, the main contribution of this paper is a comparative analysis of the impact of wrapper and filter selection methods on classification performance. The filter methods that have been considered include the Correlation Feature Selection (CFS) method, the Information Gain (IG) method and the Chi-Square (CS) method. The wrapper methods that have been considered include the Best First Search (BFS) method, the Linear Forward Selection (LFS) method and the Greedy Step Wise Search (GSS) method. A Decision Tree algorithm has been used as a classifier for this analysis and is implemented through the WEKA tool. An attribute significance analysis has been performed on the diabetes, breast cancer and heart disease datasets used in the study. It was observed that the CFS method outperformed other filter methods concerning the accuracy rate and execution time. The accuracy rate using the CFS method on the datasets for heart disease, diabetes, breast cancer was 93.8%, 89.5% and 96.8% respectively. Moreover, latency delays of 1.08 s, 1.02 s and 1.01 s were noted using the same method for the respective datasets. Among wrapper methods, BFS’ performance was impressive in comparison to other methods. Maximum accuracy of 94.7%, 95.8% and 96.8% were achieved on the datasets for heart disease, diabetes and breast cancer respectively. 
Latency delays of 1.42 s, 1.44 s and 132 s were recorded using the same method for the respective datasets. On the basis of these results, a new hybrid Attribute Evaluator method has been proposed, which integrates enhanced K-Means clustering with the CFS filter method and the BFS wrapper method. Furthermore, the hybrid method was evaluated with an improved decision tree classifier that combines clustering with classification. It was validated on 14 different chronic disease datasets and its performance was recorded. Consistently high classification performance was observed. The mean values for accuracy, specificity, sensitivity and f-score were 96.7%, 96.5%, 95.6% and 96.2% respectively.
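
To make the filter methods concrete, the sketch below scores categorical attributes by Information Gain (IG), one of the three filter criteria named in the abstract. It is a generic textbook computation and not the WEKA implementation used in the paper. The attribute names and the four-row toy dataset are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    n = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder


# Toy data (made up): one informative symptom, one irrelevant symptom.
labels = ["sick", "sick", "well", "well"]
features = {
    "symptom_a": ["yes", "yes", "no", "no"],   # separates the classes perfectly
    "symptom_b": ["yes", "no", "yes", "no"],   # carries no class information
}
scores = {name: information_gain(col, labels) for name, col in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

A filter method of this kind ranks attributes without training a classifier, which is why filter methods tend to be cheaper than wrapper methods that repeatedly fit a model.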


DECISION TREE CLASSIFIERS FOR CLASSIFICATION OF BREAST CANCER

International Journal of Current Pharmaceutical Research ◽

10.22159/ijcpr.2017v9i1.17377 ◽

2017 ◽

Vol 9 (2) ◽

pp. 31 ◽

Cited By ~ 4

Author(s):

P. Hamsagayathri ◽

P. Sampath

Keyword(s):

Breast Cancer ◽

Decision Tree ◽

Ductal Carcinoma ◽

Kappa Statistic ◽

Breast Cancer Dataset ◽

Decision Tree Classifier ◽

Cancer Dataset ◽

Term Survival ◽

Time To Build ◽

Tree Classifier

Objective: Breast cancer is one of the most dangerous cancers among women over 35 years of age worldwide. The breast is made up of lobules that secrete milk and thin milk ducts that carry milk from the lobules to the nipple. Breast cancer mostly occurs either in the lobules or in the milk ducts. The most common type of breast cancer is ductal carcinoma, which starts in the ducts and spreads across the lobules and surrounding tissues. Survey: According to the medical survey, about 125.0 new cases of breast cancer per 100,000 women are diagnosed each year in the United States, and 21.5 per 100,000 women die of the disease. Also, 246,660 new cases of cancer in women were estimated for the year 2016. Methods: Early diagnosis of breast cancer is a key factor in the long-term survival of cancer patients. Classification is one of the vital techniques used by researchers to analyze and classify medical data. Results: This paper analyzes different decision tree classifier algorithms on the SEER breast cancer dataset using WEKA software. The performance of the classifiers is evaluated against parameters such as accuracy, Kappa statistic, entropy, RMSE, TP rate, FP rate, precision, recall, F-measure, ROC, specificity, and sensitivity. Conclusion: The simulation results show that the REPTree classifier classifies the data with 93.63% accuracy and a minimum RMSE of 0.1628. The REPTree algorithm takes less time to build the model, with ROC and PRC values of 0.929 and 0.959. By comparing classification results, we confirm that the REPTree algorithm is better than the other classification algorithms for the SEER dataset.


Holo entropy enabled decision tree classifier for breast cancer diagnosis using wisconsin (prognostic) data set

2017 7th International Conference on Communication Systems and Network Technologies (CSNT) ◽

10.1109/csnt.2017.8418532 ◽

2017 ◽

Cited By ~ 1

Author(s):

Shabina Sayed ◽

Shoeb Ahmed ◽

Rakesh Poonia

Keyword(s):

Breast Cancer ◽

Decision Tree ◽

Cancer Diagnosis ◽

Breast Cancer Diagnosis ◽

Decision Tree Classifier ◽

Data Set ◽

Tree Classifier ◽

Prognostic Data


Resnet Based Feature Extraction with Decision Tree Classifier for Classificaton of Mammogram Images

Turkish Journal of Computer and Mathematics Education (TURCOMAT) ◽

10.17762/turcomat.v12i2.1136 ◽

2021 ◽

Vol 12 (2) ◽

pp. 1147-1153

Author(s):

T. Sathya Priya et al.

Keyword(s):

Breast Cancer ◽

Feature Extraction ◽

Decision Tree ◽

Breast Cancer Diagnosis ◽

Classification Performance ◽

Decision Tree Classifier ◽

Tree Classifier ◽

In The Beginning ◽

Sensitivity Specificity ◽

Mammogram Images

Breast cancer is currently considered one of the most important health problems among women worldwide. Detecting breast cancer at an early stage can reduce the mortality rate considerably. Mammography is an effective and regularly used technique for the detection and screening of breast cancer. Advanced deep learning (DL) techniques are used by radiologists for accurate detection and classification in medical images. This paper develops a new deep segmentation with residual network (DS-RN) based breast cancer diagnosis model using mammogram images. The presented DS-RN model involves preprocessing, segmentation based on a Faster Region-based Convolutional Neural Network (Faster R-CNN) with the Inception v2 model, feature extraction, and classification. A decision tree (DT) classifier is used to classify the mammogram images. A detailed simulation is performed on the Mini-MIAS dataset to verify the improvement offered by the presented model. The experimental values show that the DS-RN model reaches its maximum classification performance with sensitivity, specificity, accuracy, and F-Measure of 98.15%, 100%, 98.86%, and 99.07% respectively.
