Remote Diagnosis and Triaging Model for Skin Cancer Using EfficientNet and Extreme Gradient Boosting

Complexity ◽

10.1155/2021/5591614 ◽

2021 ◽

Vol 2021 ◽

pp. 1-13

Author(s):

Irfan Ullah Khan ◽

Nida Aslam ◽

Talha Anwar ◽

Sumayh S. Aljameel ◽

Mohib Ullah ◽

...

Keyword(s):

Skin Cancer ◽

Skin Lesion ◽

Clinical Data ◽

Cancer Diagnosis ◽

Gradient Boosting ◽

Automated Diagnosis ◽

Data Set ◽

Diagnosis System ◽

Extreme Gradient Boosting ◽

The Impact

Owing to the successful application of machine learning techniques in several fields, the use of automated diagnosis systems in healthcare has been increasing rapidly. The aim of the study is to propose an automated skin cancer diagnosis and triaging model, to explore the impact of integrating clinical features into the diagnosis, and to improve on the results reported in the literature. We used an ensemble-learning framework consisting of the EfficientNetB3 deep learning model for skin lesion analysis and Extreme Gradient Boosting (XGB) for clinical data. The study used the PAD-UFES-20 data set, which consists of six unbalanced categories of skin cancer. To overcome the data imbalance, we used data augmentation. Experiments were conducted using skin lesion images alone and using the combination of skin lesion images and clinical data. We found that integrating clinical data with skin lesion images enhances automated diagnosis accuracy. Moreover, the proposed model outperformed the results achieved by the previous study on the PAD-UFES-20 data set, with an accuracy of 0.78, precision of 0.89, recall of 0.86, and F1 of 0.88. In conclusion, the study provides an improved automated diagnosis system to aid healthcare professionals and patients in skin cancer diagnosis and remote triaging.
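The abstract does not say how the image model and the clinical-data model are combined. One common option, shown here only as a sketch under that assumption, is to concatenate the image model's class probabilities with the clinical fields and pass the fused vector to a tabular classifier such as XGB. The function name, the clinical field names, and all values below are hypothetical.

```python
from typing import Dict, List, Sequence


def fuse_features(image_probs: Sequence[float],
                  clinical: Dict[str, float],
                  clinical_keys: Sequence[str]) -> List[float]:
    """Concatenate image-model class probabilities with clinical features.

    The order of clinical_keys fixes the column order, so the same order
    must be used at training time and at prediction time.
    """
    if abs(sum(image_probs) - 1.0) > 1e-6:
        raise ValueError("image_probs should sum to 1")
    return list(image_probs) + [float(clinical[k]) for k in clinical_keys]


# Hypothetical example: six class probabilities from an image model plus
# three made-up clinical fields (placeholders, not the data set's columns).
image_probs = [0.05, 0.60, 0.10, 0.10, 0.10, 0.05]
clinical = {"age": 63, "lesion_diameter_mm": 8.5, "bleeding": 1}
fused = fuse_features(image_probs, clinical,
                      ["age", "lesion_diameter_mm", "bleeding"])
print(fused)
```

The fused vector has one column per image class plus one per clinical field; a gradient-boosted tree model can then be trained on rows of this form.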


Predictability of Mortality in Patients With Myocardial Injury After Noncardiac Surgery Based on Perioperative Factors via Machine Learning: Retrospective Study

JMIR Medical Informatics ◽

10.2196/32771 ◽

2021 ◽

Vol 9 (10) ◽

pp. e32771

Author(s):

Seo Jeong Shin ◽

Jungchan Park ◽

Seung-Hwa Lee ◽

Kwangmo Yang ◽

Rae Woong Park

Keyword(s):

Machine Learning ◽

Clinical Data ◽

Myocardial Injury ◽

Learning Algorithms ◽

Noncardiac Surgery ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Data Set ◽

Extreme Gradient Boosting ◽

The Impact

Background: Myocardial injury after noncardiac surgery (MINS) is associated with increased postoperative mortality, but the relevant perioperative factors that contribute to the mortality of patients with MINS have not been fully evaluated. Objective: To establish a comprehensive body of knowledge relating to patients with MINS, we sought the best-performing predictive model based on machine learning algorithms. Methods: Using clinical data from 7629 patients with MINS from the clinical data warehouse, we evaluated 8 machine learning algorithms for accuracy, precision, recall, F1 score, area under the receiver operating characteristic (AUROC) curve, and area under the precision-recall curve to identify the best model for predicting mortality. Feature importance and Shapley Additive Explanations values were analyzed to explain the role of each clinical factor in patients with MINS. Results: Extreme gradient boosting outperformed the other models. The model showed an AUROC of 0.923 (95% CI 0.916-0.930). The AUROC of the model did not decrease in the test data set (0.894, 95% CI 0.86-0.922; P=.06). Antiplatelet drug prescription, elevated C-reactive protein level, and beta blocker prescription were associated with reduced 30-day mortality. Conclusions: Predicting the mortality of patients with MINS was shown to be feasible using machine learning. By analyzing the impact of predictors, markers that should be cautiously monitored by clinicians may be identified.
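As an illustration of the AUROC metric named in this abstract, the following minimal sketch computes it as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half. The labels and scores are made up for illustration and are not the study's data.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = died within 30 days, 0 = survived.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

This pairwise form is quadratic in the number of samples; it is meant to show the definition, not to replace a library implementation on large data.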


Predictability of Mortality in Patients with Myocardial Injury after Noncardiac Surgery Based on Perioperative factors via Machine Learning (Preprint)

10.2196/preprints.32771 ◽

2021 ◽

Author(s):

Seo Jeong Shin ◽

Jungchan Park ◽

Seung-Hwa Lee ◽

Kwangmo Yang ◽

Rae Woong Park

Keyword(s):

Machine Learning ◽

Clinical Data ◽

Myocardial Injury ◽

Learning Algorithms ◽

Postoperative Mortality ◽

Noncardiac Surgery ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Extreme Gradient Boosting ◽

The Impact

BACKGROUND: Myocardial injury after noncardiac surgery (MINS) is associated with increased postoperative mortality, but the relevant perioperative factors that contribute to the mortality of patients with MINS have not been fully evaluated. OBJECTIVE: To establish a comprehensive body of knowledge relating to patients with MINS, we sought the best-performing predictive model based on machine learning algorithms. METHODS: Using clinical data for 7,629 patients with MINS from the Clinical Data Warehouse, we evaluated eight machine learning algorithms for accuracy, precision, recall, F1 score, area under the receiver operating characteristic (AUROC) curve, and area under the precision-recall curve to identify the best model for predicting mortality. Feature importance and SHapley Additive exPlanations values were analyzed to explain the role of each clinical factor in patients with MINS. RESULTS: Extreme gradient boosting outperformed the other models. The model showed an AUROC of 0.923 (95% confidence interval (CI): 0.916–0.930). The AUROC of the model did not decrease in the test dataset (0.894, 95% CI: 0.86–0.922; P=.06). Antiplatelet drug prescription, elevated C-reactive protein level, and beta blocker prescription were associated with reduced 30-day mortality. CONCLUSIONS: Predicting the mortality of patients with MINS was shown to be feasible using machine learning. By analyzing the impact of predictors, markers that should be cautiously monitored by clinicians may be identified.


Explainable machine learning can outperform Cox regression predictions and provide insights in breast cancer survival

Scientific Reports ◽

10.1038/s41598-021-86327-7 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Arturo Moncada-Torres ◽

Marissa C. van Maaren ◽

Mathijs P. Hendriks ◽

Sabine Siesling ◽

Gijs Geleijnse

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Explicit Knowledge ◽

Cox Regression ◽

Metastatic Breast ◽

Gradient Boosting ◽

Support Vector ◽

Netherlands Cancer Registry ◽

Extreme Gradient Boosting ◽

The Impact

Cox Proportional Hazards (CPH) analysis is the standard for survival analysis in oncology. Recently, several machine learning (ML) techniques have been adapted for this task. Although they have been shown to yield results at least as good as classical methods, they are often disregarded because of their lack of transparency and little to no explainability, which are key for their adoption in clinical settings. In this paper, we used data from the Netherlands Cancer Registry of 36,658 non-metastatic breast cancer patients to compare the performance of CPH with ML techniques (Random Survival Forests, Survival Support Vector Machines, and Extreme Gradient Boosting [XGB]) in predicting survival using the c-index. We demonstrated that in our dataset, ML-based models can perform at least as well as the classical CPH regression (c-index ∼0.63), and in the case of XGB even better (c-index ∼0.73). Furthermore, we used Shapley Additive Explanation (SHAP) values to explain the models’ predictions. We concluded that the difference in performance can be attributed to XGB’s ability to model nonlinearities and complex interactions. We also investigated the impact of specific features on the models’ predictions as well as their corresponding insights. Lastly, we showed that explainable ML can generate explicit knowledge of how models make their predictions, which is crucial in increasing the trust and adoption of innovative ML techniques in oncology and healthcare overall.
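The c-index used for the comparison above can be sketched as follows. This is a minimal version of Harrell's concordance index, assuming right-censored data and a model that outputs a risk score where higher means shorter expected survival. The data are invented for illustration.

```python
from typing import Sequence


def concordance_index(times: Sequence[float],
                      events: Sequence[int],
                      risks: Sequence[float]) -> float:
    """Fraction of comparable pairs ordered correctly by predicted risk.

    A pair (i, j) is comparable when subject i had an observed event
    (events[i] == 1) strictly before subject j's recorded time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # earlier event has higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied risk counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Invented data: survival time in months, event flag (0 = censored), risk.
times = [5, 10, 15]
events = [1, 1, 0]
print(concordance_index(times, events, [0.9, 0.5, 0.1]))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives a scale for reading the reported values of roughly 0.63 and 0.73.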


Exploring the Mechanism of Crashes with Autonomous Vehicles Using Machine Learning

Mathematical Problems in Engineering ◽

10.1155/2021/5524356 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Hengrui Chen ◽

Hong Chen ◽

Ruiyu Zhou ◽

Zhizhen Liu ◽

Xiaoke Sun

Keyword(s):

Machine Learning ◽

Autonomous Vehicles ◽

Classification And Regression Tree ◽

Gradient Boosting ◽

Support Vector ◽

Crash Severity ◽

Apriori Algorithm ◽

Driving Mode ◽

Extreme Gradient Boosting ◽

The Impact

Safety has become a critical obstacle that cannot be ignored in the marketization of autonomous vehicles (AVs). The objective of this study is to explore the mechanism of AV-involved crashes and analyze the impact of each feature on crash severity. We use the Apriori algorithm to explore the causal relationships among multiple factors and thereby the mechanism of crashes. We use several machine learning models, including support vector machine (SVM), classification and regression tree (CART), and eXtreme Gradient Boosting (XGBoost), to analyze crash severity. In addition, we apply Shapley Additive Explanations (SHAP) to interpret the importance of each factor. The results indicate that XGBoost obtains the best result (recall = 75%; G-mean = 67.82%). Both XGBoost and the Apriori algorithm provided meaningful insights into AV-involved crash characteristics and their relationships. Among all the features, vehicle damage, weather conditions, accident location, and driving mode are the most critical. We found that most rear-end crashes involve conventional vehicles bumping into the rear of AVs. Drivers should be extremely cautious when driving in fog, snow, and insufficient light. Drivers should also be careful when driving near intersections, especially in the autonomous driving mode.
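For readers unfamiliar with the G-mean reported above, the following sketch shows its usual binary form: the geometric mean of sensitivity (recall on the positive class) and specificity (recall on the negative class). The abstract does not state which variant the authors used, so the binary form is an assumption, and the counts are invented.

```python
import math


def g_mean(tp: int, fn: int, tn: int, fp: int) -> float:
    """Geometric mean of sensitivity and specificity (binary case)."""
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    return math.sqrt(sensitivity * specificity)


# Invented confusion-matrix counts for illustration only.
print(round(g_mean(tp=75, fn=25, tn=60, fp=40), 4))
```

Because it multiplies the two recalls, the G-mean drops sharply when a model ignores either class, which is why it is often reported for imbalanced severity data.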


Good, Bad, and Ugly: Partner Support and Quality of Life Among Couples Facing Skin Cancer

Innovation in Aging ◽

10.1093/geroni/igaa057.1390 ◽

2020 ◽

Vol 4 (Supplement_1) ◽

pp. 430-430

Author(s):

Laura Butner-Kozimor ◽

Jyoti Savla

Keyword(s):

Quality Of Life ◽

Skin Cancer ◽

Cancer Diagnosis ◽

Negative Relationship ◽

Perceived Support ◽

Older Couples ◽

Path Analyses ◽

The Impact ◽

Negative Support

When older adults in partnered relationships face a skin cancer diagnosis of one partner, couples may rely on one another for support. Previous studies have found that perceived support can influence one’s adjustment to the stressors associated with the skin cancer diagnosis, as well as influence the overall quality of life. Using dyadic data from 30 older couples (mean age = 70; SD = 7.25), this study examined positive and negative relationship-focused support strategies each partner provided and their effects on the dyad’s quality of life. Dyadic path analyses simultaneously examined the impact of support received by one’s partner and its association with their own quality of life (actor effects) and their partner’s quality of life (partner effects). Positive support received by either partner, in the form of active engagement, was not associated with quality of life. In contrast, negative support in the form of protective buffering received from supporting partners was associated with poorer quality of life for themselves (β = -.37, p = .05) as well as for partners with skin cancer (β = -.43, p = .01). Similarly, overprotection, also a negative support strategy, by supporting partners was associated with poorer quality of life for partners with skin cancer (β = -.63, p < .001). Findings illustrate that not all types of support are beneficial for the overall couple relationship and couple outcomes. Implications for practice and interventions for older couples facing a cancer diagnosis will be discussed.


Proposed Threshold Algorithm for Accurate Segmentation for Skin Lesion

International Journal of Biomedical and Clinical Engineering ◽

10.4018/ijbce.2015070104 ◽

2015 ◽

Vol 4 (2) ◽

pp. 40-47

Author(s):

T. Y. Satheesha ◽

D. Sathyanarayana ◽

M. N. Giri Prasad

Keyword(s):

Skin Cancer ◽

Skin Lesion ◽

Skin Lesions ◽

Large Set ◽

Threshold Values ◽

Diagnostic Systems ◽

Automated Diagnosis ◽

Algorithm Efficiency ◽

Threshold Algorithms ◽

Threshold Technique

Automated diagnosis of skin cancer depends on effective segmentation of the skin lesion. This is a highly challenging task because of intensity variations in skin lesion images. The authors present a histogram-analysis-based fuzzy C-means threshold technique to overcome these drawbacks. The technique not only reduces computational complexity but also combines the advantages of soft and hard threshold algorithms. It simplifies the calculation of threshold values even in the presence of abrupt intensity variations, and it segments skin lesions more efficiently. Experimental verification was carried out on a large set of skin lesion images containing every possible artifact, which contribute strongly to reversed segmentation outputs. The efficiency of the algorithm was measured by comparison with other prominent threshold methods. The approach performed reasonably well and can be implemented in expert skin cancer diagnostic systems.
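The abstract gives no algorithmic detail, so the following is only a generic sketch of how fuzzy C-means clustering can yield a segmentation threshold, not the authors' method. It clusters 1-D pixel intensities into two fuzzy clusters and takes the midpoint of the two cluster centers as the threshold. The function name, the fuzzifier m = 2, and the sample intensities are assumptions.

```python
from typing import List, Sequence, Tuple


def fcm_threshold(pixels: Sequence[float], m: float = 2.0,
                  iters: int = 50) -> Tuple[float, List[float]]:
    """Two-cluster fuzzy C-means on intensities; returns (threshold, centers)."""
    lo, hi = min(pixels), max(pixels)
    if lo == hi:
        raise ValueError("need at least two distinct intensities")
    centers = [float(lo), float(hi)]          # deterministic initialization
    exponent = 2.0 / (m - 1.0)
    for _ in range(iters):
        num = [0.0, 0.0]
        den = [0.0, 0.0]
        for x in pixels:
            d = [abs(x - c) for c in centers]
            if 0.0 in d:
                # Pixel sits exactly on a center: full membership there.
                u = [1.0 if di == 0.0 else 0.0 for di in d]
            else:
                # Standard FCM membership update.
                u = [1.0 / sum((d[i] / dj) ** exponent for dj in d)
                     for i in range(2)]
            for i in range(2):
                w = u[i] ** m
                num[i] += w * x
                den[i] += w
        centers = [num[i] / den[i] for i in range(2)]
    centers.sort()
    return (centers[0] + centers[1]) / 2.0, centers


# Invented intensities: a dark lesion region and brighter surrounding skin.
pixels = [10, 12, 11, 200, 205, 198]
threshold, centers = fcm_threshold(pixels)
mask = [p < threshold for p in pixels]       # True = darker (lesion) side
print(threshold, centers, mask)
```

In practice the clustering would run on the image histogram rather than on raw pixels, which keeps the cost independent of image size.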


Exploiting Rules to Enhance Machine Learning in Extracting Information From Multi-Institutional Prostate Pathology Reports

JCO Clinical Cancer Informatics ◽

10.1200/cci.20.00028 ◽

2020 ◽

pp. 865-874

Author(s):

Enrico Santus ◽

Tal Schuster ◽

Amir M. Tahmasebi ◽

Clara Li ◽

Adam Yala ◽

...

Keyword(s):

Machine Learning ◽

Hybrid Systems ◽

High Performance ◽

Feature Model ◽

Training Data ◽

Gradient Boosting ◽

Support Vector ◽

Data Set ◽

Extreme Gradient Boosting ◽

Pathology Reports

PURPOSE: Literature on clinical note mining has highlighted the superiority of machine learning (ML) over hand-crafted rules. Nevertheless, most studies assume the availability of large training sets, which is rarely the case. For this reason, in the clinical setting, rules are still common. We suggest 2 methods to leverage the knowledge encoded in pre-existing rules to inform ML decisions and obtain high performance, even with scarce annotations. METHODS: We collected 501 prostate pathology reports from 6 American hospitals. Reports were split into 2,711 core segments, annotated with 20 attributes describing the histology, grade, extension, and location of tumors. The data set was split by institutions to generate a cross-institutional evaluation setting. We assessed 4 systems, namely a rule-based approach, an ML model, and 2 hybrid systems integrating the previous methods: a Rule as Feature model and a Classifier Confidence model. Several ML algorithms were tested, including logistic regression (LR), support vector machine (SVM), and eXtreme gradient boosting (XGB). RESULTS: When training on data from a single institution, LR lags behind the rules by 3.5% (F1 score: 92.2% v 95.7%). Hybrid models, instead, obtain competitive results, with Classifier Confidence outperforming the rules by +0.5% (96.2%). When a larger amount of data from multiple institutions is used, LR improves by +1.5% over the rules (97.2%), whereas hybrid systems obtain +2.2% for Rule as Feature (97.7%) and +2.6% for Classifier Confidence (98.3%). Replacing LR with SVM or XGB yielded similar performance gains. CONCLUSION: We developed methods to use pre-existing handcrafted rules to inform ML algorithms. These hybrid systems obtain better performance than either rules or ML models alone, even when training data are limited.
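The abstract names a "Rule as Feature" hybrid without describing its construction. A plausible reading, sketched here as an assumption rather than the authors' implementation, is that the output of a hand-written rule is appended to the text features that the ML classifier receives. The regular expression, vocabulary, and report snippets below are hypothetical.

```python
import re
from typing import List

# Hypothetical hand-written rule: detect a Gleason pattern such as "3+4".
GLEASON_RULE = re.compile(r"\b([1-5])\s*\+\s*([1-5])\b")


def rule_output(text: str) -> int:
    """Return 1 when the rule fires on the segment, else 0."""
    return 1 if GLEASON_RULE.search(text) else 0


def bag_of_words(text: str, vocab: List[str]) -> List[int]:
    """Count how often each vocabulary word occurs in the segment."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [tokens.count(word) for word in vocab]


def rule_as_feature(text: str, vocab: List[str]) -> List[int]:
    """Text features with the rule's decision appended as one extra column."""
    return bag_of_words(text, vocab) + [rule_output(text)]


vocab = ["gleason", "adenocarcinoma", "benign"]
print(rule_as_feature("Prostatic adenocarcinoma, Gleason 3+4", vocab))
print(rule_as_feature("Benign prostatic tissue", vocab))
```

With this layout the classifier can learn how much to trust the rule, which may help when few annotated reports are available.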


Cost-Sensitive Extreme Gradient Boosting for Imbalanced Classification of Breast Cancer Diagnosis

2020 10th IEEE International Conference on Control System, Computing and Engineering (ICCSCE) ◽

10.1109/iccsce50387.2020.9204948 ◽

2020 ◽

Author(s):

Manop Phankokkruad

Keyword(s):

Breast Cancer ◽

Cancer Diagnosis ◽

Breast Cancer Diagnosis ◽

Gradient Boosting ◽

Imbalanced Classification ◽

Extreme Gradient Boosting


Noise Reduction Power Stealing Detection Model Based on Self-Balanced Data Set

Energies ◽

10.3390/en13071763 ◽

2020 ◽

Vol 13 (7) ◽

pp. 1763 ◽

Cited By ~ 1

Author(s):

Haiqing Liu ◽

Zhiqiao Li ◽

Yuancheng Li

Keyword(s):

Power Consumption ◽

Noise Reduction ◽

Random Noise ◽

Imbalanced Data ◽

Gradient Boosting ◽

Data Set ◽

Detection Model ◽

Adversarial Network ◽

Consumption Data ◽

The Impact

In recent years, power theft incidents of various types have occurred frequently, and the training of power-stealing detection models is susceptible to imbalanced data sets and data noise, which leads to detection errors. Therefore, a power-stealing detection model is proposed that is based on an Improved Conditional Generation Adversarial Network (CWGAN), a Stacked Convolution Noise Reduction Autoencoder (SCDAE), and the Lightweight Gradient Boosting Decision Machine (LightGBM). The model performs generative adversarial operations on the original unbalanced power consumption data to balance the electricity data and to avoid interference from the imbalanced data set during classifier training. In addition, convolution is used to stack the noise reduction autoencoder, which reduces the dimension of the power consumption data, extracts data features, and reduces the impact of random noise. Finally, LightGBM is used for power theft detection. The experiments show that CWGAN can effectively balance the distribution of power consumption data. A comparison of detection indicators with those of various advanced power-stealing models on the same data set shows that the proposed model is superior to the other models in detecting power stealing.


Efficiency of Extreme Gradient Boosting for Imbalanced Land Cover Classification Using an Extended Margin and Disagreement Performance

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi8070315 ◽

2019 ◽

Vol 8 (7) ◽

pp. 315 ◽

Cited By ~ 1

Author(s):

Fei Sun ◽

Run Wang ◽

Bo Wan ◽

Yanjun Su ◽

Qinghua Guo ◽

...

Keyword(s):

Land Cover ◽

Error Component ◽

Training Data ◽

Gradient Boosting ◽

Correct Classification ◽

Imbalanced Learning ◽

Minority Class ◽

Extreme Gradient Boosting ◽

Spectral Separability ◽

The Impact

Imbalanced learning is a methodological challenge in the remote sensing community, especially in complex areas where land covers are spectrally similar. Obtaining high-confidence classification results under class imbalance is highly important in practice. In this paper, extreme gradient boosting (XGB), a novel tree-based ensemble system, is employed to classify the land cover types in very-high-resolution (VHR) images with imbalanced training data. We introduce an extended margin criterion and disagreement performance to evaluate the efficiency of XGB in imbalanced learning situations and examine the effect of minority class spectral separability on model performance. The results suggest that the uncertainty of XGB associated with correct classification is stable. The average probability-based margin of correct classification provided by XGB is 0.82, which is about 46.30% higher than that of the random forest (RF) method (0.56). Moreover, the performance uncertainty of XGB is insensitive to spectral separability once the sample imbalance reaches a certain level (minority:majority > 10:100). The impact of sample imbalance on the minority class is also related to its spectral separability, and XGB performs better than RF in terms of user accuracy for the minority class with imperfect separability. The disagreement components of XGB are better and more stable than those of RF with imbalanced samples, especially for complex areas with more types. In addition, appropriate sample imbalance helps to improve the trade-off between the recognition accuracy of XGB and the sample cost. According to our analysis, this margin-based uncertainty assessment and disagreement performance can help users identify the confidence level and error component in similar classification performance (overall, producer, and user accuracies).
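The abstract does not define its extended margin criterion. As a rough illustration only, one common probability-based margin is the predicted probability of the true class minus the largest probability assigned to any other class, averaged here over correctly classified samples. The function names and probabilities are invented.

```python
from typing import List, Sequence


def probability_margin(probs: Sequence[float], true_class: int) -> float:
    """Probability of the true class minus the best competing class."""
    p_true = probs[true_class]
    p_other = max(p for k, p in enumerate(probs) if k != true_class)
    return p_true - p_other


def mean_margin_of_correct(prob_rows: Sequence[Sequence[float]],
                           true_classes: Sequence[int]) -> float:
    """Average margin over samples whose top-probability class is correct."""
    margins: List[float] = []
    for probs, y in zip(prob_rows, true_classes):
        predicted = max(range(len(probs)), key=lambda k: probs[k])
        if predicted == y:
            margins.append(probability_margin(probs, y))
    if not margins:
        raise ValueError("no correctly classified samples")
    return sum(margins) / len(margins)


# Invented class probabilities for three pixels and three land cover types.
rows = [[0.7, 0.2, 0.1], [0.3, 0.6, 0.1], [0.4, 0.5, 0.1]]
labels = [0, 1, 0]
print(mean_margin_of_correct(rows, labels))
```

Under this definition a margin near 1 means the classifier is confidently correct, while a margin near 0 means the correct class only narrowly won.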
