Handcrafted and Deep Learning-Based Radiomic Models Can Distinguish GBM from Brain Metastasis

2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Zhiyuan Liu ◽  
Zekun Jiang ◽  
Li Meng ◽  
Jun Yang ◽  
Ying Liu ◽  
...  

Objective. The purpose of this study was to investigate the feasibility of applying handcrafted radiomics (HCR) and deep learning-based radiomics (DLR) for the accurate preoperative classification of glioblastoma (GBM) and solitary brain metastasis (BM). Methods. A retrospective analysis of the magnetic resonance imaging (MRI) data of 140 patients (110 in the training dataset and 30 in the test dataset) with GBM and 128 patients (98 in the training dataset and 30 in the test dataset) with BM confirmed by surgical pathology was performed. The regions of interest (ROIs) on T1-weighted imaging (T1WI), T2-weighted imaging (T2WI), and contrast-enhanced T1WI (T1CE) were drawn manually, and then HCR and DLR analyses were performed. On this basis, different machine learning algorithms were implemented and compared to find the optimal modeling method. The final classifiers were identified and validated for different MRI modalities using HCR features and HCR + DLR features. By analyzing the receiver operating characteristic (ROC) curve, the area under the curve (AUC), accuracy, sensitivity, and specificity were calculated to evaluate the predictive efficacy of the different methods. Results. In multiclassifier modeling, random forest modeling showed the best distinguishing performance among all MRI modalities. HCR models already showed good results for distinguishing between the two types of brain tumors in the test dataset (T1WI, AUC = 0.86; T2WI, AUC = 0.76; T1CE, AUC = 0.93). By adding DLR features, all AUCs showed significant improvement (T1WI, AUC = 0.87; T2WI, AUC = 0.80; T1CE, AUC = 0.97; p < 0.05). The T1CE-based radiomic model showed the best classification performance (AUC = 0.99 in the training dataset and AUC = 0.97 in the test dataset), surpassing the other MRI modalities (p < 0.05). The multimodality radiomic model also showed robust performance (AUC = 1 in the training dataset and AUC = 0.84 in the test dataset). Conclusion.
Machine learning models using MRI radiomic features can help distinguish GBM from BM effectively, especially the combination of HCR and DLR features.
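As an illustrative sketch of the workflow this abstract describes — radiomic feature vectors fed to a random forest and scored by ROC AUC — the following uses synthetic data and scikit-learn stand-ins; it is not the authors' code, and the feature counts and sample sizes are invented for the example:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic stand-in for combined HCR + DLR feature vectors:
# 268 "patients", 50 features; class 1 = GBM, class 0 = BM.
X = rng.normal(size=(268, 50))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=268) > 0).astype(int)

# Hold out a stratified test set, mirroring the train/test split design.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=60, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
print(round(auc, 2))
```

The same scoring loop would be repeated per MRI modality (T1WI, T2WI, T1CE) to produce the per-modality AUCs reported above.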

10.2196/14993 ◽  
2019 ◽  
Vol 7 (4) ◽  
pp. e14993
Author(s):  
Hani Nabeel Mufti ◽  
Gregory Marshal Hirsch ◽  
Samina Raza Abidi ◽  
Syed Sibte Raza Abidi

Background Delirium is a temporary mental disorder that occasionally affects patients undergoing surgery, especially cardiac surgery. It is strongly associated with major adverse events, which in turn lead to increased costs and poor outcomes (eg, need for nursing home care due to cognitive impairment, stroke, and death). The ability to identify patients at risk of delirium in advance will guide the timely initiation of multimodal preventive interventions, which will aid in reducing the burden and negative consequences associated with delirium. Several studies have focused on the prediction of delirium. However, the number of studies in cardiac surgical patients that have used machine learning methods is very limited. Objective This study aimed to explore the application of several machine learning predictive models that can pre-emptively predict delirium in patients undergoing cardiac surgery and to compare their performance. Methods We investigated a number of machine learning methods to develop models that can predict delirium after cardiac surgery. A clinical dataset comprising over 5000 actual patients who underwent cardiac surgery in a single center was used to develop the models using logistic regression, artificial neural networks (ANN), support vector machines (SVM), Bayesian belief networks (BBN), naïve Bayes, random forest, and decision trees. Results Only 507 out of 5584 patients (11.4%) developed delirium. We addressed the underlying class imbalance using random undersampling in the training dataset. The final prediction performance was validated on a separate test dataset. Owing to the target class imbalance, several measures were used to evaluate each algorithm's performance for the delirium class on the test dataset. Of the selected algorithms, the SVM algorithm had the best F1 score for positive cases, kappa, and positive predictive value (40.2%, 29.3%, and 29.7%, respectively), with P=.01, .03, and .02, respectively. The ANN had the best area under the receiver operating characteristic curve (78.2%; P=.03). The BBN had the best area under the precision-recall curve for detecting positive cases (30.4%; P=.03). Conclusions Although delirium is inherently complex, preventive measures to mitigate its negative effects can be applied proactively if patients at risk are prospectively identified. Our results highlight 2 important points: (1) addressing class imbalance in the training dataset will improve a machine learning model's performance in identifying patients likely to develop postoperative delirium, and (2) because the prediction of postoperative delirium is difficult, being multifactorial and having complex pathophysiology, applying machine learning methods (complex or simple) may improve prediction by revealing hidden patterns, which will lead to cost reduction through the prevention of complications and will optimize patients' outcomes.
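The class-imbalance handling described here — random undersampling of the majority class, applied only to the training split, before fitting an SVM and scoring F1, kappa, and positive predictive value — can be sketched as follows. The data are synthetic (~11% positives, echoing the 11.4% delirium rate), and the scikit-learn estimators are generic stand-ins for the study's models:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import cohen_kappa_score, f1_score, precision_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
# Synthetic imbalanced cohort standing in for the clinical dataset.
X = rng.normal(size=(2000, 10))
y = (X[:, 0] + rng.normal(scale=2.0, size=2000) > 2.7).astype(int)  # ~11% positives

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1)

# Random undersampling: keep all positives, draw an equal number of
# negatives — applied to the TRAINING split only, never the test set.
pos = np.flatnonzero(y_tr == 1)
neg = rng.choice(np.flatnonzero(y_tr == 0), size=len(pos), replace=False)
idx = np.concatenate([pos, neg])

clf = SVC(kernel="rbf").fit(X_tr[idx], y_tr[idx])
pred = clf.predict(X_te)

# Imbalance-aware measures for the positive (delirium) class.
f1 = f1_score(y_te, pred)
kappa = cohen_kappa_score(y_te, pred)
ppv = precision_score(y_te, pred, zero_division=0)
print(round(f1, 3), round(kappa, 3), round(ppv, 3))
```

Evaluating on the untouched, imbalanced test split is what keeps the reported metrics honest about real-world prevalence.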


2019 ◽  
Vol 143 (8) ◽  
pp. 990-998 ◽  
Author(s):  
Min Yu ◽  
Lindsay A. L. Bazydlo ◽  
David E. Bruns ◽  
James H. Harrison

Context.— Turnaround time and productivity of clinical mass spectrometric (MS) testing are hampered by time-consuming manual review of the analytical quality of MS data before release of patient results. Objective.— To determine whether a classification model created by using standard machine learning algorithms can verify analytically acceptable MS results and thereby reduce manual review requirements. Design.— We obtained retrospective data from gas chromatography–MS analyses of 11-nor-9-carboxy-delta-9-tetrahydrocannabinol (THC-COOH) in 1267 urine samples. The data for each sample had been labeled previously as either analytically unacceptable or acceptable by manual review. The dataset was randomly split into training and test sets (848 and 419 samples, respectively), maintaining equal proportions of acceptable (90%) and unacceptable (10%) results in each set. We used stratified 10-fold cross-validation in assessing the abilities of 6 supervised machine learning algorithms to distinguish unacceptable from acceptable assay results in the training dataset. The classifier with the highest recall was used to build a final model, and its performance was evaluated against the test dataset. Results.— In comparison testing of the 6 classifiers, a model based on the Support Vector Machines algorithm yielded the highest recall and acceptable precision. After optimization, this model correctly identified all unacceptable results in the test dataset (100% recall) with a precision of 81%. Conclusions.— Automated data review identified all analytically unacceptable assays in the test dataset, while reducing the manual review requirement by about 87%. This automation strategy can focus manual review only on assays likely to be problematic, allowing improved throughput and turnaround time without reducing quality.
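A minimal sketch of the model-selection step described above — stratified 10-fold cross-validation scored by recall, with the highest-recall classifier chosen for the final model — using synthetic assay data (~10% unacceptable, as in the study) and generic scikit-learn estimators rather than the authors' actual features or tuning:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(2)
# Synthetic stand-in: 848 "assays", 8 features; class 1 = unacceptable (~10%).
X = rng.normal(size=(848, 8))
y = (X[:, 0] + 0.8 * X[:, 1] + rng.normal(scale=0.8, size=848) > 1.8).astype(int)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=2)
models = {
    "svm": SVC(class_weight="balanced"),
    "logreg": LogisticRegression(class_weight="balanced", max_iter=1000),
    "rf": RandomForestClassifier(n_estimators=100, random_state=2),
}
# Recall on the unacceptable class is the selection criterion: a missed
# bad assay (false negative) is far costlier than an extra manual review.
recalls = {name: cross_val_score(m, X, y, cv=cv, scoring="recall").mean()
           for name, m in models.items()}
best = max(recalls, key=recalls.get)
print(best, round(recalls[best], 3))
```

Optimizing for recall first and accepting moderate precision matches the study's goal: route every suspect assay to a human while auto-releasing the clearly acceptable majority.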


Cancers ◽  
2021 ◽  
Vol 13 (20) ◽  
pp. 5140
Author(s):  
Gun Oh Chong ◽  
Shin-Hyung Park ◽  
Nora Jee-Young Park ◽  
Bong Kyung Bae ◽  
Yoon Hee Lee ◽  
...  

Background: Our previous study demonstrated that tumor budding (TB) status was associated with inferior overall survival in cervical cancer. The purpose of this study was to evaluate whether radiomic features can predict TB status in cervical cancer patients. Methods: Seventy-four patients with cervical cancer who underwent preoperative MRI and radical hysterectomy from 2011 to 2015 at our institution were enrolled. The patients were randomly allocated to the training dataset (n = 48) and test dataset (n = 26). Tumors were segmented on axial gadolinium-enhanced T1- and T2-weighted images. A total of 2074 radiomic features were extracted. Four machine learning classifiers, including logistic regression (LR), random forest (RF), support vector machine (SVM), and neural network (NN), were used. The trained models were validated on the test dataset. Results: Twenty radiomic features were selected; all were features from filtered images, and 85% were texture-related features. The area under the curve values and accuracies of the models by LR, RF, SVM, and NN were 0.742 and 0.769, 0.782 and 0.731, 0.849 and 0.885, and 0.891 and 0.731, respectively, in the test dataset. Conclusion: MRI-based radiomic features could predict TB status in patients with cervical cancer.
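The pipeline shape here — reduce thousands of radiomic features to a small selected subset, then fit a classifier and report test-set AUC — can be sketched with a univariate filter and logistic regression (one of the study's four classifiers). Feature counts and data are synthetic placeholders; the study's own selection procedure is not specified in the abstract, so SelectKBest is purely an illustrative assumption:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(3)
# Synthetic stand-in: 74 "patients", 200 candidate features (the study had 2074).
X = rng.normal(size=(74, 200))
y = (X[:, 0] - X[:, 1] + rng.normal(size=74) > 0).astype(int)

# 48/26 train/test split, as in the study design.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, train_size=48, stratify=y, random_state=3)

# Select 20 features (the number retained in the study), then classify.
# Fitting the selector inside the pipeline keeps the test set unseen.
model = make_pipeline(SelectKBest(f_classif, k=20),
                      LogisticRegression(max_iter=1000))
model.fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(round(auc, 3))
```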


2019 ◽  
pp. 016555151987764
Author(s):  
Ping Wang ◽  
Xiaodan Li ◽  
Renli Wu

Wikipedia is becoming increasingly critical in helping people obtain information and knowledge. Its leading advantage is that users can not only access information but also modify it. However, this presents a challenging issue: how can we measure the quality of a Wikipedia article? The existing approaches assess Wikipedia quality by statistical models or traditional machine learning algorithms. However, their performance is not satisfactory. Moreover, most existing models fail to extract complete information from articles, which degrades the model’s performance. In this article, we first survey related works and summarise a comprehensive feature framework. Then, state-of-the-art deep learning models are introduced and applied to assess Wikipedia quality. Finally, a comparison among deep learning models and traditional machine learning models is conducted to validate the effectiveness of the proposed model. The models are compared extensively in terms of their training and classification performance. Moreover, the importance of each feature and the importance of different feature sets are analysed separately.


2021 ◽  
Vol 12 ◽  
Author(s):  
Fabao Xu ◽  
Cheng Wan ◽  
Lanqin Zhao ◽  
Qijing You ◽  
Yifan Xiang ◽  
...  

Purpose: To predict central serous chorioretinopathy (CSC) recurrence 3 and 6 months after laser treatment by using machine learning. Methods: Clinical and imaging features of 461 patients (480 eyes) with CSC were collected at Zhongshan Ophthalmic Center (ZOC) and Xiamen Eye Center (XEC). The ZOC data (416 eyes of 401 patients) were used as the training dataset and the internal test dataset, while the XEC data (64 eyes of 60 patients) were used as the external test dataset. Six different machine learning algorithms and an ensemble model were trained to predict recurrence in patients with CSC. After completing the initial detailed investigation, we designed a simplified model using only clinical data and OCT features. Results: The ensemble model exhibited the best performance among the six algorithms, with accuracies of 0.941 (internal test dataset) and 0.970 (external test dataset) at 3 months and 0.903 (internal test dataset) and 1.000 (external test dataset) at 6 months. The simplified model showed a comparable level of predictive power. Conclusion: Machine learning achieves high accuracy in predicting recurrence in patients with CSC. The application of an intelligent recurrence prediction model for patients with CSC can potentially facilitate identification of recurrence factors and precise individualized interventions.
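An ensemble over several base learners, as described here, is commonly built by probability-averaged (soft) voting. The sketch below uses three generic scikit-learn estimators on synthetic data; the study's six algorithms, its ensemble construction, and the clinical/OCT features are not reproduced:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(4)
# Synthetic stand-in for clinical + OCT feature vectors (480 "eyes").
X = rng.normal(size=(480, 12))
y = (X[:, 0] + X[:, 1] + rng.normal(size=480) > 0).astype(int)

# Hold out 64 samples, echoing the 64-eye external test set.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=64, stratify=y, random_state=4)

# Soft voting averages the base learners' predicted probabilities.
ens = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("rf", RandomForestClassifier(n_estimators=100, random_state=4)),
                ("dt", DecisionTreeClassifier(max_depth=4, random_state=4))],
    voting="soft",
).fit(X_tr, y_tr)

acc = accuracy_score(y_te, ens.predict(X_te))
print(round(acc, 3))
```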


Author(s):  
Markus Auf der Mauer ◽  
Eilin Jopp-van Well ◽  
Jochen Herrmann ◽  
Michael Groth ◽  
Michael M. Morlock ◽  
...  

Age estimation is a crucial element of forensic medicine, used to assess the chronological age of living individuals who lack valid legal documentation. Methods used in practice are labor-intensive, subjective, and frequently involve radiation exposure. Recently, non-invasive methods using magnetic resonance imaging (MRI) have also been evaluated, confirming a correlation between growth plate ossification in long bones and the chronological age of young subjects. However, automated and user-independent approaches are required to perform reliable assessments on large datasets. The aim of this study was to develop a fully automated, computer-based method for age estimation based on 3D knee MRIs using machine learning. The proposed solution consists of three parts: image preprocessing, bone segmentation, and age estimation. A total of 185 coronal and 404 sagittal MR volumes from Caucasian male subjects in the age range of 13 to 21 years were available. The best result of the fivefold cross-validation was a mean absolute error of 0.67 ± 0.49 years in age regression and an accuracy of 90.9%, a sensitivity of 88.6%, and a specificity of 94.2% in classification (18-year age limit) using a combination of convolutional neural networks and tree-based machine learning algorithms. The potential of deep learning for age estimation is reflected in these results and could be further improved by training on even larger and more diverse datasets.
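The evaluation scheme described here — fivefold cross-validated age regression, with the 18-year classification derived by thresholding the predicted age — can be sketched as below. The CNN feature extractor is replaced by random synthetic descriptors, and gradient boosting stands in for the unspecified tree-based learner; all numbers are illustrative:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold, cross_val_predict

rng = np.random.default_rng(5)
# Synthetic stand-in: per-subject descriptors (as a CNN might extract from
# knee MRI) and chronological ages confined to the 13-21 year range.
X = rng.normal(size=(400, 16))
age = np.clip(17 + 2.0 * X[:, 0] + rng.normal(scale=0.5, size=400), 13, 21)

# Fivefold cross-validated regression, then thresholding at the 18-year limit.
pred = cross_val_predict(GradientBoostingRegressor(random_state=5), X, age,
                         cv=KFold(5, shuffle=True, random_state=5))
mae = np.abs(pred - age).mean()
acc = ((pred >= 18) == (age >= 18)).mean()
print(round(mae, 2), round(acc, 3))
```

Deriving the binary decision from the regression output, rather than training a separate classifier, keeps the two reported metrics consistent with a single model.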


2021 ◽  
Vol 43 ◽  
pp. e55189
Author(s):  
Hatice Catal Reis

Medicine and engineering sciences have long worked in close contact for common purposes. Machine learning algorithms are used in the medical field for early diagnosis prediction. The major aim of this study is to evaluate machine learning and deep learning algorithms using computed tomography scan (CT-scan) images for automated detection of coronavirus disease 2019 (COVID-19) patients. We obtained 757 CT-scan images from a public platform. We applied four automated classification methods, spanning traditional machine learning and deep learning, to predict COVID-19: SVM, AdaBoost, NASNetMobile, and InceptionV3. Comparative analyses are presented among the four models by considering metric performance factors to find the best model. The results show that the InceptionV3 model achieves better performance in terms of accuracy, precision, recall, Cohen's kappa, F1 score, root mean squared error (RMSE), and receiver operating characteristic area under the curve (ROC-AUC) than the other COVID-19 classifiers. Accordingly, the InceptionV3 approach is recommended for the automatic diagnosis and assessment of COVID-19. This research can offer a second point of view to medical experts, and it can save researchers time by evaluating the performance of standard machine learning methods in detecting COVID-19.
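The metric battery used to compare the four classifiers can be computed with scikit-learn once each model has produced labels and scores. The tiny label/score arrays below are invented for illustration (one deliberate false positive at index 7), not the study's results:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, cohen_kappa_score, f1_score,
                             mean_squared_error, precision_score, recall_score,
                             roc_auc_score)

# Hypothetical ground truth and model scores for ten CT scans.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])
scores = np.array([0.9, 0.2, 0.8, 0.6, 0.4, 0.1, 0.7, 0.6, 0.55, 0.45])
y_pred = (scores >= 0.5).astype(int)

# The comparison metrics named in the abstract, computed in one pass.
metrics = {
    "accuracy": accuracy_score(y_true, y_pred),
    "precision": precision_score(y_true, y_pred),
    "recall": recall_score(y_true, y_pred),
    "kappa": cohen_kappa_score(y_true, y_pred),
    "f1": f1_score(y_true, y_pred),
    "rmse": mean_squared_error(y_true, y_pred) ** 0.5,
    "roc_auc": roc_auc_score(y_true, scores),  # uses scores, not hard labels
}
print(metrics)
```

For this toy example the single false positive yields accuracy 0.9 and kappa 0.8; running the same dictionary for each of the four models gives the comparison table the study reports.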


2020 ◽  
Author(s):  
Joseph Prinable ◽  
Peter Jones ◽  
David Boland ◽  
Alistair McEwan ◽  
Cindy Thamrin

BACKGROUND The ability to continuously monitor breathing metrics may have implications for general health as well as respiratory conditions such as asthma. However, few studies have focused on breathing due to a lack of available wearable technologies. OBJECTIVE To examine the performance of two machine learning algorithms in extracting breathing metrics from a finger-based pulse oximeter, which is amenable to long-term monitoring. METHODS Pulse oximetry data were collected from 11 healthy and 11 asthma subjects who breathed at a range of controlled respiratory rates. U-Net and long short-term memory (LSTM) algorithms were applied to the data, and the results were compared against breathing metrics derived from respiratory inductance plethysmography measured simultaneously as a reference. RESULTS Both models provided breathing metrics that were strongly correlated with those from the reference signal (all p<0.001, except for the inspiratory:expiratory ratio). The following relative mean biases (95% confidence intervals) were observed for U-Net vs LSTM: inspiration time 1.89 (-52.95, 56.74)% vs 1.30 (-52.15, 54.74)%, expiration time -3.70 (-55.21, 47.80)% vs -4.97 (-56.84, 46.89)%, inspiratory:expiratory ratio -4.65 (-87.18, 77.88)% vs -5.30 (-87.07, 76.47)%, inter-breath interval -2.39 (-32.76, 27.97)% vs -3.16 (-33.69, 27.36)%, and respiratory rate 2.99 (-27.04, 33.02)% vs 3.69 (-27.17, 34.56)%. CONCLUSIONS Both machine learning models showed strong correlation and good agreement with the reference for deriving breathing metrics in the asthma and healthy cohorts, with low bias but wide variability. Future efforts should focus on improving the performance of these models, e.g. by increasing the size of the training dataset at the lower breathing rates. CLINICALTRIAL Sydney Local Health District Human Research Ethics Committee (#LNR\16\HAWKE99 ethics approval).
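The "relative mean bias (95% confidence interval)" figures above are Bland-Altman-style agreement statistics: the mean of the percentage errors against the reference, bracketed by ±1.96 standard deviations. A small numpy sketch, with hypothetical inspiration times in place of the study's measurements:

```python
import numpy as np

def relative_bias_loa(estimate, reference):
    """Relative mean bias and 95% limits of agreement, as a percentage
    of the reference value (Bland-Altman style)."""
    est, ref = np.asarray(estimate, float), np.asarray(reference, float)
    rel_err = 100.0 * (est - ref) / ref
    bias = rel_err.mean()
    half_width = 1.96 * rel_err.std(ddof=1)  # sample SD
    return bias, bias - half_width, bias + half_width

# Hypothetical inspiration times (s): model output vs. the
# respiratory-inductance-plethysmography reference.
ref = np.array([1.8, 2.0, 2.2, 1.9, 2.1, 2.4])
est = np.array([1.9, 1.9, 2.3, 2.0, 2.0, 2.5])
bias, lo, hi = relative_bias_loa(est, ref)
print(round(bias, 1), round(lo, 1), round(hi, 1))
```

A bias near zero with wide limits, as here, is exactly the "low bias though wide variability" pattern the study reports.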


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Mu Sook Lee ◽  
Yong Soo Kim ◽  
Minki Kim ◽  
Muhammad Usman ◽  
Shi Sub Byon ◽  
...  

We examined the feasibility of explainable computer-aided detection of cardiomegaly in routine clinical practice using segmentation-based methods. Overall, 793 retrospectively acquired posterior–anterior (PA) chest X-ray images (CXRs) of 793 patients were used to train deep learning (DL) models for lung and heart segmentation. The training dataset included PA CXRs from two public datasets and in-house PA CXRs. Two fully automated segmentation-based methods using state-of-the-art DL models for lung and heart segmentation were developed. The diagnostic performance was assessed and the reliability of the automatic cardiothoracic ratio (CTR) calculation was determined using the mean absolute error and paired t-test. The effects of thoracic pathological conditions on performance were assessed using subgroup analysis. One thousand PA CXRs of 1000 patients (480 men, 520 women; mean age 63 ± 23 years) were included. The CTR values derived from the DL models and diagnostic performance exhibited excellent agreement with reference standards for the whole test dataset. Performance of segmentation-based methods differed based on thoracic conditions. When tested using CXRs with lesions obscuring heart borders, the performance was lower than that for other thoracic pathological findings. Thus, segmentation-based methods using DL could detect cardiomegaly; however, the feasibility of computer-aided detection of cardiomegaly without human intervention was limited.
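Once heart and lung masks are available, the cardiothoracic ratio reduces to comparing maximal transverse widths. A simplified numpy sketch on toy binary masks (a real implementation would measure the internal thoracic width at the costal margins, and the masks would come from the DL segmenter):

```python
import numpy as np

def cardiothoracic_ratio(heart_mask, lung_mask):
    """CTR = maximal transverse cardiac width / maximal thoracic width,
    both measured in pixel columns of binary segmentation masks."""
    def max_width(mask):
        cols = np.flatnonzero(mask.any(axis=0))  # columns containing the organ
        return cols[-1] - cols[0] + 1 if cols.size else 0
    return max_width(heart_mask) / max_width(lung_mask)

# Toy 10x10 masks standing in for segmentation output (not a real CXR).
heart = np.zeros((10, 10), dtype=bool)
heart[4:8, 3:8] = True    # cardiac silhouette, width 5
lungs = np.zeros((10, 10), dtype=bool)
lungs[1:9, 1:10] = True   # thoracic cavity, width 9

ctr = cardiothoracic_ratio(heart, lungs)
print(round(ctr, 3))  # → 0.556; a CTR above 0.5 conventionally suggests cardiomegaly
```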

