Machine learning models in breast cancer survival prediction

Mitra Montazeri; Mohadeseh Montazeri; Mahdieh Montazeri; Amin Beigzadeh

doi:10.3233/thc-151071

An Artificial Intelligence-Enabled Pipeline for Medical Domain: Malaysian Breast Cancer Survivorship Cohort as a Case Study

Diagnostics ◽

10.3390/diagnostics11081492 ◽

2021 ◽

Vol 11 (8) ◽

pp. 1492

Author(s):

Mogana Darshini Ganggayah ◽

Sarinder Kaur Dhillon ◽

Tania Islam ◽

Foad Kalhor ◽

Teh Chean Chiang ◽

...

Keyword(s):

Breast Cancer ◽

Quality Of Life ◽

Artificial Intelligence ◽

Machine Learning ◽

Cancer Survival ◽

Breast Cancer Survival ◽

Survival Prediction ◽

Automated Scoring

Automated artificial intelligence (AI) systems enable the integration of different types of data from various sources for clinical decision-making. This study proposes a pipeline for developing a fully automated, clinician-friendly, AI-enabled database platform for breast cancer survival prediction. A breast cancer survival cohort from the University Malaya Medical Centre served as the case study for developing and evaluating the pipeline. A relational database and a fully automated system were developed by integrating the database with analytical modules (machine learning, automated scoring for quality of life, and interactive visualization). The developed pipeline, iSurvive, has helped to enhance data management and to visualize important prognostic variables and survival rates. The embedded automated scoring module demonstrated the quality of life of patients, whereas the interactive visualizations could be used by clinicians to facilitate communication with patients. The proposed pipeline is a one-stop center for managing data, automating analytics with machine learning, automating scoring, and producing explainable interactive visuals that enhance clinician-patient communication during the survivorship period and help modify behaviours that relate to prognosis. The pipeline can be adapted to any disease and is not limited to breast cancer.

Machine learning prediction of breast cancer survival using age, sex, length of stay, mode of diagnosis and location of cancer

Health and Technology ◽

10.1007/s12553-021-00572-4 ◽

2021 ◽

Author(s):

Hilary I. Okagbue ◽

Patience I. Adamu ◽

Pelumi E. Oguntunde ◽

Emmanuela C. M. Obasi ◽

Oluwole A. Odetunmibi

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Length Of Stay ◽

Cancer Survival ◽

Breast Cancer Survival

Prediction of pathologic complete response to neoadjuvant chemotherapy using machine learning models in patients with breast cancer

Breast Cancer Research and Treatment ◽

10.1007/s10549-021-06310-8 ◽

2021 ◽

Author(s):

Ji-Yeon Kim ◽

Eunjoo Jeon ◽

Soonhwan Kwon ◽

Hyungsik Jung ◽

Sunghoon Joo ◽

...

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Neoadjuvant Chemotherapy ◽

Pathologic Complete Response ◽

Complete Response ◽

Learning Models ◽

Response To Neoadjuvant Chemotherapy ◽

Machine Learning Models

Breast cancer survival prediction using seven prognostic biomarker genes

Oncology Letters ◽

10.3892/ol.2019.10635 ◽

2019 ◽

Cited By ~ 1

Author(s):

Liu Liu ◽

Zhilin Chen ◽

Wenjie Shi ◽

Hui Liu ◽

Weiyi Pang

Keyword(s):

Breast Cancer ◽

Cancer Survival ◽

Prognostic Biomarker ◽

Breast Cancer Survival ◽

Survival Prediction

Comparative Study of Different Machine Learning Models for Breast Cancer Diagnosis

Innovations in Soft Computing and Information Technology ◽

10.1007/978-981-13-3185-5_3 ◽

2019 ◽

pp. 17-25 ◽

Cited By ~ 2

Author(s):

Aman Kumar ◽

M. Poonkodi

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Comparative Study ◽

Cancer Diagnosis ◽

Breast Cancer Diagnosis ◽

Learning Models ◽

Machine Learning Models

Efficient Breast Cancer Prediction Using Ensemble Machine Learning Models

2019 4th International Conference on Recent Trends on Electronics, Information, Communication & Technology (RTEICT) ◽

10.1109/rteict46194.2019.9016968 ◽

2019 ◽

Cited By ~ 1

Author(s):

Naveen ◽

R. K. Sharma ◽

Anil Ramachandran Nair

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Learning Models ◽

Cancer Prediction ◽

Ensemble Machine Learning ◽

Machine Learning Models

Potential Breast Cancer Drug Prediction using Machine Learning Models

2020 International Conference on Emerging Trends in Information Technology and Engineering (ic-ETITE) ◽

10.1109/ic-etite47903.2020.288 ◽

2020 ◽

Author(s):

N. Priya ◽

G. Shobana

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Cancer Drug ◽

Learning Models ◽

Machine Learning Models

Machine Learning Models That Integrate Tumor Texture and Perfusion Characteristics Using Low-Dose Breast Computed Tomography Are Promising for Predicting Histological Biomarkers and Treatment Failure in Breast Cancer Patients

Cancers ◽

10.3390/cancers13236013 ◽

2021 ◽

Vol 13 (23) ◽

pp. 6013

Author(s):

Hyun-Soo Park ◽

Kwang-sig Lee ◽

Bo-Kyoung Seo ◽

Eun-Sil Kim ◽

Kyu-Ran Cho ◽

...

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Random Forest ◽

Cancer Patients ◽

Low Dose ◽

Random Forest Model ◽

Breast Cancer Patients ◽

Learning Models ◽

Forest Model ◽

Machine Learning Models

This prospective study enrolled 147 women with invasive breast cancer who underwent low-dose breast CT (80 kVp, 25 mAs, 1.01–1.38 mSv) before treatment. From each tumor, we extracted eight perfusion parameters using the maximum slope algorithm and 36 texture parameters using the filtered histogram technique. Relationships between CT parameters and histological factors were analyzed using five machine learning algorithms. Performance was compared using the area under the receiver-operating characteristic curve (AUC) with the DeLong test. The AUCs of the machine learning models increased when using both features instead of the perfusion or texture features alone. The random forest model that integrated texture and perfusion features was the best model for prediction (AUC = 0.76). In the integrated random forest model, the AUCs for predicting human epidermal growth factor receptor 2 positivity, estrogen receptor positivity, progesterone receptor positivity, ki67 positivity, high tumor grade, and molecular subtype were 0.86, 0.76, 0.69, 0.65, 0.75, and 0.79, respectively. Entropy of pre- and postcontrast images and perfusion, time to peak, and peak enhancement intensity of hot spots are the five most important CT parameters for prediction. In conclusion, machine learning using texture and perfusion characteristics of breast cancer with low-dose CT has potential value for predicting prognostic factors and risk stratification in breast cancer patients.
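The AUC comparison described above can be illustrated with a minimal sketch. It assumes binary labels and a continuous model score, and it computes the AUC as the fraction of positive/negative pairs in which the positive case receives the higher score (ties count as half). The labels and scores below are made-up toy values, not data from the study, and the function name `roc_auc` is a hypothetical helper.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win
    (the Mann-Whitney formulation of the area under the ROC curve).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = biomarker positive, 0 = biomarker negative.
labels = [1, 1, 1, 0, 0, 0]
texture_only = [0.9, 0.4, 0.35, 0.5, 0.3, 0.2]  # made-up scores, one feature set
combined = [0.9, 0.6, 0.45, 0.5, 0.3, 0.2]      # made-up scores, both feature sets

auc_texture = roc_auc(labels, texture_only)
auc_combined = roc_auc(labels, combined)
print(f"texture only: {auc_texture:.3f}, combined: {auc_combined:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of this size. Testing whether two AUCs differ significantly, as the DeLong test does, is a separate step not shown here.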

Value of the Application of CE-MRI Radiomics and Machine Learning in Preoperative Prediction of Sentinel Lymph Node Metastasis in Breast Cancer

Frontiers in Oncology ◽

10.3389/fonc.2021.757111 ◽

2021 ◽

Vol 11 ◽

Author(s):

Yadi Zhu ◽

Ling Yang ◽

Hailin Shen

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Lymph Node ◽

Sentinel Lymph Node ◽

Breast Cancer Patients ◽

Learning Models ◽

Preoperative Prediction ◽

Machine Learning Model ◽

Validation Set ◽

Machine Learning Models

Purpose: To explore the value of a machine learning model based on CE-MRI radiomic features in the preoperative prediction of sentinel lymph node (SLN) metastasis in breast cancer. Methods: The clinical, pathological, and MRI data of 177 patients with pathologically confirmed breast cancer (81 SLN-positive and 96 SLN-negative) who underwent conventional DCE-MRI before surgery at the First Affiliated Hospital of Soochow University from January 2015 to May 2021 were analyzed retrospectively. The samples were randomly divided into a training set (n = 123) and a validation set (n = 54) at a ratio of 7:3. The radiomic features were derived from DCE-MRI phase 2 images, and 1,316 original eigenvectors were normalized by min-max normalization. The least absolute shrinkage and selection operator (LASSO) algorithm was used to obtain the optimal features. Five machine learning models (Support Vector Machine, Random Forest, Logistic Regression, Gradient Boosting Decision Tree, and Decision Tree) were constructed from the selected features. The radiomics signature and independent risk factors were incorporated to build a combined model. The receiver operating characteristic curve and the area under the curve (AUC) were used to evaluate the performance of these models, and accuracy, sensitivity, and specificity were calculated. Results: Clinical and histopathological variables did not differ significantly between breast cancer patients with and without SLN metastasis (P > 0.05), except for tumor size and BI-RADS classification (P < 0.01). Thirteen features were obtained as optimal features for machine learning model construction. In the validation set, the SVM achieved the highest AUC (0.86) among the five machine learning models. Meanwhile, the combined model showed better performance in sentinel lymph node metastasis (SLNM) prediction and achieved a higher AUC (0.88) in the validation set. Conclusions: We revealed the clinical value of machine learning models based on CE-MRI radiomic features, which provide a highly accurate, non-invasive, and convenient method for the preoperative prediction of SLNM in breast cancer patients.
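Three of the preprocessing and evaluation steps named in this abstract (min-max normalization, the 7:3 random split, and accuracy/sensitivity/specificity) can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' code: the function names, the random seed, and the toy labels are hypothetical, and the sketch assumes the training-set size is obtained by rounding down, which reproduces the reported 123/54 split for 177 patients.

```python
import random
from typing import List, Sequence, Tuple


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Rescale one feature column to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def split_indices(n: int, train_fraction: float, seed: int) -> Tuple[List[int], List[int]]:
    """Shuffle indices 0..n-1 and cut them into training and validation parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = int(n * train_fraction)  # rounds down, so 177 * 0.7 gives 123
    return indices[:n_train], indices[n_train:]


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Return (accuracy, sensitivity, specificity) for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity


# 177 patients split 7:3, matching the cohort size reported in the abstract.
train_idx, val_idx = split_indices(177, 0.7, seed=0)
print(len(train_idx), len(val_idx))

# Hypothetical toy predictions: 1 = SLN positive, 0 = SLN negative.
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(confusion_metrics(toy_true, toy_pred))
```

A common design choice, which the abstract does not specify, is to compute the normalization minimum and maximum on the training set only and reuse them for the validation set, so that no validation information leaks into model fitting.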

Simple Linear Cancer Risk Prediction Models with Novel Features Outperform Complex Approaches

10.1101/2021.01.11.21249290 ◽

2021 ◽

Author(s):

Scott Kulm ◽

Lior Kofman ◽

Jason Mezey ◽

Olivier Elemento

Keyword(s):

Machine Learning ◽

Cancer Survival ◽

Linear Models ◽

Prediction Models ◽

Learning Algorithm ◽

Learning Performance ◽

Health Study ◽

Learning Models ◽

The UK ◽

Machine Learning Models

A patient’s risk for cancer is usually estimated through simple linear models that sum the effect sizes of proven risk factors. In theory, more advanced machine learning models can be used for the same task. Using data from the UK Biobank, a large prospective health study, we developed linear and machine learning models to predict diagnoses of 12 different cancers within a 10-year time span. We find that the top machine learning algorithm, XGBoost (XGB), trained on 707 features, generated an average area under the receiver operating characteristic curve of 0.736 (range 0.65-0.85). Linear models trained with only 10 features were statistically indistinguishable from the machine learning models in performance. The linear models were significantly more accurate than the prominent QCancer models (p = 0.0019), which are trained on 45 million patient records and available to over 4,000 United Kingdom general practices. The increase in accuracy may be caused by the consideration of often-omitted feature types, including survey answers, census records, and genetic information. This approach led to the discovery of significant novel risk features, including self-reported happiness with own health (relevant to 12 cancers), measured testosterone (relevant to 8 cancers), and ICD codes for rehabilitation procedures (relevant to 3 cancers). These ten-feature models can be easily implemented within the clinic, allowing for personalized screening schedules that may increase cancer survival within a population.
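The idea of a linear model that sums effect sizes of risk factors can be made concrete with a small sketch. Everything specific here is an assumption for illustration: the feature names, the effect sizes, the intercept, and the use of a logistic link to turn the score into a probability are hypothetical and are not taken from the study.

```python
import math
from typing import Dict

# Hypothetical effect sizes (log odds ratio per unit of each feature).
EFFECT_SIZES: Dict[str, float] = {
    "age_decades": 0.30,
    "current_smoker": 0.70,
    "family_history": 0.50,
}
INTERCEPT = -5.0  # hypothetical baseline log-odds


def linear_risk_score(features: Dict[str, float]) -> float:
    """Sum each feature value times its effect size, plus the intercept."""
    return INTERCEPT + sum(EFFECT_SIZES[name] * value for name, value in features.items())


def risk_probability(features: Dict[str, float]) -> float:
    """Map the linear score to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-linear_risk_score(features)))


# Two hypothetical patients who differ only in smoking status.
smoker = {"age_decades": 6.0, "current_smoker": 1.0, "family_history": 0.0}
non_smoker = {"age_decades": 6.0, "current_smoker": 0.0, "family_history": 0.0}

print(f"smoker: {risk_probability(smoker):.3f}")
print(f"non-smoker: {risk_probability(non_smoker):.3f}")
```

Because the score is a plain weighted sum, each feature's contribution can be read off directly, which is one reason a model with a handful of features is easy to deploy in a clinic.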
