Optimizing Outcome Prediction in Diffuse Large B-Cell Lymphoma by Use of Machine Learning and Nationwide Lymphoma Registries: A Nordic Lymphoma Group Study

2018
pp. 1-13
Author(s):  
Jorne L. Biccler ◽  
Sandra Eloranta ◽  
Peter de Nully Brown ◽  
Henrik Frederiksen ◽  
Mats Jerkeman ◽  
...  

Purpose Prognostic models for diffuse large B-cell lymphoma (DLBCL), such as the International Prognostic Index (IPI), are widely used in clinical practice. These models are typically developed with simplicity in mind and thus do not exploit the full potential of detailed clinical data. This study investigated whether nationwide lymphoma registries containing detailed clinicopathologic data, combined with machine learning techniques, could be useful for building modern prognostic tools.

Patients and Methods This study was based on the nationwide lymphoma registries of Denmark and Sweden, which include large amounts of clinicopathologic data. Using the Danish DLBCL cohort, a stacking approach was used to build a new prognostic model that leverages the strengths of different survival models. To compare the performance of the stacking approach with established prognostic models, cross-validation was used to estimate the concordance index (C-index), time-varying area under the curve, and integrated Brier score. Finally, generalizability was tested by applying the new model to the Swedish cohort.

Results In total, 2,759 and 2,414 patients were included from the Danish and Swedish cohorts, respectively. In the Danish cohort, the stacking approach led to the lowest integrated Brier score, indicating that the survival curves obtained from the stacking model best fitted the observed survival. The C-index and time-varying area under the curve indicated that the stacked model (C-index: Denmark [DK], 0.756; Sweden [SE], 0.744) had good discriminative capabilities compared with the other considered prognostic models (IPI: DK, 0.662; SE, 0.661; National Comprehensive Cancer Network–IPI: DK, 0.681; SE, 0.681). Furthermore, these results were reproducible in the independent Swedish cohort.

Conclusion A new prognostic model based on machine learning techniques was developed and shown to significantly outperform established prognostic indices for DLBCL.
The model is available at https://lymphomapredictor.org .
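The C-index used above measures how well a model's predicted risks order patients by survival time: 1.0 is perfect concordance, 0.5 is chance. As a rough plain-Python illustration (not the registry pipeline itself), Harrell's C-index with simplified tie handling can be sketched as:

```python
from itertools import combinations

def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (simplified ties).

    A pair is comparable when the subject with the shorter observed time
    experienced the event; the pair is concordant when that subject also
    has the higher predicted risk.
    """
    concordant, comparable = 0.0, 0
    for i, j in combinations(range(len(times)), 2):
        # order the pair so that subject a has the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue  # tied times, or shorter time censored: not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied risk counts as half a concordant pair
    return concordant / comparable

# toy example: higher risk score paired with shorter survival time
print(concordance_index([2, 5, 7, 10], [1, 1, 0, 1], [0.9, 0.6, 0.5, 0.2]))  # -> 1.0
```

Censored subjects (event indicator 0) drop out of pairs where their time is the shorter one, which is why censoring-heavy cohorts need many patients for a stable estimate.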

2020
Vol 20 (1)
Author(s):  
Georgios Kantidakis ◽  
Hein Putter ◽  
Carlo Lancia ◽  
Jacob de Boer ◽  
Andries E. Braat ◽  
...  

Abstract Background Predicting the survival of recipients after liver transplantation is regarded as one of the most important challenges in contemporary medicine; hence, improving on current prediction models is of great interest. There is currently a strong debate in the medical field about machine learning (ML) and whether it has greater potential than traditional regression models when dealing with complex data. Criticism of ML relates to unsuitable performance measures and a lack of interpretability, which is important for clinicians.

Methods In this paper, ML techniques such as random forests and neural networks are applied to a large data set of 62,294 patients from the United States, with 97 predictors selected on clinical/statistical grounds from more than 600, to predict survival after transplantation. The identification of potential risk factors is also of particular interest. A comparison is performed between 3 different Cox models (with all variables, backward selection, and LASSO) and 3 machine learning techniques: a random survival forest and 2 partial logistic artificial neural networks (PLANNs). For the PLANNs, novel extensions to their original specification are tested. Emphasis is placed on the advantages and pitfalls of each method and on the interpretability of the ML techniques.

Results Well-established predictive measures from the survival field are employed (C-index, Brier score, and integrated Brier score), and the strongest prognostic factors are identified for each model. The clinical endpoint is overall graft survival, defined as the time between transplantation and the date of graft failure or death. The random survival forest shows slightly better predictive performance than the Cox models based on the C-index. The neural networks show better performance than both the Cox models and the random survival forest based on the integrated Brier score at 10 years.

Conclusion This work shows that machine learning techniques can be a useful tool for both prediction and interpretation in the survival context. Of the ML techniques examined here, the PLANN with 1 hidden layer predicts survival probabilities the most accurately, being as well calibrated as the Cox model with all variables.

Trial registration Retrospective data were provided by the Scientific Registry of Transplant Recipients under Data Use Agreement number 9477 for analysis of risk factors after liver transplantation.
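The Brier score and its time-integrated version used to compare the Cox models, random survival forest, and PLANNs can be sketched as follows. This is an illustrative simplification that skips the inverse-probability-of-censoring weighting (IPCW) a full evaluation would use, and `surv_fn` is a hypothetical stand-in for a fitted model's survival predictions:

```python
def brier_score(times, events, surv_prob, t):
    """Brier score at horizon t (IPCW censoring weights omitted for brevity).
    surv_prob[i] is the model's predicted P(T_i > t)."""
    total, n = 0.0, 0
    for T, d, s in zip(times, events, surv_prob):
        if T > t:                  # known to be alive at t: target is 1
            total += (1.0 - s) ** 2
            n += 1
        elif d:                    # event observed before t: target is 0
            total += (0.0 - s) ** 2
            n += 1
        # censored before t: skipped here; IPCW would reweight instead
    return total / n

def integrated_brier_score(times, events, surv_fn, grid):
    """Trapezoidal integral of the Brier score over a time grid, normalised
    by the grid span. surv_fn(t) returns the model's predicted survival
    probabilities for all patients at time t."""
    scores = [brier_score(times, events, surv_fn(t), t) for t in grid]
    area = 0.0
    for (t0, s0), (t1, s1) in zip(zip(grid, scores), zip(grid[1:], scores[1:])):
        area += 0.5 * (s0 + s1) * (t1 - t0)
    return area / (grid[-1] - grid[0])
```

Lower is better for both quantities; a model that always predicts 0.5 scores 0.25, which is why a well-calibrated survival model should sit well below that.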


2020
Vol 29 (4)
pp. e70-e80
Author(s):  
Mireia Ladios-Martin ◽  
José Fernández-de-Maya ◽  
Francisco-Javier Ballesta-López ◽  
Adrián Belso-Garzas ◽  
Manuel Mas-Asencio ◽  
...  

Background Pressure injuries are an important problem in hospital care. Detecting the population at risk for pressure injuries is the first step in any preventive strategy. Available tools such as the Norton and Braden scales do not take into account all of the relevant risk factors. Data mining and machine learning techniques have the potential to overcome this limitation.

Objectives To build a model to detect pressure injury risk in intensive care unit patients and to put the model into production in a real environment.

Methods The sample comprised adult patients admitted to an intensive care unit (N = 6694) at University Hospital of Torrevieja and University Hospital of Vinalopó. A retrospective design was used to train (n = 2508) and test (n = 1769) the model, and then a prospective design was used to test the model in a real environment (n = 2417). Data mining was used to extract variables from electronic medical records, and a predictive model was built with machine learning techniques. The sensitivity, specificity, area under the curve, and accuracy of the model were evaluated.

Results The final model used logistic regression and incorporated 23 variables. The model had sensitivity of 0.90, specificity of 0.74, and area under the curve of 0.89 during the initial test, and thus it outperformed the Norton scale. The model performed well 1 year later in a real environment.

Conclusions The model effectively predicts risk of pressure injury. This allows nurses to focus on patients at high risk for pressure injury without increasing workload.
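The sensitivity and specificity figures reported for the pressure-injury model come directly from confusion-matrix counts. A minimal sketch with made-up labels (not the hospital data):

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy for binary labels,
    computed directly from the confusion-matrix counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),      # true-positive rate: injuries caught
        "specificity": tn / (tn + fp),      # true-negative rate: low-risk correctly cleared
        "accuracy": (tp + tn) / len(y_true),
    }

# toy example: 3 at-risk patients (label 1), 5 not at risk (label 0)
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
print(classification_metrics(y_true, y_pred))
```

For a screening tool like this one, sensitivity is usually the figure to protect: a missed at-risk patient (false negative) is costlier than an extra preventive intervention (false positive).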


2021
Vol 3
Author(s):  
Ahmed Al-Hindawi ◽  
Ahmed Abdulaal ◽  
Timothy M. Rawson ◽  
Saleh A. Alqahtani ◽  
Nabeela Mughal ◽  
...  

The SARS-CoV-2 virus, which caused the COVID-19 pandemic, has had an unprecedented impact on healthcare, requiring multidisciplinary innovation and novel thinking to minimize impact and improve outcomes. Wide-ranging disciplines have collaborated, including diverse clinicians (radiology, microbiology, and critical care) who are working increasingly closely with data science. This has been leveraged through the democratization of data science, with the increasing availability of easy-to-access open datasets, tutorials, programming languages, and hardware, which makes it significantly easier to create mathematical models. To address the COVID-19 pandemic, such data science has enabled modeling of the impact of the virus on the population and on individuals for diagnostic, prognostic, and epidemiological ends. This has led to two large systematic reviews on the topic, which highlighted the two different ways in which this feat has been attempted: one using classical statistics and the other using more novel machine learning techniques. In this review, we debate the relative strengths and weaknesses of each method toward the specific task of predicting COVID-19 outcomes.


2021
Author(s):  
Viviane Costa Silva ◽  
Mateus Silva Rocha ◽  
Glaucia Amorim Faria ◽  
Silvio Fernando Alves Xavier Junior ◽  
Tiago Almeida de Oliveira ◽  
...  

Abstract The agriculture sector creates and collects large amounts of data, which can be gathered, stored, and analyzed to assist decision making and generate competitive value; machine learning techniques have proved very effective in this market. In this work, a machine learning study was carried out using supervised classification models based on boosting to predict disease in a crop, identifying the model with the best area-under-the-curve metric. Light Gradient Boosting Machine (LightGBM), CatBoost, Extreme Gradient Boosting, Gradient Boosting Classifier, and AdaBoost models were used to classify the crop as healthy or sick. The LightGBM algorithm provided the best fit to the data, with an area under the curve of 0.76 when combined with BORUTA variable selection.
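The area-under-the-curve metric used to rank the boosting models can be computed without plotting a ROC curve, via its rank-statistic interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. A small sketch on toy scores (not the crop data):

```python
def auc(y_true, scores):
    """ROC AUC via the Mann-Whitney U statistic: the fraction of
    positive/negative pairs in which the positive case outranks
    the negative one (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# toy check: two "sick" (1) and two "healthy" (0) samples, partially separated
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2]))  # -> 0.75
```

On this reading, the reported 0.76 means the tuned LightGBM model ranks a diseased sample above a healthy one about three times out of four.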


2021
Vol 103-B (12)
pp. 1754-1758
Author(s):  
Luke Farrow ◽  
Mingjun Zhong ◽  
George Patrick Ashcroft ◽  
Lesley Anderson ◽  
R. M. Dominic Meek

Artificial intelligence and machine-learning techniques are increasingly popular for building diagnostic and prognostic models across Trauma & Orthopaedic surgery. However, correct interpretation of these models is difficult for those without specific knowledge of computing or health data science methodology, and the lack of current reporting standards leaves room for significant heterogeneity in the design and quality of published studies. We provide an overview of machine-learning techniques for the lay individual, including key terminology and best-practice reporting guidelines. Cite this article: Bone Joint J 2021;103-B(12):1754–1758.


2021
Vol 5 (2)
pp. 447-455
Author(s):  
Aminat Yusuf ◽  
Oyelola Akande

Despite the popularity and utility of most machine learning techniques, expert knowledge is required to guide the choice of technique and of settings suitable for a specific problem, and a lack of such expertise leaves the procedures vulnerable to poor parameter settings. Many machine learning techniques are offered under default configurations; however, because different classification problems require different techniques, selecting the appropriate technique and tuning its settings are vital steps that can markedly improve the reliability and accuracy of predictions. This study performs grid-search parameter tuning of five selected machine learning techniques on hepatitis disease data and compares their performance side by side with the default settings. The experimental results for the five tuned techniques show that the configurations suggested in our work yield predictions of considerably higher quality than those obtained under the default settings. Tuning the parameters of the support vector machine via grid search yields the best accuracy, 90%, along with competitive performance in terms of precision, recall, accuracy, and area under the curve. We present combinations of parameter settings for each technique by identifying ranges of values for each setting that give good hepatitis disease outcomes.
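Grid search of the kind described above simply scores every combination in the parameter grid and keeps the best. A minimal sketch follows; the grid values and the stand-in scoring function are illustrative, not the study's actual SVM evaluation (which would plug in cross-validated accuracy):

```python
from itertools import product

def grid_search(param_grid, evaluate):
    """Exhaustive grid search: score every combination of settings and
    return the best one. `evaluate` maps a settings dict to a score
    (e.g. cross-validated accuracy); higher is better."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# hypothetical SVM-style grid; the lambda is a toy objective peaking
# at C=1, gamma=0.1, standing in for a real cross-validation loop
grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1]}
best, score = grid_search(grid, lambda p: -abs(p["C"] - 1) - abs(p["gamma"] - 0.1))
print(best)  # -> {'C': 1, 'gamma': 0.1}
```

The cost is combinatorial (here 3 × 2 = 6 evaluations), which is why grids are usually kept to a handful of values per setting, exactly as the ranges reported in this study suggest.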

