Machine‐learning algorithms for predicting hospital re‐admissions in sickle cell disease

2020 ◽  
Vol 192 (1) ◽  
pp. 158-170
Author(s):  
Arisha Patel ◽  
Kyra Gan ◽  
Andrew A. Li ◽  
Jeremy Weiss ◽  
Mehdi Nouraie ◽  
...  
2019 ◽  
Author(s):  
Akram Mohammed ◽  
Pradeep S. B. Podila ◽  
Robert L. Davis ◽  
Kenneth I. Ataga ◽  
Jane S. Hankins ◽  
...  

Background: Sickle cell disease (SCD) is a genetic disorder of the red blood cells, resulting in multiple acute and chronic complications including pain episodes, stroke, and kidney disease. Patients with SCD develop chronic organ dysfunction, which may progress to organ failure during disease exacerbations. Early detection of acute physiological deterioration leading to organ failure is not always attainable. Machine learning techniques that allow for prediction of organ failure may enable earlier identification and treatment, and potentially reduce mortality. We tested the hypothesis that machine learning physiomarkers could predict the development of organ dysfunction in an adult sample of patients with SCD admitted to intensive care units. Methods and Findings: We studied 63 sequential SCD patients with 163 patient encounters, mean age 33.0±11.0 years, admitted to intensive care units, some of whom (6.7%) had pre-existing cardiovascular or kidney disease. A subset of these patient encounters (37; 23%) met sequential organ failure assessment (SOFA) criteria. The sites of organ failure included the central nervous system (32), cardiovascular (11), renal (10), liver (7), respiratory (5), and coagulation (2) systems. Most (81.5%) of the patient encounters with organ failure involved single organ failure. The other 126 SCD patient encounters served as controls. A set of signal processing features (such as fast Fourier transform, energy, and continuous wavelet transform) derived from heart rate, blood pressure, and respiratory rate was identified to distinguish patients with SCD who developed acute physiological deterioration leading to organ failure from SCD patients who did not meet the criteria.
A random forest model accurately predicted organ failure up to six hours prior to onset, with a five-fold cross-validation accuracy of 94.57% (average sensitivity and specificity of 90.24% and 98.9%, respectively). Conclusions: This study demonstrates the viability of using machine learning to predict acute physiological deterioration heralding organ failure among hospitalized adults with SCD. The discovery of salient physiomarkers through machine learning techniques has the potential to further accelerate the development and implementation of innovative care delivery protocols and strategies for medically vulnerable patients.
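
The signal processing features named in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the window length, the synthetic heart-rate series, and the function names (`signal_energy`, `dft_magnitudes`, `window_features`) are assumptions made for illustration, and a naive discrete Fourier transform stands in for an FFT library call.

```python
import cmath
import math


def signal_energy(samples):
    """Time-domain energy: the sum of squared sample values."""
    return sum(x * x for x in samples)


def dft_magnitudes(samples):
    """Magnitudes of the discrete Fourier transform for bins 0..N//2.

    Naive O(N^2) DFT; an FFT computes the same values faster.
    """
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        mags.append(abs(coeff))
    return mags


def window_features(samples):
    """Feature dict for one window of a vital-sign series (needs >= 2 samples)."""
    mean = sum(samples) / len(samples)
    centered = [x - mean for x in samples]  # remove the constant offset
    mags = dft_magnitudes(centered)
    # Strongest non-zero frequency bin in the window.
    dominant_bin = max(range(1, len(mags)), key=lambda k: mags[k])
    return {
        "mean": mean,
        "energy": signal_energy(centered),
        "dominant_bin": dominant_bin,
    }


# Hypothetical heart-rate window: 80 bpm baseline plus a slow oscillation
# that completes 2 cycles over 16 samples.
heart_rate = [80 + 5 * math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
features = window_features(heart_rate)
```

Feature dictionaries like this, one per window and per vital sign, would then form the rows passed to a classifier such as a random forest.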


Author(s):  
Eliseos J. Mucaki ◽  
Ben C. Shirley ◽  
Peter K. Rogan

Purpose: Combinations of expressed genes can discriminate radiation-exposed from normal control blood samples by machine learning-based signatures (with 8 to 20% misclassification rates). These signatures can quantify therapeutically relevant as well as accidental radiation exposures. The prodromal symptoms of Acute Radiation Syndrome (ARS) overlap with those of some viral infections. We recently showed that these human radiation signatures produced unexpected false-positive misclassification of influenza- and dengue-infected samples. The present study investigates these and other confounders, and then mitigates their effects on signature accuracy. Methods: This study investigated recall by previous and novel radiation signatures independently derived from multiple Gene Expression Omnibus datasets on common and rare non-malignant blood disorders and blood-borne infections (thromboembolism, S. aureus bacteremia, malaria, sickle cell disease, polycythemia vera, and aplastic anemia). Normalized expression levels of signature genes are used as input to machine learning-based classifiers to predict radiation exposure in other hematological conditions. Results: Except for aplastic anemia, these blood-borne disorders modify the normal baseline expression values of genes present in radiation signatures, leading to false-positive misclassification of radiation exposures in 8 to 54% of individuals. Shared changes, predominantly in DNA damage response and apoptosis-related gene transcripts in radiation and confounding hematological conditions, compromise the utility of these signatures for radiation assessment. These confounding conditions (sickle cell disease, thromboembolism, S. aureus bacteremia, malaria) induce neutrophil extracellular traps, initiated by chromatin decondensation, DNA damage response and fragmentation followed by programmed cell death.
Riboviral infections (for example, influenza and dengue fever) are proposed to deplete RNA binding proteins, inducing R-loops in chromatin that collide with replication forks, resulting in DNA damage and apoptosis. To mitigate the effects of confounders, we evaluated predicted radiation-positive samples with novel gene expression signatures derived from radiation-responsive transcripts encoding secreted blood plasma proteins whose expression levels are unperturbed by these conditions. Conclusions: This approach identifies and eliminates misclassified samples with underlying hematological or infectious conditions, leaving only samples with true radiation exposures. Diagnostic accuracy is significantly improved by selecting genes that maximize both sensitivity and specificity in the appropriate tissue, using combinations of the best signatures from each of these classes.


Blood ◽  
2019 ◽  
Vol 134 (Supplement_1) ◽  
pp. 893-893
Author(s):  
Vandana Sachdev ◽  
Yuan Gu ◽  
James Nichols ◽  
Wen Li ◽  
Stanislav Sidenko ◽  
...  

Sickle cell disease (SCD) is a clinical syndrome that encompasses several different genotypes, the 3 most common being homozygosity for the βS allele (HbSS), compound heterozygosity of HbS and HbC (HbSC), and compound heterozygosity of HbS and β-thalassemia (HbSβ+ or HbSβ0 thalassemia). Generally, patients with HbSS and HbSβ0 thalassemia genotypes have the most severe clinical manifestations, while patients with HbSC and HbSβ+ thalassemia are thought to have less severe disease. Within each of these genotypic groups, however, there are also substantial phenotypic differences. This heterogeneity makes it difficult to quantify the severity of the disease process and to guide therapeutics. As more intensive, high-risk, and costly treatments such as hematopoietic stem cell transplant and gene therapy are developed, the ability to assess patients at highest risk of early mortality becomes increasingly important. Integrating varied clinical, laboratory, and imaging markers for personalized risk prediction has been difficult; however, newer machine learning methods for outcome prediction take a more agnostic approach than traditional statistical methods and can detect complex, non-linear relationships in the data. In this study, we sought to apply machine learning methods to a well-characterized cohort of SCD patients followed at the National Institutes of Health in order to identify clinically meaningful subgroups of patients at highest risk of mortality. Between 2006 and 2017, 601 patients (age 35±13 years, 51% female) underwent echocardiography, standard laboratory testing, and hemoglobin electrophoresis, yielding 61 candidate variables. Among these patients, 488 had HbSS, 12 HbSβ0 thalassemia, 80 HbSC, and 20 HbSβ+ thalassemia. All-cause mortality was ascertained by proxy interview, through medical records, and through the CDC National Death Index. Average follow-up time was 5 years and 130 patients were deceased.
A random survival forest (RSF) algorithm followed by nested model selection and AIC Cox regression analysis identified 13 predictors of mortality (estimated right ventricular systolic pressure, peak tricuspid regurgitant (TR) velocity, mitral E velocity, septal and posterior wall thickness, IVC diameter, right atrial area, BUN, alkaline phosphatase, N-terminal pro-brain natriuretic peptide (NT-proBNP), creatinine, potassium, and bicarbonate). This model performed better than individual clinical and laboratory variables, with a C-statistic of 0.822 (genotype 0.524, eGFR 0.624, NT-proBNP 0.686, TR velocity 0.703). K-means clustering grouped all patients into 3 main clusters with significant survival differences. Survival at 8 years for the entire group was 70%; for individual clusters, survival was 43% for cluster 1, 72% for cluster 2, and 88% for cluster 3 (Figure 1A). Since TR velocity is recognized as one of the most specific independent predictors of mortality, we compared our results with this parameter. There was a better stratification of mortality risk using the 7 strongest parameters from RSF compared with TR velocity alone (Figure 1B), particularly for longer-term outcomes. In this cohort of 601 patients with SCD, machine learning methods were used to show the heterogeneity of this disorder and the ability to detect phenotypic clusters with different mortality profiles. Although there are many individual predictors of mortality, few methods other than assessment by an expert clinician can integrate all known variables in deeply phenotyped patients. RSF and cluster analysis were used in this cohort to analyze a large amount of data in order to identify seven variables that could stratify patients into groups with significantly different outcomes. The discrimination of this approach was high (C-statistic 0.822) and better than that of individual markers of end-organ involvement. Disclosures: No relevant conflicts of interest to declare.
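
The C-statistic reported above measures how often a model assigns the higher risk to the patient who died earlier. A minimal sketch of Harrell's concordance index follows, assuming right-censored follow-up data. The function name and toy values are hypothetical, pairs with tied times are ignored for simplicity, and the abstract does not say which estimator the authors used.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-statistic for right-censored survival data.

    times:       follow-up time per patient
    events:      1 if death was observed, 0 if the patient was censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A pair is comparable only when the earlier time is an observed death.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four patients, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.8, 0.1]
c_stat = concordance_index(times, events, risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which 0.822 and the single-marker values are compared.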


2021 ◽  
Vol 9 ◽  
Author(s):  
Pritish Mondal ◽  
Vishal Midya ◽  
Arshjot Khokhar ◽  
Shyama Sathianathan ◽  
Erick Forno

Background: Gas exchange abnormalities in Sickle Cell Disease (SCD) may represent cardiopulmonary deterioration. Identifying predictors of these abnormalities in children with SCD (C-SCD) may help us understand disease progression and make informed management decisions. Objectives: To identify pulmonary function test (PFT) estimates and biomarkers of disease severity that are associated with and predict abnormal diffusing capacity (DLCO) in C-SCD. Methods: We obtained PFT data from 51 C-SCD (median age: 12.4 years, male:female = 29:22) (115 observations) and 22 controls (median age: 11.1 years, male:female = 8:14), formulated a rank list of DLCO predictors based on machine learning algorithms (XGBoost) or linear mixed-effect models, and compared estimated DLCO to the measured values. Finally, we evaluated the association between measured or estimated DLCO and clinical outcomes, including SCD crises, pulmonary hypertension, and nocturnal desaturation. Results: Hemoglobin-adjusted DLCO (%) and several PFT indices were diminished in C-SCD compared to controls. Both statistical approaches ranked FVC (%), neutrophils (%), and FEF25−75 (%) as the top three predictors of DLCO. XGBoost had superior performance compared to the linear model. Both measured and estimated DLCO demonstrated a significant association with SCD severity: higher DLCO, estimated by XGBoost, was associated with fewer SCD crises [beta = −0.084 (95%CI: −0.13, −0.033)] and lower TRJV [beta = −0.009 (−0.017, −0.001)], but not with nocturnal desaturation (p = 0.12). Conclusions: In this cohort of C-SCD, DLCO was associated with PFT estimates representing restrictive lung disease (FVC, TLC), airflow obstruction (FEF25−75, FEV1/FVC, R5), and inflammation (neutrophilia). We used these indices to estimate DLCO and showed its association with disease outcomes, underscoring the prediction models' clinical relevance.


2020 ◽  
Vol 5 (1) ◽  
pp. 46-59
Author(s):  
Noorh H. Alharbi ◽  
Rana O. Bameer ◽  
Shahad S. Geddan ◽  
Hajar M. Alharbi ◽  
...  

Sickle cell disease is a severe hereditary disease caused by an abnormality of the red blood cells. The current therapeutic decision-making process applied to sickle cell disease includes monitoring a patient’s symptoms and complications and then adjusting the treatment accordingly. This process is time-consuming, which might result in serious consequences for patients’ lives and could lead to irreversible disease complications. Artificial intelligence, specifically machine learning, is a powerful technique that has been used to support medical decisions. This paper aims to review the recently developed machine learning models designed to interpret medical data regarding sickle cell disease. To build an intelligent model, the suggested framework proceeds in the following sequence. First, the data is preprocessed by imputing missing values and balancing the data. Then, suitable feature selection methods are applied, and different classifiers are trained and tested. Finally, the model with the highest value of a predefined performance metric across all experiments is selected. The aim of developing such a model is to predict the severity of a patient’s case, to determine the clinical complications of the disease, and to suggest the correct dosage of the treatment(s).
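
The preprocessing and model-selection steps of the framework described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: mean imputation and random oversampling of the minority class are assumed stand-ins for "imputing" and "balancing", feature selection and classifier training are omitted, and the data, function names, and metric values are hypothetical placeholders.

```python
import random


def impute_mean(rows):
    """Replace None in each column with the mean of that column's observed values.

    Assumes every column has at least one observed value.
    """
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        observed = [r[c] for r in rows if r[c] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[c] if r[c] is None else r[c] for c in range(n_cols)] for r in rows]


def oversample_minority(rows, labels, seed=0):
    """Random oversampling: duplicate rows of smaller classes up to the largest class size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def select_best(scores):
    """Return the model name with the highest value of the chosen metric."""
    return max(scores, key=scores.get)


# Hypothetical toy data: two features per patient, None marks a missing value.
X = [[1.0, None], [3.0, 4.0], [None, 8.0], [5.0, 6.0]]
y = ["mild", "mild", "mild", "severe"]

X_filled = impute_mean(X)
X_bal, y_bal = oversample_minority(X_filled, y)

# Placeholder metric values, not results from any study.
best = select_best({"model_a": 0.81, "model_b": 0.78})
```

In practice, imputation statistics and oversampling would be fitted on the training folds only, so that information from the test data does not leak into model selection.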

