Population Health Management ◽ Exploiting Machine Learning Algorithms to Identify High-Risk Patients

Author(s):  
Silvia Panicacci ◽  
Massimiliano Donati ◽  
Luca Fanucci ◽  
Irene Bellin ◽  
Francesco Profili ◽  
...  


2021 ◽  
Author(s):  
Fang He ◽  
John H Page ◽  
Kerry R Weinberg ◽  
Anirban Mishra

BACKGROUND The current COVID-19 pandemic is unprecedented; in resource-constrained settings, predictive algorithms can help stratify disease severity and alert physicians to high-risk patients. However, few risk scores have been derived from a substantially large EHR dataset using simplified predictors as input. OBJECTIVE To develop and validate simplified machine learning algorithms that predict COVID-19 adverse outcomes; to evaluate the AUC (area under the receiver operating characteristic curve), sensitivity, specificity, and calibration of the algorithms; and to derive clinically meaningful thresholds. METHODS We conducted machine learning model development and validation via a cohort study using multi-center, patient-level, longitudinal electronic health records (EHR) from the Optum® COVID-19 database, which provides anonymized, longitudinal EHR from across the US. The models were developed from clinical characteristics to predict 28-day in-hospital mortality, ICU admission, respiratory failure, and mechanical ventilator usage in the inpatient setting. Data from patients admitted before Sep 7, 2020, were randomly sampled into development, test, and validation datasets; data collected from Sep 7, 2020 through Nov 15, 2020 were reserved as a prospective validation dataset. RESULTS Of 3.7M patients in the analysis, a total of 585,867 patients were diagnosed with or tested positive for SARS-CoV-2, and 50,703 adult patients were hospitalized with COVID-19 between Feb 1 and Nov 15, 2020. Among the study cohort (N=50,703), there were 6,204 deaths, 9,564 ICU admissions, 6,478 mechanically ventilated or ECMO patients, and 25,169 patients who developed ARDS or respiratory failure within 28 days of hospital admission. 
The algorithms demonstrated high accuracy (AUC = 0.89 (0.89 - 0.89) on the validation dataset (N=10,752)), consistent prediction through the second wave of the pandemic from September to November (AUC = 0.85 (0.85 - 0.86) on post-development validation (N=14,863)), and strong clinical relevance and utility. In addition, a comprehensive set of 386 input covariates from baseline and at admission was included in the analysis; the end-to-end pipeline automates the feature selection and model development process, producing 10 key predictors as input, such as age, blood urea nitrogen, and oxygen saturation, which are both commonly measured and concordant with recognized risk factors for COVID-19. CONCLUSIONS The systematic approach and rigorous validation demonstrate consistent model performance even beyond the time period of data collection, with satisfactory discriminatory power and strong clinical utility. Overall, the study offers an accurate, validated, and reliable prediction model based on only ten clinical features as a prognostic tool for stratifying COVID-19 patients into intermediate, high, and very high-risk groups. This simple predictive tool could be shared with the wider healthcare community to serve as an early warning system alerting physicians to possible high-risk patients, or as a resource triaging tool to optimize healthcare resources. CLINICALTRIAL N/A
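The study's actual pipeline and data are not reproduced here, but the headline evaluation can be illustrated. The minimal sketch below uses synthetic data and an illustrative model choice; only the three feature names (age, blood urea nitrogen, oxygen saturation) come from the abstract. It trains a classifier on admission features and reports the AUC with scikit-learn, as the abstract does:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
# three of the ten predictors named in the abstract; values are synthetic
age = rng.normal(65, 15, n)
bun = rng.normal(20, 8, n)    # blood urea nitrogen, mg/dL
spo2 = rng.normal(94, 4, n)   # oxygen saturation, %
X = np.column_stack([age, bun, spo2])

# synthetic 28-day adverse outcome, loosely driven by the predictors
logit = 0.04 * (age - 65) + 0.05 * (bun - 20) - 0.10 * (spo2 - 94) - 1.0
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"validation AUC = {auc:.2f}")
```

In practice the predicted probabilities, not just the AUC, would be cut at clinically meaningful thresholds to form the intermediate/high/very high-risk groups the abstract describes.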


Computers ◽  
2020 ◽  
Vol 10 (1) ◽  
pp. 4
Author(s):  
Silvia Panicacci ◽  
Massimiliano Donati ◽  
Francesco Profili ◽  
Paolo Francesconi ◽  
Luca Fanucci

Together with population ageing, the number of people suffering from multimorbidity is increasing, and may reach more than half of the population by 2035. This part of the population is composed of the highest-risk patients, who are, at the same time, the major users of healthcare systems. The early identification of this sub-population can help to improve people’s quality of life and reduce healthcare costs. In this paper, we describe a population health management tool based on state-of-the-art intelligent algorithms, starting from administrative and socio-economic data, for the early identification of high-risk patients. The study refers to the population of the Local Health Unit of Central Tuscany in 2015, which amounts to 1,670,129 residents. After a trade-off analysis of machine learning models and input data, Random Forest applied to one year of historical data achieves the best results, outperforming state-of-the-art models. The most important variables for this model, in terms of mean minimal depth, accuracy decrease, and Gini decrease, prove to be age and certain groups of drugs, such as high-ceiling diuretics. Thanks to the low inference time and reduced memory usage, the resulting model allows for real-time risk prediction updates whenever new data become available, giving general practitioners the possibility to adopt personalised medicine early.
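The abstract ranks variables by Random Forest importance measures, one of which (Gini decrease, i.e. mean decrease in impurity) scikit-learn exposes directly as feature_importances_. A minimal sketch on synthetic data, where only the variable names (age, a high-ceiling diuretics flag) echo the abstract:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 5000
age = rng.normal(60, 18, n)
high_ceiling_diuretics = rng.binomial(1, 0.15, n)  # drug-group flag
other_drug_group = rng.binomial(1, 0.30, n)        # uninformative flag

# synthetic "high-risk" label driven by age and the diuretics flag
logit = 0.06 * (age - 60) + 1.2 * high_ceiling_diuretics - 1.0
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

X = np.column_stack([age, high_ceiling_diuretics, other_drug_group])
rf = RandomForestClassifier(n_estimators=200, random_state=1).fit(X, y)

# rank predictors by Gini decrease (mean decrease in impurity)
names = ["age", "high_ceiling_diuretics", "other_drug_group"]
for name, imp in sorted(zip(names, rf.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```

The other two measures the abstract mentions (mean minimal depth, accuracy decrease) come from R's randomForestExplainer/permutation importance rather than this attribute; scikit-learn's permutation_importance is the closest analogue of the latter.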


2017 ◽  
Author(s):  
Lincoln Sheets

Risk analysis and population health management can improve health outcomes, but improved risk stratification is needed to manage healthcare costs. Analysis of 157 publications on translational implementations of "risk stratification in population health management of chronic disease" showed a consensus that population health management and risk stratification can improve outcomes, but found uncertainty over the best methods for risk prediction and controversy over the cost savings. The consensus of another 85 publications on the methodologies of "data mining for predictive healthcare analytics" was that clinically interpretable machine learning techniques are more appropriate than "black box" techniques for structured big data sources in healthcare, and that the "area under the curve" of a prediction model's sensitivity versus one-minus-specificity is a standard and reliable way to measure the model's discrimination. This study used clinically interpretable machine learning algorithms, combined with simple but powerful data analytic techniques such as cost analysis and data visualization, to evaluate and improve risk stratification for a managed patient population. The study retrospectively observed 10,000 mid-Missouri Medicare and Medicaid patients between 2012 and 2014. Cost and utilization analyses, statistical clustering, contrast mining, and logistic regression were used to identify patients within a managed population at risk for higher healthcare costs, demonstrate longitudinal changes in risk stratification, and characterize detailed differences between high-risk and low-risk patients. The two highest risk stratification tiers comprised only 21% of patients but accounted for 43% of prospective charges. Patients in the most expensive sub-cluster of the most expensive risk tier were nearly twice as costly as high-risk patients on average. Combining contrast mining with logistic regression predicted the most expensive 5% of patients with 84% accuracy, as measured by area under the curve. All the strategies used in this study, from the simplest to the most sophisticated, produced useful insights. By predicting, in clinically interpretable terms, the small number of patients who will incur the majority of healthcare expenses, these methods can support population health managers in focusing preventive and longitudinal care more effectively. These models, and similar models developed by integrating diverse informatics strategies, could improve health outcomes, delivery, and costs.
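The "predict the costliest 5%" framing above is a standard imbalanced-classification setup: label the top spending quantile, fit an interpretable model, and score it by AUC. A minimal sketch with synthetic, skewed charge data (the predictor names and coefficients are invented for illustration, not taken from the study):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
n = 10000
chronic_conditions = rng.poisson(1.5, n)
prior_admissions = rng.poisson(0.4, n)
age = rng.normal(70, 12, n)

# synthetic annual charges, log-normal so a small group dominates spending
charges = np.exp(7.0 + 0.35 * chronic_conditions + 0.50 * prior_admissions
                 + 0.01 * (age - 70) + rng.normal(0, 0.8, n))
top5 = (charges >= np.quantile(charges, 0.95)).astype(int)  # costliest 5%

X = np.column_stack([chronic_conditions, prior_admissions, age])
X_tr, X_te, y_tr, y_te = train_test_split(X, top5, test_size=0.3,
                                          random_state=2)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
print(f"AUC for predicting the costliest 5%: {auc:.2f}")
```

Logistic regression keeps the model clinically interpretable: each coefficient is a log-odds contribution that can be read off directly, which is the property the reviewed literature favours over black-box techniques.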


2021 ◽  
Vol 22 (3) ◽  
pp. 1075
Author(s):  
Luca Bedon ◽  
Michele Dal Bo ◽  
Monica Mossenta ◽  
Davide Busato ◽  
Giuseppe Toffoli ◽  
...  

Although extensive advancements have been made in treatment of hepatocellular carcinoma (HCC), the prognosis of HCC patients remains unsatisfactory. It is now clearly established that extensive epigenetic changes act as a driver in human tumors. This study exploits HCC epigenetic deregulation to define a novel prognostic model for monitoring the progression of HCC. We analyzed the genome-wide DNA methylation profiles of 374 primary tumor specimens using the Illumina 450K array data from The Cancer Genome Atlas. We initially used a novel combination of machine learning algorithms (Recursive Feature Selection, Boruta) to capture early tumor progression features. The subsets of probes obtained were used to train and validate Random Forest models predicting a progression-free survival of greater or less than 6 months. The model based on 34 epigenetic probes showed the best performance, scoring 0.80 accuracy and 0.51 Matthews Correlation Coefficient on the test set. We then generated and validated a progression signature based on 4 methylation probes capable of stratifying HCC patients into high and low risk of progression. Survival analysis showed that high-risk patients are characterized by poorer progression-free survival compared to low-risk patients. Moreover, decision curve analysis confirmed the strength of this predictive tool over conventional clinical parameters. Functional enrichment analysis highlighted that high-risk patients differentiated themselves by the upregulation of proliferative pathways. Ultimately, we propose the oncogenic MCM2 gene as a methylation-driven gene whose representative epigenetic markers could serve as both predictive and prognostic markers. In brief, our work provides several potential HCC progression epigenetic biomarkers as well as a new signature that may enhance patient surveillance and advances in personalized treatment.
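The two-stage design above (feature selection down to 34 probes, then a Random Forest scored by accuracy and MCC) can be sketched as follows. This is not the authors' pipeline: Boruta lives in a separate package, so recursive feature elimination (sklearn's RFE) stands in for the selection step, and the methylation data here is synthetic beta values in [0, 1]:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score, matthews_corrcoef
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n, p, informative = 374, 200, 10
X = rng.beta(2, 2, size=(n, p))  # synthetic methylation beta values
# synthetic label: PFS < 6 months, driven by the first `informative` probes
signal = X[:, :informative].sum(axis=1)
y = (signal + rng.normal(0, 0.8, n) > np.median(signal)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=3)

# stage 1: recursively eliminate probes down to a 34-probe subset
selector = RFE(RandomForestClassifier(n_estimators=100, random_state=3),
               n_features_to_select=34, step=0.2).fit(X_tr, y_tr)

# stage 2: train and evaluate a Random Forest on the selected probes
rf = RandomForestClassifier(n_estimators=200, random_state=3)
rf.fit(selector.transform(X_tr), y_tr)
pred = rf.predict(selector.transform(X_te))
print(f"accuracy = {accuracy_score(y_te, pred):.2f}, "
      f"MCC = {matthews_corrcoef(y_te, pred):.2f}")
```

MCC is reported alongside accuracy because it stays near zero for a classifier that ignores the minority class, which plain accuracy can hide.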


2021 ◽  
Vol 12 (02) ◽  
pp. 372-382
Author(s):  
Christine Xia Wu ◽  
Ernest Suresh ◽  
Francis Wei Loong Phng ◽  
Kai Pik Tai ◽  
Janthorn Pakdeethai ◽  
...  

Abstract Objective To develop a risk score for the real-time prediction of readmissions, using patient-specific information captured in electronic medical records (EMR) in Singapore, to enable the prospective identification of high-risk patients for enrolment in timely interventions. Methods Machine learning models were built to estimate the probability of a patient being readmitted within 30 days of discharge. EMR of 25,472 patients discharged from the medicine department at Ng Teng Fong General Hospital between January 2016 and December 2016 were extracted retrospectively for training and internal validation of the models. We developed and implemented real-time 30-day readmission risk score generation in the EMR system, which enabled the flagging of high-risk patients to care providers in the hospital. Based on the daily high-risk patient list, the various interfaces and flow sheets in the EMR were configured according to the information needs of the various stakeholders, such as the inpatient medical, nursing, case management, emergency department, and postdischarge care teams. Results Overall, the machine learning models achieved good performance, with the area under the receiver operating characteristic curve ranging from 0.77 to 0.81. The models were used to proactively identify and attend to patients who are at risk of readmission before an actual readmission occurs. This approach successfully reduced the 30-day readmission rate for patients admitted to the medicine department from 11.7% in 2017 to 10.1% in 2019 (p < 0.01) after risk adjustment. Conclusion Machine learning models can be deployed in the EMR system to provide real-time forecasts for a more comprehensive outlook in the aspects of decision-making and care provision.
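The operational core described above, turning a readmission probability into a daily high-risk list for care teams, can be sketched in a few lines. Everything here (predictor names, coefficients, the top-decile review threshold) is invented for illustration and is not the hospital's deployed model:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n = 4000
prior_admits = rng.poisson(0.6, n)
length_of_stay = rng.gamma(2.0, 2.5, n)   # days
num_meds = rng.poisson(6, n)

# synthetic 30-day readmission outcome
logit = 0.6 * prior_admits + 0.05 * length_of_stay + 0.05 * num_meds - 2.5
readmit_30d = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

X = np.column_stack([prior_admits, length_of_stay, num_meds])
model = LogisticRegression(max_iter=1000).fit(X, readmit_30d)

# daily scoring: flag discharges whose predicted probability exceeds a
# threshold chosen so roughly the top decile is surfaced for review
risk = model.predict_proba(X)[:, 1]
threshold = np.quantile(risk, 0.90)
high_risk_list = np.flatnonzero(risk >= threshold)
print(f"{high_risk_list.size} of {n} discharges flagged for review")
```

In a live deployment the threshold is a capacity decision as much as a statistical one: it sets how many patients the case-management and postdischarge teams must review each day.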


2019 ◽  
Vol 28 (10) ◽  
pp. 835-842 ◽  
Author(s):  
Shirley V Wang ◽  
James R Rogers ◽  
Yinzhu Jin ◽  
David DeiCicchi ◽  
Sara Dejene ◽  
...  

Background Clinical guidelines recommend anticoagulation for patients with atrial fibrillation (AF) at high risk of stroke; however, studies report that 40% of this population is not anticoagulated. Objective To evaluate a population health intervention to increase anticoagulation use in high-risk patients with AF. Methods We used machine learning algorithms to identify patients with AF from electronic health records at high risk of stroke (CHA2DS2-VASc risk score ≥2) and with no anticoagulant prescriptions within 12 months. A clinical pharmacist in the anticoagulation service reviewed charts for algorithm-identified patients to assess the appropriateness of initiating an anticoagulant. The pharmacist then contacted primary care providers of potentially undertreated patients and offered assistance with anticoagulation management. We used a stepped-wedge design, evaluating the proportion of potentially undertreated patients with AF started on anticoagulant therapy within 28 days for clinics randomised to intervention versus usual care. Results Of 1727 algorithm-identified high-risk patients with AF in clinics at the time of randomisation to intervention, 432 (25%) lacked evidence of anticoagulant prescriptions in the prior year. After pharmacist review, only 17% (75 of 432) of algorithm-identified patients were considered potentially undertreated at the time their clinic was randomised to intervention. Over a third (155 of 432) were excluded because they had a single prior AF episode (transient or provoked by serious illness); 36 (8%) had documented refusal of anticoagulation; the remainder had other reasons for exclusion. The intervention did not increase new anticoagulant prescriptions (intervention: 4.1% vs usual care: 4.0%, p=0.86). Conclusions Algorithms to identify underuse of anticoagulation among patients with AF in healthcare databases may not capture clinical subtleties or patient preferences, and may overestimate the extent of undertreatment. Changing clinician behaviour remains challenging.
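The CHA2DS2-VASc score used for eligibility above is a standard, published clinical score, so its computation can be shown directly (the helper function itself is illustrative, not the study's code):

```python
# Standard CHA2DS2-VASc components: congestive heart failure (1),
# hypertension (1), age >=75 (2) or 65-74 (1), diabetes (1), prior
# stroke/TIA/thromboembolism (2), vascular disease (1), female sex (1).
def cha2ds2_vasc(age, female, chf, hypertension, diabetes,
                 stroke_tia, vascular_disease):
    score = 2 if age >= 75 else (1 if age >= 65 else 0)
    score += 2 if stroke_tia else 0
    for flag in (chf, hypertension, diabetes, vascular_disease, female):
        score += 1 if flag else 0
    return score

# A 78-year-old woman with hypertension: 2 (age) + 1 (sex) + 1 (HTN) = 4,
# above the >=2 threshold the study used to flag high stroke risk.
score = cha2ds2_vasc(age=78, female=True, chf=False, hypertension=True,
                     diabetes=False, stroke_tia=False, vascular_disease=False)
print(score)  # -> 4
```

The study's finding is precisely that a score like this, computed from coded EHR data alone, misses the subtleties (transient AF episodes, documented refusals) that a pharmacist's chart review catches.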


2020 ◽  
Vol 23 (4) ◽  
pp. 278-285
Author(s):  
Yhenneko J. Taylor ◽  
Jason Roberge ◽  
Whitney Rossman ◽  
Jennifer Jones ◽  
Colleen Generoso ◽  
...  
