Informatics strategies for risk stratification in population health management
Risk analysis and population health management can improve health outcomes, but improved risk stratification is needed to manage healthcare costs. Analysis of 157 publications on translational implementations of "risk stratification in population health management of chronic disease" showed a consensus that population health management and risk stratification can improve outcomes, but found uncertainty over best methods for risk prediction and controversy over the cost savings. The consensus of another 85 publications on the methodologies of "data mining for predictive healthcare analytics" was that clinically interpretable machine learning techniques are more appropriate than "black box" techniques for structured big data sources in healthcare, and that the "area under the curve" of a prediction model's sensitivity versus one-minus-specificity is a standard and reliable way to measure the model's discrimination. This study used clinically interpretable machine-learning algorithms, combined with simple but powerful data analytic techniques such as cost analysis and data visualization, to evaluate and improve risk stratification for a managed patient population. It retrospectively observed 10,000 mid-Missouri Medicare and Medicaid patients from 2012 to 2014. Cost and utilization analyses, statistical clustering, contrast mining, and logistic regression were used to identify patients within a managed population at risk for higher healthcare costs, demonstrate longitudinal changes in risk stratification, and characterize detailed differences between high-risk and low-risk patients. The two highest risk stratification tiers comprised only 21% of patients but accounted for 43% of prospective charges. Patients in the most expensive sub-cluster of the most expensive risk tier were nearly twice as costly as the average high-risk patient.
Combining contrast mining with logistic regression predicted the most expensive 5% of patients with an area under the curve of 0.84. All the strategies used in this study, from the simplest to the most sophisticated, produced useful insights. By predicting, in clinically interpretable terms, the small number of patients who will incur the majority of healthcare expenses, these methods can help population health managers focus preventive and longitudinal care more effectively. These models, and similar models developed by integrating diverse informatics strategies, could improve health outcomes, delivery, and costs.
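
As an illustration of the discrimination measure described above, the following is a minimal sketch of how an area under the curve of sensitivity versus one-minus-specificity can be computed. It is not the study's code. It relies on the standard identity that this area equals the probability that a randomly chosen positive case (here, a high-cost patient) receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the labels and scores are hypothetical and chosen only for demonstration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking identity.

    labels: 1 for a positive case (e.g. a high-cost patient), 0 otherwise.
    scores: predicted risk, where higher means more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical example data, not taken from the study.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8, 0.3, 0.4]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation of the two groups. The pairwise loop is quadratic in the number of patients, so a sort-based implementation from a statistics library would be preferable for cohorts of the size studied here.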