Abstract
AIMS: To develop machine-learning algorithms for predicting the risk of a hospitalization or emergency department (ED) visit for opioid use disorder (OUD) (i.e. OUD acute events) in Pennsylvania Medicaid enrollees in the Opioid Use Disorder Centers of Excellence (COE) program and to evaluate the fairness of model performance across racial groups.
METHODS: We studied 20 983 United States Medicaid enrollees aged 18 years or older who had COE visits between April 2019 and March 2021. We applied multivariate logistic regression, least absolute shrinkage and selection operator (LASSO) models, random forests and eXtreme Gradient Boosting (XGB) to predict OUD acute events following the initial COE visit. Our models included predictors at the system, patient and regional levels. We assessed model performance using multiple metrics, stratified by racial group. Individuals were divided into low-, medium- and high-risk groups based on their predicted risk scores.
RESULTS: The training (n = 13 990) and testing (n = 6993) samples displayed similar characteristics (mean age 38.1 ± 9.3 years, 58% male, 80% White enrollees), with 4% experiencing OUD acute events at baseline. XGB demonstrated the best prediction performance (C-statistic = 76.6% [95% confidence interval = 75.6%-77.7%] vs. 72.8%-74.7% for other methods). At the balanced cutoff, XGB achieved a sensitivity of 68.2%, specificity of 70.0% and positive predictive value of 8.3%. The XGB model classified the testing sample into high-risk (6%), medium-risk (30%) and low-risk (63%) groups. In the high-risk group, 40.7% had OUD acute events vs. 16.5% and 5.0% in the medium- and low-risk groups, respectively. The high- and medium-risk groups captured 44% and 26% of individuals with OUD events, respectively. The XGB model exhibited lower false negative rates and higher false positive rates in racial/ethnic minority groups than in White enrollees.
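As an illustration of the metrics reported above, the following minimal Python sketch computes sensitivity, specificity, positive predictive value, false negative rate and false positive rate, overall and within subgroups. It is not the study's code: the function names (`rates`, `rates_by_group`), the group labels "A" and "B" and the toy data are hypothetical, and the predictions are assumed to have already been dichotomized at some cutoff.

```python
from collections import defaultdict


def rates(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = event, 0 = no event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def safe(num, den):
        # Return NaN rather than raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "sensitivity": safe(tp, tp + fn),
        "specificity": safe(tn, tn + fp),
        "ppv": safe(tp, tp + fp),
        "fnr": safe(fn, tp + fn),  # missed events among true events
        "fpr": safe(fp, tn + fp),  # false alarms among true non-events
    }


def rates_by_group(y_true, y_pred, groups):
    """Apply `rates` separately within each subgroup label."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: rates(ts, ps) for g, (ts, ps) in buckets.items()}


# Synthetic toy data, for illustration only (not study data).
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall = rates(y_true, y_pred)
by_group = rates_by_group(y_true, y_pred, groups)
```

Comparing `by_group["A"]["fnr"]` with `by_group["B"]["fnr"]`, and likewise for `"fpr"`, gives the kind of between-group contrast in error rates that the fairness evaluation describes.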
CONCLUSIONS: New machine-learning algorithms perform well in predicting the risk of opioid use disorder (OUD) acute care use among United States Medicaid enrollees and improve the fairness of prediction across racial and ethnic groups compared with previous OUD-related models.