Background
Health systems worldwide face rising costs driven by aging populations, chronic diseases, and expensive technologies. Overspending strains budgets and can limit access to essential services. In China, expanded national insurance coverage has added pressure on hospitals to manage budgets effectively. As the authors put it, “delayed feedback on overspending—an issue never before addressed internationally—has significantly hindered hospital departmental managers’ ability to make timely, informed adjustments.”
What the study did
Researchers analyzed 549,910 discharged patient records from Wuxi (Jan 2022–Nov 2023) and built two department-level datasets: monthly (regional) and daily (hospital). After removing missing data, 506,917 cases remained, yielding 8,416 region-month records and 44,017 hospital-day records. Departments were grouped into seven categories, and statistical process control (SPC) was used to label each department-period as no, low, or high overspending risk. Four modeling approaches were compared: logistic regression, random forest, LightGBM, and artificial neural networks.
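The SPC labeling step can be illustrated with a minimal sketch. This summary does not give the paper's exact control rules, so everything here is an assumption: a Shewhart-style chart in which a warning limit (mean + 2σ) separates "no" from "low" risk and an upper control limit (mean + 3σ) separates "low" from "high". The spend-to-budget ratios and the names `spc_limits` and `label_risk` are hypothetical.

```python
from statistics import mean, pstdev

def spc_limits(baseline, warn_k=2.0, control_k=3.0):
    """Return (warning_limit, control_limit) from baseline observations.

    The 2-sigma / 3-sigma multipliers are assumed defaults, not the paper's.
    """
    center = mean(baseline)
    sigma = pstdev(baseline)
    return center + warn_k * sigma, center + control_k * sigma

def label_risk(value, warning_limit, control_limit):
    """Map one department-period value to a no/low/high risk label."""
    if value > control_limit:
        return "high"
    if value > warning_limit:
        return "low"
    return "no"

# Hypothetical spend-to-budget ratios for one department group.
baseline = [0.96, 1.00, 1.04, 0.98, 1.02, 1.00, 0.97, 1.03]
warning_limit, control_limit = spc_limits(baseline)

# Label three new department-periods against those limits.
labels = [label_risk(v, warning_limit, control_limit) for v in (1.01, 1.07, 1.12)]
print(labels)  # ['no', 'low', 'high']
```

Because the study sets thresholds per department group, limits like these would be computed separately for each of the seven groups rather than once for the whole hospital.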
Key findings (what’s new)
- Novel task: Machine learning had not previously been used to predict overspending risk at regional and hospital department levels.
- Top performer: LightGBM achieved 0.82 accuracy for both regional and hospital predictions. For high-risk departments, AUC-ROC reached 0.91 (regional) and 0.97 (hospital). Precision–recall was strong for the high-risk class (PR-AUC 0.93 regional; 0.90 hospital).
- Drivers of risk: Four departmental performance indicators (DPIs) were most associated with overspending: Total Amount of Discharged Patients (TADP), Average Inpatient Stay (AIS), Medicine Expenses Percentage (MEP), and Consumables Expenses Percentage (CEP). Higher values of these indicators aligned with higher risk.
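To make the reported metrics concrete, here is a small sketch of how accuracy and a per-class AUC-ROC can be computed for a three-class problem. It assumes the high-risk AUC is taken one-vs-rest (high against no and low combined), which this summary does not confirm. The labels, scores, and function names are hypothetical, and the AUC uses the pairwise-ranking definition rather than any particular library.

```python
def auc_one_vs_rest(y_true, scores, positive):
    """AUC-ROC for one class against all others.

    Computed as the fraction of (positive, negative) pairs in which the
    positive example gets the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == positive]
    neg = [s for y, s in zip(y_true, scores) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def accuracy(y_true, y_pred):
    """Share of department-periods whose predicted label is correct."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical true labels, predicted labels, and high-risk probabilities.
y_true = ["no", "low", "high", "high", "no", "low"]
y_pred = ["no", "low", "high", "low", "no", "low"]
p_high = [0.10, 0.40, 0.80, 0.35, 0.20, 0.30]

high_auc = auc_one_vs_rest(y_true, p_high, "high")
acc = accuracy(y_true, y_pred)
print(high_auc, round(acc, 2))  # 0.875 0.83
```

The example shows why the two numbers can diverge: accuracy depends on the final label, while AUC depends only on how well the high-risk score ranks departments.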
How the model is used in practice
The team integrated the best model into a Health Insurance Management System (HIMS). Managers can:
- Identify risk by selecting hospitals/departments to view current monthly or daily risk;
- Analyze risk via visual explanations that show how each DPI contributed;
- Intervene by adjusting operational levers (e.g., bed scheduling, diagnostic capacity, procurement policies).
Measured impact
Following integration, the hospital reported year-over-year reductions in per-capita costs (2023–2024) under consistent departmental conditions: 6.28% in medical costs, 12.18% in drug costs, and 14.1% in consumables costs.
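For readers who want the arithmetic behind a year-over-year figure, a minimal sketch follows. The cost values are hypothetical and chosen only so that the result matches the reported change in per-capita medical costs; the underlying amounts are not given in this summary.

```python
def pct_change(previous, current):
    """Year-over-year change as a percentage of the previous year's value."""
    return (current - previous) / previous * 100.0

# Hypothetical per-capita medical costs for two consecutive years.
change = pct_change(10000.0, 9372.0)
print(f"{change:.2f}%")  # -6.28%
```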
Why this matters (for researchers and leaders)
For researchers:
- Demonstrates that simple, interpretable DPIs extracted from routine records can power accurate multi-class risk prediction across departments.
- Shows the value of SPC-based labeling and SHAP explanations in keeping models transparent and actionable.
For hospital and regional leaders:
- Offers early warnings where overspending risk is concentrated.
- Links predictions to operational decisions, for example optimizing inpatient flow (AIS), reviewing drug formularies (MEP), or tightening consumables management (CEP).
- Embeds the analytics in HIMS so that they fit daily management routines.
Study details at a glance
- Setting & data: Wuxi, China; 549k discharges; department-level monthly (region) and cumulative daily (hospital) datasets.
- Labels: No/low/high risk via SPC thresholds by department group.
- Models tested: LR, RF, LightGBM, ANN; LightGBM selected.
- Explanations: SHAP used to rank DPI contributions; TADP, AIS, MEP, CEP consistently most influential.
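As an illustration of how the four leading DPIs might be derived from routine discharge records, the sketch below aggregates records by department. The definitions are assumptions inferred from the indicator names: TADP as a count of discharges, AIS as mean length of stay, and MEP and CEP as shares of total cost. The record fields and the function name `department_dpis` are hypothetical, not the study's schema.

```python
from collections import defaultdict

def department_dpis(records):
    """Aggregate discharge records into per-department DPIs (assumed definitions)."""
    groups = defaultdict(list)
    for r in records:
        groups[r["department"]].append(r)

    dpis = {}
    for dept, rows in groups.items():
        total_cost = sum(r["total_cost"] for r in rows)
        drug_cost = sum(r["drug_cost"] for r in rows)
        consumables_cost = sum(r["consumables_cost"] for r in rows)
        dpis[dept] = {
            "TADP": len(rows),                                    # discharged patients
            "AIS": sum(r["stay_days"] for r in rows) / len(rows),  # mean stay in days
            # Guard against a zero denominator for departments with no billed cost.
            "MEP": drug_cost / total_cost if total_cost else 0.0,
            "CEP": consumables_cost / total_cost if total_cost else 0.0,
        }
    return dpis

# Hypothetical discharge records.
records = [
    {"department": "cardiology", "stay_days": 4, "total_cost": 1000.0,
     "drug_cost": 300.0, "consumables_cost": 200.0},
    {"department": "cardiology", "stay_days": 6, "total_cost": 3000.0,
     "drug_cost": 500.0, "consumables_cost": 1000.0},
    {"department": "orthopedics", "stay_days": 10, "total_cost": 2000.0,
     "drug_cost": 200.0, "consumables_cost": 1000.0},
]

dpis = department_dpis(records)
print(dpis["cardiology"])
```

Grouping by department and month, or by department and cumulative day, instead of by department alone would give the two dataset granularities described in the study.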
Limitations and next steps
The authors note that hyperparameters may need local tuning when models are transferred to new hospitals or cities. Current DPIs mainly capture care quality and operational efficiency; future work could add socioeconomic factors and explore federated learning for broader generalizability.
Take-home message
A department-focused, machine-learning approach—built on routine indicators and embedded in hospital systems—can predict health insurance overspending risks with strong accuracy and support timely, targeted interventions.
Reference:
Yao Bu, Danqi Wang, Xiaomao Fan, Jiongying Li, Lei Hua, Lin Zhang, Wenjun Ma, Liwen He, Hao Zang, Haijun Zhang, Xingyu Liu, Yufeng Gao, Li Liu
Enhancing predictions of health insurance overspending risk through hospital departmental performance indicators.
Biomol Biomed [Internet]. 2025 Jun. 28 [cited 2025 Sep. 25];25(10):2269–2280.
Available from: https://www.bjbms.org/ojs/index.php/bjbms/article/view/12051
Additional information:
We invite submissions for our upcoming thematic issues, including:
- Immune Prediction and Prognostic Biomarkers in Immuno-Oncology
- Artificial Intelligence and Machine Learning in disease diagnosis and treatment target identification
Editor: Merima Hadžić