China Pharmacy, 2025, No. 11
· Smart Pharmacy ·
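Comparison of small-sample multi-class machine learning models for predicting valproic acid plasma concentration Δ
CHEN Xi (陈曦)¹*, YUAN Shen’ao (袁申奥)², YUAN Hailing (袁海玲)¹#, ZHAO Jie (赵杰)¹, CHEN Peng (陈鹏)², TIAN Chunyan (田春艳)³, SU Yi (苏怡)¹, ZHANG Yunsong (张云松)¹, ZHANG Yu (张玉)¹
(1. Department of Pharmacy, Xi’an International Medical Center Hospital, Xi’an 710100; 2. School of Information Engineering, Chang’an University, Xi’an 710064; 3. School of Economics and Management, Chang’an University, Xi’an 710064)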
CLC number R969.3; TP181  Document code A  Article ID 1001-0408(2025)11-1399-06
DOI 10.6039/j.issn.1001-0408.2025.11.20
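Abstract  Objective  To construct three-class (insufficient, normal, excessive) and two-class (insufficient, normal) models for predicting the plasma concentration of valproic acid (VPA), and to compare the performance of the two models, so as to provide a reference for formulating clinical medication regimens. Methods  Clinical data were collected from 480 patients who received VPA treatment and underwent plasma concentration testing at Xi’an International Medical Center Hospital between November 2022 and September 2024 (695 records in total). Prediction models were built separately for the target variables of the three-class and two-class models. Features were ranked and selected using XGBoost feature importance scores, 12 machine learning algorithms were used for training and validation, and model performance was evaluated with three metrics: accuracy, F1 score, and the area under the receiver operating characteristic curve (AUC). Results  In the three-class model, comorbid kidney disease and comorbid electrolyte disorders ranked high in XGBoost feature importance; in the two-class model, however, the importance ranking of these features decreased markedly, suggesting that they are closely associated with excessive VPA plasma concentration. In the three-class model, random forest performed best, but its test-set F1 score reached only 0.7040 and its AUC was only 0.5193; in the two-class model, CatBoost performed best, with a test-set F1 score of 0.7857 and an AUC of 0.8195. Conclusions  The three-class model constructed in this study is able to predict excessive VPA plasma concentration, but its predictive performance and generalization ability are poor; the two-class model can only classify plasma concentration as insufficient or normal, but its predictive performance is stronger.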
Keywords  valproic acid; machine learning; plasma concentration prediction; small-sample dataset; model comparison
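The three evaluation metrics named in the abstract (accuracy, F1 score, AUC) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the support-weighted averaging of per-class F1, and the six-sample toy dataset are all assumptions made for illustration, and the abstract does not state how F1 or AUC were averaged for the three-class model. The AUC function covers the binary case only.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_per_class(y_true, y_pred, label):
    """F1 for one class: harmonic mean of precision and recall."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 (assumed averaging scheme)."""
    support = Counter(y_true)
    total = sum(f1_per_class(y_true, y_pred, c) * n for c, n in support.items())
    return total / len(y_true)

def binary_auc(y_true, scores, positive):
    """AUC as the probability that a random positive sample is scored
    above a random negative sample (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data, NOT taken from the study.
y_true = ["insufficient", "insufficient", "normal",
          "normal", "normal", "insufficient"]
p_normal = [0.2, 0.6, 0.7, 0.4, 0.9, 0.1]  # assumed predicted P(normal)
y_pred = ["normal" if s >= 0.5 else "insufficient" for s in p_normal]

acc = accuracy(y_true, y_pred)
f1 = weighted_f1(y_true, y_pred)
auc = binary_auc(y_true, p_normal, positive="normal")
print(f"accuracy={acc:.4f}  F1={f1:.4f}  AUC={auc:.4f}")
```

Accuracy and F1 are computed from hard class labels, whereas AUC is computed from the predicted scores. A model can therefore have a moderate F1 while its AUC stays close to 0.5, the pattern reported for the three-class random forest.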
Comparison of small-sample multi-class machine learning models for plasma concentration prediction of valproic acid
CHEN Xi¹, YUAN Shen’ao², YUAN Hailing¹, ZHAO Jie¹, CHEN Peng², TIAN Chunyan³, SU Yi¹, ZHANG Yunsong¹, ZHANG Yu¹
(1. Dept. of Pharmacy, Xi’an International Medical Center Hospital, Xi’an 710100, China; 2. School of Information Engineering, Chang’an University, Xi’an 710064, China; 3. School of Economics and Management, Chang’an University, Xi’an 710064, China)
ABSTRACT OBJECTIVE To construct three-class (insufficient, normal, excessive) and two-class (insufficient, normal)
models for predicting the plasma concentration of valproic acid (VPA), and to compare the performance of these two models, with the
aim of providing a reference for formulating clinical medication strategies. METHODS The clinical data of 480 patients who
received VPA treatment and underwent blood concentration testing at the Xi’an International Medical Center Hospital were collected
from November 2022 to September 2024 (a total of 695 sets of data). Predictive models were constructed separately for the target
variables of the three-class and two-class models. Feature ranking and selection were carried out using XGBoost feature importance scores. Twelve
machine learning algorithms were used for training and validation, and the performance of the models was evaluated using three
metrics: accuracy, F1 score, and the area under the receiver operating characteristic curve (AUC). RESULTS XGBoost
feature importance scores showed that in the three-class model, kidney disease and electrolyte disorders as comorbidities
ranked high. However, in the two-class model, the importance ranking of these features decreased markedly, suggesting a close
association with excessive blood concentration of VPA. In the three-class model, the random forest method performed best, but its
F1 score was only 0.7040 and its AUC only 0.5193 on the test set; in the two-class model, the CatBoost method performed best, with
an F1 score of 0.7857 and an AUC of 0.8195 on the test set. CONCLUSIONS The constructed three-class model has the ability to
predict excessive VPA blood concentration, but its predictive performance and generalization ability are poor; the constructed two-class
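model can only perform classification prediction for insufficient and normal blood concentration cases, but its
predictive performance is stronger.
KEYWORDS valproic acid; machine learning; plasma concentration prediction; small-sample dataset; model comparison
Δ Funding: Natural Science Basic Research Program of Shaanxi Province (No. 2022JQ-657); Xi’an International Medical Center Hospital Institutional Youth Project (No. 2024QN11)
* First author: pharmacist-in-charge, master's degree. Research interest: precision pharmaceutical services. E-mail: cxi9@foxmail.com
# Corresponding author: chief pharmacist, master's degree. Research interests: precision pharmaceutical services and pharmacy administration. E-mail: aliceyuanhailing@163.com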
China Pharmacy 2025 Vol. 36 No. 11 · 1399 ·