Page 108 - China Pharmacy (《中国药房》), 2026, Issue 8

·Smart Pharmacy·

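Construction and practice of an application model for a localized large language model in preoperative medication reconciliation for gastric cancer Δ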

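ZHU Yuxuan 1，2＊, ZHANG Jizhong, SUN Yuhao, WEN Jiayu, LIU Xin, WEI Jifu, HUANG Lingli 1# (1. Dept. of Pharmacy, Jiangsu Cancer Hospital/Jiangsu Institute of Cancer Research/Nanjing Medical University Affiliated Cancer Hospital/Jiangsu Key Laboratory of Innovative Cancer Diagnosis and Therapeutics, Nanjing 210009, China; 2. School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing 211198, China)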
CLC number: R969.3; R735.2  Document code: A  Article ID: 1001-0408(2026)08-1062-06
DOI 10.6039/j.issn.1001-0408.2026.08.16

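ABSTRACT OBJECTIVE To construct a preoperative medication reconciliation model for gastric cancer assisted by a localized large language model (LLM) and to evaluate its effectiveness. METHODS A total of 249 gastric cancer patients with a history of continuous medication use before admission, treated in the Department of Gastric Surgery of Jiangsu Cancer Hospital from January 2024 to January 2026, were retrospectively enrolled. Patients were divided chronologically into a training set (154 cases) and a validation set (95 cases). Based on guidelines, drug package inserts, and other evidence, a standardized medication reconciliation process and a structured knowledge base were constructed; the DeepSeek-V3 LLM was deployed privately within the hospital and combined with retrieval-augmented generation to automate the integration of medication information, risk screening, and the generation of individualized recommendations. The quality of the LLM-generated recommendations was evaluated with automatic metrics (BERT Score and ROUGE-1, ROUGE-2, and ROUGE-L) and manual scoring [seven-dimensional index (7DI)]. Spearman correlation analysis was used to examine the correlation between automatic and manual scores; Cronbach's α coefficient was used to test the internal consistency of the manual scores; and the time taken by manual and LLM medication reconciliation was compared across tasks of different difficulty (simple, moderate, and high). RESULTS A structured knowledge base was built that covers 8 major drug categories and addresses common and high-risk preoperative medication scenarios. For the automatic metrics, BERT Score precision was 0.783±0.033, recall was 0.811±0.038, and the F1 score was 0.796±0.028; the F1 scores for ROUGE-1, ROUGE-2, and ROUGE-L were 0.566±0.067, 0.338±0.076, and 0.468±0.082, respectively. The 7DI scores given by the 3 human raters ranged from 32.06 to 33.45. The F1 scores of the automatic metrics were all significantly positively correlated with the manual 7DI scores (highest coefficient of determination = 0.611, P<0.001), and the internal consistency of the manual scores was good (Cronbach's α = 0.876). In terms of efficiency, LLM medication reconciliation reduced the time taken by more than 90% relative to manual reconciliation in the simple, moderate, and high-difficulty groups (P<0.001). CONCLUSIONS The medication reconciliation model built on a localized LLM and a structured knowledge base shows high accuracy, consistency, and clinical usability in complex preoperative medication scenarios for gastric cancer, and can improve reconciliation efficiency while reducing potential medication risks.
KEYWORDS medication reconciliation; artificial intelligence; large language model; gastric cancer; preoperative medication; medication safety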
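The abstract reports ROUGE F1 scores and a Cronbach's α without stating how they were computed. As a rough illustration only, the sketch below shows the standard textbook definitions of an n-gram overlap F1 (the idea behind ROUGE-1 and ROUGE-2) and of Cronbach's α. The function names `rouge_n_f1` and `cronbach_alpha`, the whitespace tokenization, and the choice to treat each rater as an "item" are assumptions made for this example, not the authors' implementation; ROUGE-L (longest common subsequence) and BERT Score are not covered.

```python
from collections import Counter
from statistics import variance


def ngram_counts(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate, reference, n=1):
    """N-gram overlap F1 between a generated text and a reference text.

    Assumption: whitespace tokenization, no stemming or stop-word handling.
    """
    cand = ngram_counts(candidate.split(), n)
    ref = ngram_counts(reference.split(), n)
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def cronbach_alpha(ratings):
    """Cronbach's alpha for a table of scores.

    ratings: one row per rated case, one column per item
    (assumption: each column is one human rater's total score).
    """
    k = len(ratings[0])
    item_variances = [variance(column) for column in zip(*ratings)]
    total_variance = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

For example, comparing the hypothetical candidate "stop aspirin before surgery" with the reference "stop aspirin 7 days before surgery" gives a unigram F1 of about 0.8 and a bigram F1 of about 0.5, and three raters who agree perfectly on every case give an α of 1.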

Construction and practice of application model for localized large language model in preoperative
medication reconciliation for gastric cancer
ZHU Yuxuan, ZHANG Jizhong, SUN Yuhao, WEN Jiayu, LIU Xin, WEI Jifu, HUANG Lingli
（1. Dept. of Pharmacy， Jiangsu Cancer Hospital/Jiangsu Institute of Cancer Research/Nanjing Medical University
Affiliated Cancer Hospital/Jiangsu Key Laboratory of Innovative Cancer Diagnosis and Therapeutics， Nanjing
210009， China；2. School of Basic Medicine and Clinical Pharmacy， China Pharmaceutical University， Nanjing
211198， China）
ABSTRACT OBJECTIVE To construct a preoperative medication reconciliation model assisted by a localized large language
model （LLM） for gastric cancer and evaluate its clinical efficacy. METHODS A total of 249 gastric cancer patients with a history
of continuous medication before admission in the Gastric Surgery Department of Jiangsu Cancer Hospital were retrospectively
enrolled. Patients were divided into training set （154 cases） and validation set （95 cases） based on the order of time. Based on
guidelines， drug package inserts， and other evidence， a standardized medication reconciliation process and a structured knowledge
base were constructed. DeepSeek-V3 LLM was deployed privately in the hospital， combined with retrieval-augmented generation
technology， to achieve automated integration of medication information， risk screening， and generation of personalized
recommendations. The quality of LLM-generated recommendations was evaluated using automatic metrics （BERT Score and
ROUGE-1， 2， L） and manual scoring [seven-dimensional index （7DI）]. Spearman correlation analysis was performed to
explore the correlation between automatic scores and manual scores. Cronbach’s α coefficient was used to test the internal
consistency of manual scoring results. The time consumed by manual and LLM-assisted medication reconciliation was compared
across tasks of different difficulty levels （simple， moderate， and high）. RESULTS A structured knowledge base covering 8

Δ Funding: Jiangsu Pharmaceutical Association “药”研新声 Pharmaceutical Research Project (No. 202564052); Jiangsu Cancer Hospital Science and Technology Development Fund Project (No. ZYGL202402, No. ZYGL202502); Jiangsu Cancer Hospital High-Level Hospital Discipline Construction “群峰计划” (Qunfeng Plan) Project (No. DFXK202501)
＊ First author: master's degree candidate. Research interest: hospital pharmacy. E-mail: yuxuanzhu_123@163.com
# Corresponding author: pharmacist-in-charge, master's degree. Research interest: hospital pharmacy. E-mail: huang_lingli@163.com

· 1062 · China Pharmacy 2026 Vol. 37 No. 8
