ChronoMedicalWorld: A Medical World Model for Learning Patient Trajectories from Longitudinal Care Data

Authors: Jiangyuan Wang, Xuyong Chen, Junwei He, Xu Xu, Shasha Xie, Fuman Han

Categories: cs.LG, cs.AI

Published: 2026-05-21

Comments: 14 pages, 2 figures, 6 tables

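💡 One-Sentence Takeaway

Proposes the ChronoMedicalWorld Model to predict patient trajectories from longitudinal clinical data.

🎯 Matched areas: Pillar 2: RL Algorithms & Architecture; Pillar 9: Embodied Foundation Models

Keywords: chronic-disease care, patient trajectory prediction, latent world model, electronic health records, physiological state evolution, intervention strategies, personalized medicine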

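📋 Key Points

Existing electronic health record (EHR) models are predominantly discriminative and struggle to predict how a patient's physiology changes under long-term interventions.
The ChronoMedicalWorld Model (CMWM) learns long-horizon patient trajectories by coupling a state encoder with an action encoder.
In a chronic kidney disease (CKD) case study, CMWM outperforms a tuned GPT-5.5 baseline, reducing MAE by 7.28% and RMSE by 7.35%.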

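📝 Abstract (translated from Chinese)

Long-horizon clinical simulation, i.e. predicting how a patient's physiological state evolves under specified interventions, is central to chronic-disease care. However, existing EHR models are predominantly discriminative, and general-purpose large language models drift under repeated interventions. To address this, the paper proposes the ChronoMedicalWorld Model (CMWM), an action-conditioned latent world-model framework for learning patient trajectories from longitudinal care data. CMWM couples a joint-embedding state encoder with a wide action encoder that accepts both structured intervention indicators and free-text communication embeddings, and trains a recurrent latent transition module under a six-term objective. As a concrete case study, CMWM is applied to forecasting annual estimated glomerular filtration rate (eGFR) trajectories in chronic kidney disease (CKD), where it improves on a tuned GPT-5.5 baseline.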

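🔬 Method Details

Problem definition: The paper addresses the prediction of the long-term evolution of a patient's physiological state in chronic-disease care. Existing methods rely mainly on discriminative models and cannot effectively handle the dynamics that arise under repeated interventions.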

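Core idea: CMWM uses an action-conditioned latent world-model framework that combines a state encoder with an action encoder, so that a predicted trajectory is conditioned on the interventions the patient receives.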

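Technical framework: The architecture comprises a joint-embedding state encoder, a wide action encoder, and a recurrent latent transition module, trained under a six-term objective. A closed-loop rollout-prefix protocol matches training to deployment, so the model is optimised against the same multi-step error it exhibits at inference.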
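The closed-loop rollout idea can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `rollout`, the `history_fraction` parameter, and the toy encoder, transition, and decoder callables are all hypothetical, and the reading of "dynamic-50% history" as "the first half of a patient's visits is observed" is an assumption.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def rollout(
    observations: Sequence[float],
    actions: Sequence[Vector],
    encode_state: Callable[[float], Vector],
    encode_action: Callable[[Vector], Vector],
    transition: Callable[[Vector, Vector], Vector],
    decode: Callable[[Vector], float],
    history_fraction: float = 0.5,
) -> List[float]:
    """Forecast every visit after the observed prefix in closed loop.

    actions[t] is the intervention applied between visit t and visit t + 1.
    """
    if len(actions) != len(observations) - 1:
        raise ValueError("expected one action between each pair of visits")
    n_history = max(1, int(len(observations) * history_fraction))

    # Simplification: this toy starts from the last observed visit only.
    # A recurrent module would instead carry a summary of the whole prefix.
    latent = encode_state(observations[n_history - 1])

    predictions = []
    for t in range(n_history, len(observations)):
        # Closed loop: the latent is advanced from its own previous value;
        # later observations are never fed back in.
        latent = transition(latent, encode_action(actions[t - 1]))
        predictions.append(decode(latent))
    return predictions


# Toy components (hypothetical): eGFR falls by 2 per year, and each unit of
# "action" recovers one unit.
predictions = rollout(
    observations=[60.0, 58.0, 56.0, 54.0],
    actions=[[0.0], [0.0], [1.0]],
    encode_state=lambda x: [x],
    encode_action=lambda a: [sum(a)],
    transition=lambda z, a: [z[0] - 2.0 + a[0]],
    decode=lambda z: z[0],
)
print(predictions)
```

Because the loop never re-encodes a later observation, any error made at one step propagates to the next. That multi-step error is what a rollout-based training protocol would optimise against.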

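Key innovation: The main novelty is the action-conditioned latent world-model design, whose action encoder admits both structured intervention indicators and free-text communication embeddings. The abstract attributes most of the accuracy gain to the dialogue portion of the communication between patient and health coach.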
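One plausible way to feed both kinds of intervention signal to a single encoder is to concatenate them into one wide input vector. The sketch below assumes exactly that; the function name `build_action_input`, the zero-fill rule for visits without any message, and the example dimensions are illustrative assumptions rather than details taken from the paper.

```python
from typing import List, Optional, Sequence


def build_action_input(
    indicators: Sequence[int],
    text_embedding: Optional[Sequence[float]],
    text_dim: int,
) -> List[float]:
    """Concatenate structured indicators and a text embedding into one vector."""
    if text_embedding is None:
        # Assumption: a period with no communication is encoded as zeros.
        text_part = [0.0] * text_dim
    else:
        if len(text_embedding) != text_dim:
            raise ValueError("text embedding has the wrong dimension")
        text_part = [float(v) for v in text_embedding]
    return [float(i) for i in indicators] + text_part


# Three hypothetical binary intervention flags plus a 2-dimensional embedding.
with_text = build_action_input([1, 0, 1], [0.2, -0.1], text_dim=2)
without_text = build_action_input([1, 0, 1], None, text_dim=2)
print(with_text)
print(without_text)
```

Keeping the vector length fixed regardless of whether text is present lets the same encoder weights serve both cases.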

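Key design: The loss has six terms: next-observation supervision, next-latent prediction, SIGReg latent regularisation, and three physiology-aware shape priors (slope, continuity, and a large-jump penalty). According to the authors, this loss design is not CKD-specific and applies to other chronic conditions.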
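The summary names the three shape priors but does not give their formulas. The following sketch shows one possible form for each on a predicted eGFR trajectory. Every definition here is an assumption made for illustration: the squared-error forms, the `max_jump` threshold of 15 units per year, and the function names are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def first_differences(values: Sequence[float]) -> List[float]:
    """Year-to-year changes of a trajectory."""
    return [b - a for a, b in zip(values, values[1:])]


def slope_penalty(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared gap between predicted and observed year-to-year slopes."""
    dp, dt = first_differences(pred), first_differences(target)
    if not dp:
        return 0.0
    return sum((p - t) ** 2 for p, t in zip(dp, dt)) / len(dp)


def continuity_penalty(pred: Sequence[float]) -> float:
    """Mean squared second difference, which penalises abrupt slope changes."""
    d2 = first_differences(first_differences(pred))
    if not d2:
        return 0.0
    return sum(v * v for v in d2) / len(d2)


def large_jump_penalty(pred: Sequence[float], max_jump: float = 15.0) -> float:
    """Squared hinge on year-to-year changes larger than max_jump."""
    d = first_differences(pred)
    if not d:
        return 0.0
    return sum(max(0.0, abs(v) - max_jump) ** 2 for v in d) / len(d)


# A smooth observed decline versus a prediction with one implausible drop.
target = [60.0, 57.0, 54.0, 51.0]
pred = [60.0, 57.0, 40.0, 38.0]

print("slope:", slope_penalty(pred, target))
print("continuity:", continuity_penalty(pred))
print("large jump:", large_jump_penalty(pred))
```

In a full objective, terms like these would be added to the supervision and regularisation terms with their own weights; those weights are not specified in this summary.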

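📊 Experimental Highlights

In the CKD case study, on a 2,232-patient nephrology cohort, CMWM achieves a mean absolute error (MAE) of 7.384 and a root-mean-square error (RMSE) of 10.256 on the dynamic-50% history rollout test, against 7.964 and 11.069 for a tuned GPT-5.5 structured-prompting baseline, i.e. reductions of 7.28% and 7.35%. The gain is dominated by the dialogue portion of the communication between patient and health coach.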
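The reported percentages can be checked against the raw errors. The sketch below uses the standard definitions of MAE and RMSE and computes the relative reduction from the CMWM figures and the baseline figures (7.964 and 11.069) quoted in the original abstract; the helper names are illustrative.

```python
import math
from typing import Sequence


def mae(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def rmse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Root-mean-square error."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


def relative_reduction_pct(model_error: float, baseline_error: float) -> float:
    """Percentage by which the model's error is below the baseline's."""
    return 100.0 * (baseline_error - model_error) / baseline_error


mae_reduction = relative_reduction_pct(7.384, 7.964)
rmse_reduction = relative_reduction_pct(10.256, 11.069)
print(f"MAE reduction:  {mae_reduction:.2f}%")
print(f"RMSE reduction: {rmse_reduction:.2f}%")
```

From these rounded figures the MAE reduction comes out at about 7.28% and the RMSE reduction at about 7.34%. The small gap to the reported 7.35% is presumably because the authors computed the percentage from unrounded errors.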

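🎯 Application Scenarios

Potential applications include chronic-disease management, personalized medicine, and long-term health monitoring. With accurate forecasts of a patient's physiological trajectory, care providers can plan interventions better, which may improve patients' quality of life and treatment outcomes. The framework could also be extended to forecasting and managing other chronic conditions.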

📄 Abstract (Original)

Long-horizon clinical simulation (predicting how a patient's physiology evolves over years under specified interventions) is central to chronic-disease care, yet existing electronic health record (EHR) models are predominantly discriminative, and general-purpose large language models drift under repeated interventions. We propose the ChronoMedicalWorld Model (CMWM), an action-conditioned latent world-model framework for learning patient trajectories from longitudinal care data. CMWM couples a joint-embedding state encoder with a wide action encoder that admits both structured intervention indicators and free-text communication embeddings, and trains a recurrent latent transition module under a six-term objective: next-observation supervision, next-latent prediction, SIGReg latent regularisation, and three physiology-aware shape priors (slope, continuity, large-jump penalty). A closed-loop rollout-prefix protocol matches training to deployment, so the model is optimised against the same multi-step error it exhibits at inference. As a concrete case study, we instantiate CMWM for annual estimated glomerular filtration rate (eGFR) trajectory forecasting in chronic kidney disease (CKD). On a 2,232-patient nephrology cohort, the CKD instantiation achieves a dynamic-50% history rollout test mean absolute error (MAE) of 7.384 and root-mean-square error (RMSE) of 10.256, against 7.964 and 11.069 for a tuned GPT-5.5 structured-prompting baseline (-7.28% MAE, -7.35% RMSE), with the gain dominated by the dialogue portion of the communication between patient and health coach. The framework is not CKD-specific: its architecture, loss design, and training protocol apply to any chronic condition that can be cast as periodic clinical state interleaved with structured and conversational interventions.
