A Reflective Storytelling Agent for Older Adults: Integrating Argumentation Schemes and Argument Mining in LLM-Based Personalised Narratives
Authors: Jayalakshmi Baskar, Vera C. Kaelin, Kaan Kilic, Helena Lindgren
Categories: cs.AI
Published: 2026-05-11
Comments: Submitted to ACM Transactions on Intelligent Systems and Technology (TIST)
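💡 One-Sentence Takeaway
Proposes a reflective storytelling agent that combines argumentation schemes, argument mining, and knowledge graphs to guide and inspect LLM-generated narratives for older adults interacting with a digital companion.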
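🎯 Matched Area: Pillar 9: Embodied Foundation Models
Keywords: large language models, argument mining, digital companionship for older adults, knowledge graphs, narrative generation, explainable AI, user modelling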
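📋 Key Points
- LLM-based narrative generation suffers from hallucinations and limited transparency, which makes it hard to ensure credible and logically consistent narratives in health-oriented companionship for older adults.
- The paper proposes a reflective storytelling agent that combines knowledge graphs, user modelling, and argumentation theory, and uses argument mining to inspect generated narratives and assess their quality.
- In an evaluation with 55 older adults, participants recognised personally relevant purposes in roughly two thirds of narratives. Higher argument-quality indicators tended to co-occur with higher clarity and meaningfulness ratings, while higher hallucination-risk indicators went together with perceived inconsistency.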
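📝 Abstract (Summary)
This study examines whether knowledge-driven, LLM-based storytelling can support purposeful narrative interaction between older adults and a digital companion. To address LLM limitations such as hallucinations and limited transparency, the authors propose a reflective storytelling agent that integrates knowledge graphs, user modelling, argumentation theory, and argument mining to guide and inspect narrative generation. The study has two phases. Phase I used participatory design with 11 domain experts to refine the system iteratively, so that narratives are grounded in structured user models of health-promoting activities and motivations. Phase II had 55 older adults evaluate persona-based narratives across four prompts and two creativity levels. Participants recognised personally relevant purposes in roughly two thirds of narratives. Higher argument-quality indicators tended to co-occur with higher clarity and meaningfulness ratings, while narratives with higher hallucination-risk indicators were more often perceived as inconsistent. The study positions argument mining as a reflective inspection mechanism for comparing formal grounding signals with human evaluations in health-oriented LLM storytelling.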
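🔬 Method Details
Problem definition: When LLMs generate personalised narratives for older adults without external knowledge constraints or a logical inspection mechanism, the output can contain hallucinations, lack coherence, and fail to address the user's specific motivations. The paper targets this problem.
Core idea: The paper introduces a "reflective" design in which argumentation theory provides the logical skeleton for narrative generation, and argument mining inspects the LLM output after generation. The aim is for each narrative to match the user's health model and to have a sound logical structure.
Technical framework: The system architecture has four core modules: (1) a knowledge-graph-based user modelling module that stores health-promoting activities and motivations; (2) a narrative generation module in which the LLM generates content guided by argumentation schemes; (3) an argument mining module that parses the logical structure of the generated text; and (4) a reflective evaluation module that computes hallucination-risk and argument-quality indicators and feeds them back into the loop.
Key innovation: Argument mining, traditionally a text analysis technique, is applied to quality control for generative AI. Comparing the mined argument structure against a formal one places an inspectable, explainable constraint on LLM output instead of relying on probabilistic prediction alone.
Key design: The system was developed iteratively through participatory design, with health-promoting activities as the core anchor of each narrative. Argument-quality indicators and hallucination-risk indicators serve as the quantitative signals for assessing generated narratives.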
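The four-module loop described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class names `UserModel` and `MinedArgument`, the keyword-based `mine_arguments` heuristic, and the two indicator formulas are hypothetical and are not taken from the paper, which does not specify them in this summary. A real system would use a knowledge graph, an LLM, and a trained argument miner in place of these stand-ins.

```python
from dataclasses import dataclass


@dataclass
class UserModel:
    """Hypothetical stand-in for the knowledge-graph user model."""
    activities: set[str]
    motivations: set[str]


@dataclass
class MinedArgument:
    """One premise/conclusion pair recovered from a narrative."""
    premise: str
    conclusion: str


def mine_arguments(narrative: str) -> list[MinedArgument]:
    """Toy miner (assumption): read 'X because Y' as conclusion X, premise Y."""
    mined = []
    for sentence in narrative.split("."):
        if " because " in sentence:
            conclusion, premise = sentence.split(" because ", 1)
            mined.append(MinedArgument(premise.strip(), conclusion.strip()))
    return mined


def inspect_narrative(narrative: str, user: UserModel) -> dict:
    """Reflective check: which mined arguments mention the user model?"""
    arguments = mine_arguments(narrative)
    known_terms = {t.lower() for t in user.activities | user.motivations}
    grounded = [
        a for a in arguments
        if any(t in f"{a.premise} {a.conclusion}".lower() for t in known_terms)
    ]
    # Illustrative formulas only: share of grounded arguments and its complement.
    # A narrative with no mined arguments gets quality 0.0 and risk 1.0.
    quality = len(grounded) / len(arguments) if arguments else 0.0
    return {
        "arguments": len(arguments),
        "grounded": len(grounded),
        "argument_quality": quality,
        "hallucination_risk": 1.0 - quality,
    }


user = UserModel({"gardening"}, {"staying independent"})
story = (
    "Anna goes gardening because she values staying independent. "
    "She flew to Mars because it was Tuesday."
)
report = inspect_narrative(story, user)
print(report)
```

In this toy setup the second sentence yields an argument that mentions nothing in the user model, so it lowers the quality score and raises the risk score. A generator could use such a report to decide whether to regenerate a narrative.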
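📊 Experimental Highlights
In the evaluation with 55 older adults, participants recognised personally relevant purposes in roughly two thirds of the narratives. Higher argument-quality indicators tended to co-occur with higher clarity and meaningfulness ratings, and narratives with higher hallucination-risk indicators were more often perceived as inconsistent. These findings support using argument mining as a reflective inspection mechanism.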
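One simple way to compare a system-computed indicator with human judgements is to split narratives by indicator value and compare how often each group was flagged as inconsistent. The sketch below shows that comparison under stated assumptions: the function name `inconsistency_rate_by_risk`, the threshold, and the records are hypothetical placeholders, not data or analysis code from the study.

```python
def inconsistency_rate_by_risk(records, threshold):
    """Compare how often narratives were flagged as inconsistent.

    records: iterable of (risk_score, flagged_inconsistent) pairs.
    threshold: scores at or above it count as "high risk" (assumed cut-off).
    """
    high = [flag for risk, flag in records if risk >= threshold]
    low = [flag for risk, flag in records if risk < threshold]

    def rate(flags):
        # Share of flagged narratives; None when the group is empty.
        return sum(flags) / len(flags) if flags else None

    return {"high": rate(high), "low": rate(low)}


# Placeholder values for illustration only, not results from the paper.
toy_records = [
    (0.8, True), (0.7, True), (0.9, False),
    (0.2, False), (0.1, False), (0.3, True),
]
rates = inconsistency_rate_by_risk(toy_records, threshold=0.5)
print(rates)
```

A higher rate in the "high" group than in the "low" group would match the pattern the study reports. A rank correlation over the raw scores would be a natural alternative to a fixed threshold.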
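🎯 Application Scenarios
The work primarily targets digital health companionship for older adults, where personalised narratives are meant to support mental well-being and behaviour change. The technical framework could be extended to medical consultation, psychological counselling, and educational support systems, offering a quality-control approach for AI applications that need high credibility, sound logic, and personalised interaction.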
📄 Abstract (Original)
This work investigates whether knowledge-driven large language model (LLM)-based storytelling can support purposeful narrative interaction with a digital companion for older adults. To address known limitations of LLMs, including hallucinations and limited transparency, we present a reflective storytelling agent integrating knowledge graphs, user modelling, argumentation theory, and argument mining to guide and inspect narrative generation. The study consisted of two phases. Phase I employed participatory design involving 11 domain experts in a formative evaluation that informed iterative refinement. The resulting system generates narratives grounded in structured user models representing health-promoting activities and motivations. Phase II involved 55 older adults evaluating persona-based narratives across four prompts and two creativity levels. Participants assessed perceived purpose, usefulness, cultural relatability, and inconsistencies. The system additionally computed hallucination-risk indicators to evaluate generated narratives. Participants recognised personally relevant purposes in roughly two thirds of narratives, while argument-based purposes were identified in around half of these cases. Cultural recognisability strongly influenced willingness to use the functionality, whereas minor inconsistencies were often tolerated when narratives remained understandable and personally relevant. Narratives with higher hallucination-risk indicators were more often perceived as inconsistent, while higher argument-quality indicators tended to co-occur with higher clarity and meaningfulness ratings. Overall, the study positions argument mining as a reflective inspection mechanism for comparing formal grounding signals with human evaluations in health-oriented LLM storytelling for older adults.