Bridging Passive and Active: Enhancing Conversation Starter Recommendation via Active Expression Modeling

📄 arXiv: 2605.05855v1

Authors: Yiqing Wu, Haoming Li, Guanyu Jiang, Jiahao Liang, Yongchun Zhu, Jingwu Chen, Feng Zhang

Categories: cs.IR, cs.CL

Published: 2026-05-07

Comments: Accepted by SIGIR 2026


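💡 One-sentence takeaway

Proposes the PA-Bridge framework, which models users' active expressions to break the feedback loop and echo-chamber effect in conversation starter recommendation.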

🎯 Matched areas: Pillar 4: Generative Motion; Pillar 9: Embodied Foundation Models

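Keywords: conversational search, recommender systems, adversarial learning, distribution alignment, semantic discretization, popularity debiasing, large language models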

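📋 Key points

  1. Conventional conversation starter recommendation relies on a closed "exposure-click" loop, which traps the system in an echo chamber and makes it hard to capture users' dynamic, open-world search intents.
  2. The PA-Bridge framework uses adversarial distribution alignment to bridge the distribution gap between active queries and passively recommended starters, and uses semantic discretization to enable popularity debiasing.
  3. Online A/B tests in an industrial setting show that the method significantly raises the Feature Penetration Rate (by 0.54%) and User Active Days, supporting its effectiveness in practice.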

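📝 Abstract (translated)

Conversational search driven by large language models is shifting information retrieval from reactive keyword matching to proactive, open-ended dialogue. In this setting, Conversation Starters are widely used to provide personalized query recommendations. Conventional recommendation, however, relies on a closed "exposure-click" feedback loop. This traps the system in an echo chamber and, compounded by data sparsity, makes it hard to capture dynamic search intents, so the system drifts toward popular but generic suggestions. This paper proposes a new framework, PA-Bridge, which uses users' active expressions (queries that users type manually) to break this harmful feedback loop. To handle the distribution shift between active queries and preset starters, and the difficulty of computing ID-based statistics over open text, PA-Bridge introduces an adversarial distribution aligner and a semantic discretization module. Online A/B tests show that the method significantly improves the Feature Penetration Rate and User Active Days.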

🔬 Method details

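Problem definition: Existing conversation starter recommenders are confined to the "exposure-click" feedback loop, which homogenizes what they recommend. Two pain points stand out. First, actively typed queries and preset starters follow different distributions. Second, open text has no unique ID, so traditional item-based popularity debiasing algorithms cannot be applied directly.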
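The second pain point can be made concrete with a toy count. The sketch below uses invented logs (the starter IDs and query strings are hypothetical, not data from the paper) to show that exact-match counting works for ID-bearing starters but says almost nothing about free-text queries that share an intent.

```python
from collections import Counter

# Hypothetical logs, invented purely for illustration.
# Preset starters carry stable item IDs, so clicks aggregate cleanly.
starter_clicks = ["s_101", "s_101", "s_207", "s_101"]

# Typed queries are open text: three phrasings of one intent,
# but no two strings are identical.
typed_queries = [
    "best hiking trails near me",
    "good hiking routes nearby",
    "where can I hike close by",
]

starter_popularity = Counter(starter_clicks)
query_popularity = Counter(typed_queries)

# ID-based counting recovers a meaningful popularity signal for starters.
print(starter_popularity.most_common(1))

# String-based counting sees every query as a singleton.
print(max(query_popularity.values()))
```

Every typed query ends up with a count of one, which is why some form of grouping by meaning is needed before popularity statistics become usable.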

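Core idea: Treat users' active queries (active expressions) as a manifestation of "free will" and bring them into recommendation model training, breaking the closed loop of passive recommendation and capturing users' real intents.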

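Technical framework: PA-Bridge has two core modules. The Adversarial Distribution Aligner uses adversarial learning to map the distribution of active queries into the feature space of recommended starters. The Semantic Discretizer converts continuous semantic vectors into discrete units that can be counted.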
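This summary does not give the aligner's exact loss. As a rough illustration only, a generic domain-adversarial objective of the kind commonly used for distribution alignment could be written as follows. All symbols are assumptions introduced here: $f_\theta$ is a shared encoder, $D_\phi$ a discriminator that tries to tell starters $s \sim P_S$ from active queries $q \sim P_A$, $\mathcal{L}_{\mathrm{rec}}$ the recommendation loss, and $\lambda$ a trade-off weight.

```latex
\min_{\theta}\,\max_{\phi}\;
  \mathcal{L}_{\mathrm{rec}}(\theta)
  + \lambda \Big(
      \mathbb{E}_{s \sim P_S}\big[\log D_\phi(f_\theta(s))\big]
    + \mathbb{E}_{q \sim P_A}\big[\log\big(1 - D_\phi(f_\theta(q))\big)\big]
  \Big)
```

The discriminator maximizes the bracketed term by separating the two sources, while the encoder minimizes it by producing features on which the two sources look alike. The paper's actual formulation may differ.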

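Key innovation: The work is the first to use users' actively typed open text as a training signal. It addresses the resulting distribution shift with adversarial learning, and uses semantic discretization to make popularity debiasing feasible in large-scale industrial streaming training.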

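Key design: An adversarial training loss minimizes the distribution gap between active queries and recommended starters. Semantic discretization uses clustering or quantization to map high-dimensional semantics to discrete IDs, which supports efficient popularity computation and debiasing strategies.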
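A minimal sketch of the discretize-then-debias idea, under stated assumptions: the codebook is fixed and hand-picked here (in practice it would presumably be learned by clustering or vector quantization), the embeddings are toy 2-D vectors, and inverse-frequency weighting stands in for whatever debiasing strategy the paper actually uses. The function names `assign_codes` and `popularity_weights` are hypothetical.

```python
import numpy as np


def assign_codes(embeddings: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Map each embedding to the index of its nearest codebook vector (L2)."""
    # Broadcast (n, 1, d) against (1, k, d) to get (n, k) squared distances.
    diffs = embeddings[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    return np.argmin(dists, axis=1)


def popularity_weights(codes: np.ndarray, num_codes: int, alpha: float = 0.5) -> np.ndarray:
    """Per-example weights proportional to (code frequency)^(-alpha), mean-normalized to 1."""
    counts = np.bincount(codes, minlength=num_codes).astype(float)
    freq = counts / counts.sum()
    # Only frequencies of observed codes are indexed, so none of them is zero.
    raw = freq[codes] ** (-alpha)
    return raw / raw.mean()


# Toy data: three queries near one centroid, one near the other.
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
embeddings = np.array([[0.1, -0.2], [0.3, 0.1], [-0.1, 0.0], [9.8, 10.1]])

codes = assign_codes(embeddings, codebook)          # discrete semantic IDs
weights = popularity_weights(codes, num_codes=2)    # rarer codes get larger weights

print(codes)
print(weights)
```

Once every query has a discrete code, popularity reduces to a count per code, which can be maintained incrementally in a streaming setting. The exponent `alpha` controls how strongly popular codes are down-weighted.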

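📊 Experimental highlights

A/B tests on a real industrial conversational search platform show that PA-Bridge performs well. Compared with the baseline model, it raises the Feature Penetration Rate by 0.54% and increases User Active Days, indicating a clear advantage in coping with data sparsity and breaking the feedback loop.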

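🎯 Application scenarios

The work applies to LLM-driven conversational search systems, intelligent assistants, and personalized recommendation platforms. By bringing in users' active expressions, a system can understand user intent more precisely and improve the quality and diversity of conversation starters, which has clear industrial value for user retention and engagement.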

📄 Abstract (original)

Large Language Model (LLM)-driven conversational search is shifting information retrieval from reactive keyword matching to proactive, open-ended dialogues. In this context, Conversation Starters are widely deployed to provide personalized query recommendations that help users initiate dialogues. Conventionally, recommending these starters relies on a closed "exposure-click" loop. Yet, this feedback loop mechanism traps the system in an echo chamber where, compounded by data sparsity, it fails to capture the dynamic nature of conversational search intents shaped by the open world. As a result, the system skews towards popular but generic suggestions. In this work, we uncover an untapped paradigm shift to shatter this harmful feedback loop: harnessing user "free will" through active user expressions. Unlike traditional recommendations, conversational search empowers users to bypass menus entirely through manually typed queries. The open-world intents in active queries hold the key to breaking this loop. However, incorporating them is non-trivial: (1) there exists an inherent distribution shift between active queries and formulated starters; (2) furthermore, the "non-ID-able" nature of open text renders traditional item-based popularity statistics ineffective for large-scale industrial streaming training. To this end, we propose Passive-Active Bridge (PA-Bridge), a novel framework that employs an adversarial distribution aligner to bridge the distributional gap between passively recommended starters and active expressions. Moreover, we introduce a semantic discretizer to enable the deployment of popularity debiasing algorithms. Online A/B tests on our platform demonstrate that PA-Bridge significantly boosts the Feature Penetration Rate by 0.54% and User Active Days