ToolChoiceConfusion: Causal Minimal Tool Filtering for Reliable LLM Agents

Authors: Rahul Suresh Babu, Laxmipriya Ganesh Iyer

Category: cs.AI

Published: 2026-06-04

💡 One-Sentence Takeaway

Proposes Causal Minimal Tool Filtering (CMTF) to address tool-choice confusion in LLM agents.

🎯 Matched Area: Pillar 9: Embodied Foundation Models

Keywords: tool selection, causal reasoning, large language models, efficiency, automated systems

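📋 Key Points

Existing tool-selection methods lean too heavily on semantic relevance, which leads to wrong-tool calls and inefficiency.
Causal Minimal Tool Filtering (CMTF) selects tools by causal sufficiency, so the agent is not shown tools it does not need.
On a 102-task benchmark, CMTF matches the success rate of the strongest causal baseline while sharply reducing the number of visible tools and token usage.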

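📝 Abstract (Translated from Chinese)

Large language model agents increasingly rely on external tools, but larger tool menus can degrade reliability and efficiency by increasing wrong-tool calls, premature actions, and token cost. Existing tool-selection methods typically optimize semantic relevance, exposing tools that match the user request. We argue that relevance alone is insufficient: a tool may be related to the task yet unnecessary or premature at the current step. We propose Causal Minimal Tool Filtering (CMTF), a training-free method that selects tools by causal sufficiency. CMTF uses lightweight precondition-effect contracts to expose only the minimal next-step tools needed to advance from the current state toward the user goal. Across multi-step tool-use tasks, CMTF matches the strongest causal baseline in task success while reducing visible tools from 100 to one per step and cutting token usage by about 90% relative to all-tools exposure.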

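🔬 Method Details

Problem definition: The paper addresses the confusion LLM agents face when choosing among tools. Existing methods tend to rely on semantic relevance, which leads to poor tool choices, more wrong-tool calls, and higher token cost.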
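To make the problem concrete, here is a minimal sketch of relevance-only selection. The tool menu, the `Tool` class, and the `keyword_retrieve` function are hypothetical illustrations, not the paper's keyword-retrieval baseline. The sketch shows how word overlap can surface a tool that is related to the request but premature at the first step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    description: str


# Hypothetical tool menu for an order-refund task (not taken from the paper).
TOOLS = [
    Tool("lookup_order", "Find an order by its order id"),
    Tool("check_refund_eligibility", "Check whether an order qualifies for a refund"),
    Tool("issue_refund", "Issue a refund for an order"),
    Tool("get_weather", "Return the weather forecast for a city"),
]


def keyword_retrieve(request: str, tools: list[Tool]) -> list[str]:
    """Return names of tools whose name or description shares a word with the request."""
    words = set(request.lower().split())
    hits = []
    for tool in tools:
        text = (tool.name.replace("_", " ") + " " + tool.description).lower()
        if words & set(text.split()):
            hits.append(tool.name)
    return hits


# The request matches all three refund-related tools, including issue_refund,
# even though no order has been looked up or checked yet.
visible = keyword_retrieve("refund my order", TOOLS)
print(visible)
```

In this toy menu the unrelated `get_weather` tool is filtered out, but `issue_refund` is exposed from the very first step, which is the kind of premature exposure the paper argues relevance alone cannot prevent.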

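Core idea: Causal Minimal Tool Filtering (CMTF) selects tools by causal sufficiency, exposing only the minimal set of tools needed at the current step so that unnecessary tools do not distract the agent.

Technical framework: CMTF comprises three parts: constructing precondition-effect contracts, causally analyzing which tools can advance the task, and exposing the minimal tool set given the current state. The pipeline is designed to keep tool selection efficient and accurate.

Key innovation: CMTF uses causal sufficiency as the selection criterion, in contrast to conventional semantic-relevance methods, with the aim of making tool selection more reliable.

Key design: CMTF relies on lightweight precondition-effect contracts to keep tool selection computationally cheap and accurate. The abstract does not give the specific parameter settings or the contract design; these remain to be confirmed.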
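Since the abstract does not specify the contract format, the following is only a rough sketch of how precondition-effect contracts could yield a one-tool frontier. It assumes contracts are sets of string facts and swaps in a plain breadth-first search to find a shortest plan; the `Contract` class, the fact names, and both functions are hypothetical and should not be read as the authors' implementation.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """Hypothetical precondition-effect contract for one tool."""
    name: str
    preconditions: frozenset[str]  # facts that must hold before the call
    effects: frozenset[str]        # facts that hold after the call


def shortest_plan(state, goal, contracts):
    """Breadth-first search for a shortest tool sequence from state to goal."""
    start = frozenset(state)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        facts, plan = queue.popleft()
        if goal <= facts:
            return plan
        for c in contracts:
            if c.preconditions <= facts:
                nxt = facts | c.effects
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, plan + [c.name]))
    return None  # goal is unreachable with these contracts


def next_step_frontier(state, goal, contracts):
    """Expose only the first tool of a shortest plan (empty if done or unreachable)."""
    plan = shortest_plan(state, frozenset(goal), contracts)
    return plan[:1] if plan else []


# Assumed example contracts for an order-refund task.
CONTRACTS = [
    Contract("lookup_order", frozenset({"order_id_known"}), frozenset({"order_loaded"})),
    Contract("check_refund_eligibility", frozenset({"order_loaded"}),
             frozenset({"eligibility_confirmed"})),
    Contract("issue_refund", frozenset({"order_loaded", "eligibility_confirmed"}),
             frozenset({"refund_issued"})),
    Contract("get_weather", frozenset(), frozenset({"weather_known"})),
]

goal = {"refund_issued"}
step1 = next_step_frontier({"order_id_known"}, goal, CONTRACTS)
step2 = next_step_frontier({"order_id_known", "order_loaded"}, goal, CONTRACTS)
print(step1, step2)
```

Under these assumptions the agent sees one tool per step: `issue_refund` stays hidden until both of its preconditions hold, and `get_weather` is never exposed because it does not appear on any shortest path to the goal.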

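📊 Experimental Highlights

In the main benchmark (102 tasks, 100 tools, four LLM backends, 2448 task-method-model runs), CMTF matches the aggregate success rate of the strongest causal baseline while reducing visible tools from 100 to one per step and cutting token usage by about 90% relative to all-tools exposure. The gain is in efficiency rather than in success rate.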

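🎯 Application Scenarios

The approach could apply wherever agents must choose tools efficiently, such as automated customer service, intelligent assistants, and complex task management. More reliable tool selection may improve user experience and system efficiency, and could inform how future agents are designed and built.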

📄 Abstract (Original)

Large language model agents increasingly rely on external tools, but larger tool menus can reduce reliability and efficiency by increasing wrong-tool calls, premature actions, and token cost. Existing tool-selection methods often optimize semantic relevance, exposing tools whose names or descriptions match the user request. We argue that relevance is insufficient: a tool may be related to the task while still being unnecessary or premature at the current step. We propose Causal Minimal Tool Filtering (CMTF), a training-free method that selects tools by causal sufficiency. CMTF uses lightweight precondition-effect contracts to expose only the minimal next-step tool frontier needed to advance from the current state toward the user goal. Across multi-step tool-use tasks, we compare CMTF with all-tools exposure, keyword retrieval, state-aware filtering, and causal-path ablations, measuring task success, wrong-tool calls, premature actions, tool exposure, and token cost. In the main benchmark with 102 tasks, 100 tools, four LLM backends, and 2448 task-method-model runs, CMTF matches the strongest causal baseline in aggregate success while reducing visible tools from 100 to one per step and reducing token usage by about 90% relative to all-tools exposure.
