ToolWeaver: Weaving Collaborative Semantics for Scalable Tool Use in Large Language Models

Authors: Bowen Fang, Wen Ye, Yunyue Su, Jinghao Zhang, Qiang Liu, Yesheng Liu, Xin Sun, Shu Wu, Jiabing Yang, Baole Wei, Liang Wang

Categories: cs.AI

Published: 2026-01-29

Comments: 10 pages, 12 figures, accepted to ICLR 2026

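💡 One-Sentence Summary

ToolWeaver: scalable tool use in large language models through weaving collaborative semantics

🎯 Matched Area: Pillar 9: Embodied Foundation Models

Keywords: large language models, tool use, generative learning, hierarchical encoding, collaborative semantics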

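📋 Key Points

Existing retrieval-based tool-use methods struggle to capture complex semantics, and LLMs lack intrinsic tool knowledge, which limits how effectively tools are used.
ToolWeaver encodes tools as hierarchical sequences, so vocabulary expansion grows logarithmically with the number of tools and the model can learn collaborative patterns among tools.
Experiments on nearly 47,000 tools show that ToolWeaver significantly outperforms existing methods, improving the scalability and semantic awareness of tool use.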

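📝 Abstract (Translated)

Existing retrieval-based tool-use pipelines face a dual semantic challenge: the encoders used by their retrievers struggle to capture complex semantics, while the large language model (LLM) itself lacks intrinsic tool knowledge because it was pretrained on natural language. Generative methods offer a powerful alternative by unifying selection and execution: they require the LLM to learn and generate tool identifiers directly. However, the common practice of mapping each tool to a unique new token has substantial limitations. It creates a scalability and generalization crisis, because the vocabulary grows explosively and every tool is assigned a semantically isolated token. It also creates a semantic bottleneck that hinders the learning of collaborative tool relationships, because the model must infer them from the sparse co-occurrence of monolithic tool IDs in a vast library. To address these limitations, we propose ToolWeaver, a novel generative tool-learning framework that encodes tools as hierarchical sequences. This makes vocabulary expansion logarithmic in the number of tools. Crucially, it lets the model learn collaborative patterns from the dense co-occurrence of shared codes rather than from the sparse co-occurrence of monolithic tool IDs. We generate these structured codes through a novel tokenization process designed to weave together a tool's intrinsic semantics with its extrinsic co-usage patterns. The structured codes are then integrated into the LLM through a generative alignment stage, in which the model is fine-tuned to produce the hierarchical code sequences. Evaluation on nearly 47,000 tools shows that ToolWeaver significantly outperforms state-of-the-art methods, establishing a more scalable, generalizable, and semantically aware foundation for advanced tool-augmented agents.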

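🔬 Method Details

Problem definition: Both retrieval-based tool use and naive generative tool use have shortcomings. Retrieval-based methods struggle to capture complex semantics, while naive generative methods (such as assigning one new token to each tool) cause the vocabulary to explode, scale poorly, and cannot effectively learn collaborative relationships between tools.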

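Core idea: ToolWeaver encodes each tool as a hierarchical sequence instead of assigning it a standalone token. With this encoding the vocabulary grows far more slowly (logarithmically in the number of tools), and the model can learn collaborative patterns between tools from the dense co-occurrence of shared codes, which overcomes the limitations of existing methods.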
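The logarithmic claim can be made concrete with a little arithmetic. The sketch below is only an illustration under stated assumptions: it assumes every level of the hierarchy has its own fixed-size set of code tokens, and the codebook size of 256 is a hypothetical value, not one taken from the paper.

```python
def flat_vocab_size(num_tools: int) -> int:
    """One new token per tool: the added vocabulary grows linearly."""
    return num_tools


def levels_needed(num_tools: int, codebook_size: int) -> int:
    """Smallest depth L such that codebook_size ** L >= num_tools."""
    if codebook_size < 2:
        raise ValueError("codebook_size must be at least 2")
    levels, capacity = 1, codebook_size
    while capacity < num_tools:
        capacity *= codebook_size
        levels += 1
    return levels


def hierarchical_vocab_size(num_tools: int, codebook_size: int) -> int:
    """Added tokens when each level has its own codebook (an assumption)."""
    return codebook_size * levels_needed(num_tools, codebook_size)


if __name__ == "__main__":
    # codebook_size=256 is a hypothetical choice for illustration only.
    for n in (1_000, 47_000, 1_000_000):
        print(n, flat_vocab_size(n), hierarchical_vocab_size(n, codebook_size=256))
```

Under these assumptions, a library of 47,000 tools needs 47,000 new tokens in the flat scheme but only two levels of 256 codes (512 new tokens) in the hierarchical one, and growing the library to a million tools adds just one more level.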

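Technical framework: ToolWeaver has two main stages. 1) Structured code generation: a novel tokenization process weaves together each tool's intrinsic semantics and its extrinsic co-usage patterns to produce hierarchical code sequences. 2) Generative alignment: the structured codes are integrated into the LLM, and the model is fine-tuned to generate these hierarchical code sequences.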

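Key innovation: The key innovation is the hierarchical tool encoding and the tokenization process that produces it. The encoding avoids vocabulary explosion and also lets the model learn collaborative relationships between tools more effectively. Whereas existing methods assign each tool an isolated token, ToolWeaver represents relationships between tools through the dense co-occurrence of shared codes.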
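The contrast between sparse and dense co-occurrence can be seen on a toy example. The tool names, two-level codes, and usage traces below are all hypothetical and hand-written; in the paper the codes come from a learned tokenization process, which this sketch does not reproduce.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: tool name -> two-level code (coarse, fine).
TOOL_CODES = {
    "get_weather": (0, 0),
    "get_forecast": (0, 1),
    "book_flight": (1, 0),
    "book_hotel": (1, 1),
}

# Hypothetical usage traces: tools called together to answer one query.
TRACES = [
    ["get_weather", "book_flight"],
    ["get_forecast", "book_hotel"],
    ["get_weather", "book_hotel"],
]


def cooccurrence(traces, key):
    """Count unordered pairs of key(tool) values that appear in the same trace."""
    counts = Counter()
    for trace in traces:
        items = sorted({key(tool) for tool in trace})
        counts.update(combinations(items, 2))
    return counts


# Monolithic IDs: every observed pair is distinct and seen only once.
tool_pairs = cooccurrence(TRACES, key=lambda tool: tool)
# Shared coarse codes: the same pair of codes is observed in every trace.
coarse_pairs = cooccurrence(TRACES, key=lambda tool: TOOL_CODES[tool][0])

if __name__ == "__main__":
    print(tool_pairs)
    print(coarse_pairs)
```

In this toy setting each tool-ID pair is observed once, while the single coarse-code pair is observed three times, which is the kind of denser signal the summary attributes to shared codes.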

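Key design: The key design questions in ToolWeaver are: 1) The structure of the hierarchical codes: how the hierarchy is laid out, for example what each level represents and how many levels to use. 2) The tokenization process: how a tool's intrinsic semantics and extrinsic co-usage patterns are converted into hierarchical code sequences. 3) The fine-tuning strategy for the generative alignment stage: which loss function and optimizer to use, and how to design the training data, so that the LLM learns to generate the hierarchical code sequences effectively.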
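As a rough picture of what training data for the alignment stage might look like, the sketch below renders a hierarchical code as a sequence of level-tagged tokens and pairs it with a query. The token format `<L{level}_{index}>`, the helper names, and the prompt/target layout are all assumptions made for illustration; the summary does not state the paper's actual format.

```python
def code_tokens(code):
    """Render a code such as (3, 17) as level-tagged token strings."""
    return [f"<L{level}_{index}>" for level, index in enumerate(code)]


def new_vocabulary(num_levels, codebook_size):
    """All code tokens that would be added to the tokenizer (hypothetical format)."""
    return [
        f"<L{level}_{index}>"
        for level in range(num_levels)
        for index in range(codebook_size)
    ]


def make_example(query, code):
    """One fine-tuning pair: the model reads the query and emits the code sequence."""
    return {"prompt": query, "target": "".join(code_tokens(code))}


if __name__ == "__main__":
    # The query and the code (3, 17) are made-up values.
    print(make_example("What is the weather in Paris tomorrow?", (3, 17)))
    print(len(new_vocabulary(num_levels=2, codebook_size=256)))
```

Because tools that share a prefix share their leading tokens in this layout, the model sees the same code tokens across many related tools rather than one isolated token per tool.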

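📊 Experimental Highlights

In an evaluation covering nearly 47,000 tools, ToolWeaver significantly outperforms existing state-of-the-art methods. The results indicate that it learns collaborative relationships between tools more effectively and scales and generalizes better. Specific performance figures (for example, percentage accuracy gains on particular tasks) are not given in this summary and must be looked up in the paper.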

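🎯 Application Scenarios

ToolWeaver has broad potential applications in building smarter, more capable tool-augmented agents. In software development, for example, it could help developers automatically select and combine APIs to improve productivity. In smart homes, it could be used to control a variety of smart devices for a more intelligent home experience. It could also be applied to robot control, process automation, and similar areas.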

📄 Abstract (Original)

Prevalent retrieval-based tool-use pipelines struggle with a dual semantic challenge: their retrievers often employ encoders that fail to capture complex semantics, while the Large Language Model (LLM) itself lacks intrinsic tool knowledge from its natural language pretraining. Generative methods offer a powerful alternative by unifying selection and execution, tasking the LLM to directly learn and generate tool identifiers. However, the common practice of mapping each tool to a unique new token introduces substantial limitations: it creates a scalability and generalization crisis, as the vocabulary size explodes and each tool is assigned a semantically isolated token. This approach also creates a semantic bottleneck that hinders the learning of collaborative tool relationships, as the model must infer them from sparse co-occurrences of monolithic tool IDs within a vast library. To address these limitations, we propose ToolWeaver, a novel generative tool learning framework that encodes tools into hierarchical sequences. This approach makes vocabulary expansion logarithmic to the number of tools. Crucially, it enables the model to learn collaborative patterns from the dense co-occurrence of shared codes, rather than the sparse co-occurrence of monolithic tool IDs. We generate these structured codes through a novel tokenization process designed to weave together a tool's intrinsic semantics with its extrinsic co-usage patterns. These structured codes are then integrated into the LLM through a generative alignment stage, where the model is fine-tuned to produce the hierarchical code sequences. Evaluation results with nearly 47,000 tools show that ToolWeaver significantly outperforms state-of-the-art methods, establishing a more scalable, generalizable, and semantically-aware foundation for advanced tool-augmented agents.
