SkillRAE: Agent Skill-Based Context Compilation for Retrieval-Augmented Execution
Authors: Xiangcheng Meng, Shu Wang, Yixiang Fang
Category: cs.CL
Published: 2026-05-11
💡 One-Sentence Takeaway
Proposes SkillRAE, a framework that improves Retrieval-Augmented Execution (RAE) through skill-based context compilation.
🎯 Matched area: Pillar 9: Embodied Foundation Models
Keywords: large language models, retrieval-augmented execution, agents, skill libraries, context compilation, knowledge graphs
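📋 Key Points
- Existing RAE methods concentrate on retrieval and execution and do little to organize the retrieved skill evidence, so the resulting context is redundant and hard for downstream agents to use directly.
- SkillRAE builds a multi-level skill graph and applies "rescue-aware" compact compilation to turn scattered skill evidence into a high-quality, task-specific context.
- On the SkillsBench benchmark, SkillRAE improves on the current state-of-the-art method by 11.7%, which indicates that the context compilation strategy matters for agent execution.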
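📝 Abstract (translated from Chinese)
Agents based on Large Language Models (LLMs) increasingly rely on reusable skill libraries to handle document-centric workflows and data-intensive analysis tasks. As these libraries grow, Retrieval-Augmented Execution (RAE) has become a key research direction; its pipeline typically consists of skill retrieval, context compilation, and task execution. Existing work focuses mostly on retrieval and execution and overlooks how to organize the retrieved skill evidence into a context that is compact, grounded, and directly usable. To fill this gap, the paper proposes SkillRAE, a skill-based context compilation approach with an offline stage and an online stage. The offline stage builds a multi-level skill graph over skill communities, skills, and reusable subunits. The online stage uses graph retrieval and "rescue-aware" compact compilation to turn a coarse-ranked skill set into a task-specific context. Experiments on two public benchmarks show gains over the baselines; on SkillsBench, SkillRAE improves on the existing SOTA method by 11.7%. Ablation studies confirm that context compilation plays a central role in task execution quality.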
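🔬 Method Details
Problem definition: The paper addresses the mismatch between the skill evidence an LLM agent retrieves and what downstream execution actually needs on complex tasks. Existing methods tend to stack retrieval results directly, which yields context that is redundant and lacks logical structure, limiting the agent's efficiency and accuracy.
Core idea: Context construction is treated as a "compilation" process rather than simple concatenation. A structured skill graph decomposes skills into reusable subunits, and a rescue-aware mechanism selects the key evidence, so that the context is both compact and well grounded for execution.
Technical framework: The method has an offline stage and an online stage. The offline stage builds a multi-level skill graph that captures the hierarchy among communities, skills, and subunits. The online stage proceeds in two steps: graph-based, skill-ranked retrieval, followed by Rescue-aware Compact Compilation to refine the evidence.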
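The two-stage framework described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the `Community`, `Skill`, and `Subunit` classes, the `STOPWORDS` set, the token-overlap score, and the `retrieve` function are all hypothetical names. The summary does not specify how the graph is built or how skills are ranked, so a plain three-level hierarchy and lexical overlap stand in for both.

```python
from dataclasses import dataclass, field

# Hypothetical stopword list so that filler words do not count as matches.
STOPWORDS = {"a", "and", "the", "from", "into", "for", "in", "as", "of"}


@dataclass
class Subunit:
    """A reusable piece of a skill (assumed granularity: one step)."""
    name: str
    text: str


@dataclass
class Skill:
    name: str
    description: str
    subunits: list[Subunit] = field(default_factory=list)


@dataclass
class Community:
    name: str
    skills: list[Skill] = field(default_factory=list)


def tokens(text: str) -> set[str]:
    """Lowercase whitespace tokens with stopwords removed."""
    return set(text.lower().split()) - STOPWORDS


def overlap(query: str, text: str) -> int:
    """Toy relevance score: number of shared tokens (stand-in for a real ranker)."""
    return len(tokens(query) & tokens(text))


def retrieve(graph: list[Community], query: str, top_k: int = 2):
    """Rank skills across communities, then export only the matching subunits."""
    scored = []
    for community in graph:
        for skill in community.skills:
            text = " ".join([skill.description] + [u.text for u in skill.subunits])
            scored.append((overlap(query, text), community.name, skill))
    scored.sort(key=lambda item: item[0], reverse=True)

    results = []
    for score, community_name, skill in scored[:top_k]:
        if score == 0:
            continue  # skip skills that share nothing with the query
        evidence = [u.name for u in skill.subunits if overlap(query, u.text) > 0]
        results.append((community_name, skill.name, evidence))
    return results


# Offline stage (toy): a hand-built three-level graph.
GRAPH = [
    Community("documents", [
        Skill("pdf_tables", "extract tables from pdf files", [
            Subunit("open_pdf", "open pdf file and list pages"),
            Subunit("parse_table", "parse table rows into csv"),
        ]),
        Skill("docx_edit", "edit word docx documents", [
            Subunit("replace_text", "replace text in docx paragraphs"),
        ]),
    ]),
    Community("analysis", [
        Skill("csv_stats", "compute summary statistics for csv data", [
            Subunit("load_csv", "load csv into a dataframe"),
            Subunit("describe", "compute mean and median statistics"),
        ]),
    ]),
]

# Online stage (toy): skill-ranked retrieval with subunit evidence export.
RESULTS = retrieve(GRAPH, "extract a table from a pdf and save as csv")
for community_name, skill_name, evidence in RESULTS:
    print(community_name, skill_name, evidence)
```

In this sketch the subunit level is what keeps the exported evidence small: `csv_stats` is retrieved, but only its `load_csv` subunit is passed on, since `describe` shares no terms with the query.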
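Key innovation: The main contribution is the "context compilation" paradigm, which converts raw retrieved evidence into a compact, task-specific set of instructions. Unlike conventional direct prompt augmentation, the method models skill dependencies explicitly through the graph structure, which makes the evidence more usable.
Key design: The design has two main parts. A multi-level graph index captures the semantic relationships among skills. A rescue-aware compilation algorithm assesses how necessary and how redundant each piece of evidence is, dynamically drops irrelevant information, and keeps the core logic fragments that are essential to executing the task.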
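One possible reading of the necessity and redundancy assessment described above is sketched here. It is a loud assumption rather than the paper's algorithm: the summary does not define "rescue-aware", so this sketch interprets it as a second pass that restores dropped evidence when a required task term would otherwise be uncovered. The function `compile_context`, its parameters `budget` and `max_sim`, and the Jaccard redundancy test are all hypothetical.

```python
def words(text: str) -> set[str]:
    """Lowercase whitespace tokens."""
    return set(text.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Set similarity in [0, 1]; used here as a toy redundancy measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def compile_context(snippets, required_terms, budget=2, max_sim=0.6):
    """Compile (score, text) evidence into a compact list of texts.

    Pass 1 keeps the highest-scoring, non-redundant snippets up to `budget`.
    Pass 2 (the assumed "rescue" step) restores a dropped snippet for each
    required term that pass 1 left uncovered, even if that exceeds `budget`.
    """
    kept, dropped = [], []
    for _score, text in sorted(snippets, key=lambda s: s[0], reverse=True):
        redundant = any(jaccard(words(text), words(k)) > max_sim for k in kept)
        if len(kept) < budget and not redundant:
            kept.append(text)
        else:
            dropped.append(text)

    for term in required_terms:
        if any(term in words(k) for k in kept):
            continue  # term already covered by compact context
        for text in dropped:
            if term in words(text):
                kept.append(text)      # rescue the first dropped snippet covering it
                dropped.remove(text)
                break
    return kept


SNIPPETS = [
    (0.90, "open the pdf with a reader and list pages"),
    (0.85, "open the pdf with a reader and list all pages"),  # near-duplicate
    (0.70, "parse each table into rows"),
    (0.30, "write rows to a csv file"),  # low score but needed for "csv"
]

CONTEXT = compile_context(SNIPPETS, required_terms=["pdf", "table", "csv"])
for line in CONTEXT:
    print(line)
```

The design choice worth noting is that the rescue pass is allowed to exceed the budget: in this reading, losing evidence for a required term is treated as worse than a slightly longer context. Whether SkillRAE makes the same trade-off is not stated in the summary.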
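📊 Experimental Highlights
SkillRAE is evaluated on two public benchmarks and improves on the baselines; on SkillsBench, it outperforms the existing SOTA method by 11.7%. Ablation studies indicate that context compilation is the key factor behind the gain and that the improvement does not come from a mere prompt addition, which supports the role of structured evidence organization in complex task execution.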
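🎯 Application Scenarios
The work targets automated agent systems that call into large tool or skill libraries, such as enterprise document-processing workflows, automated data analysis platforms, and software engineering assistants. By making the context more compact and better grounded, the method may improve the stability of an agent's reasoning and execution on long, multi-step tasks, which suggests potential value for industrial use.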
📄 Abstract (Original)
Large Language Model (LLM)-based agents (e.g., OpenClaw) increasingly rely on reusable skill libraries to solve artifact-rich tasks such as document-centric workflows and data-intensive analysis. As these libraries grow, a few works have attempted to study the Retrieval-Augmented Execution (RAE), which often first retrieves some external skills and other knowledge, then compiles the context using retrieved skills, and finally executes the task. Existing works mainly focus on optimizing skill retrieval and task execution, and they pay little attention to how to effectively organize the selected skill evidence in a form that is compact, grounded, and immediately usable for the downstream executors to complete tasks. To fill this gap, we propose SkillRAE, a two-stage RAE approach focusing on skill-based context compilation, which consists of the offline and online stages. Specifically, in the offline indexing stage, it builds a multi-level skill graph over skill communities, skills, and reusable subunits, for capturing their relationships. In the online retrieval stage, it first performs skill-ranked retrieval with selected-subunit evidence export in the graph, and then applies rescue-aware compact compilation to recover the key evidence. Together, these components compile a coarse-ranked skill set into a task-specific context that is compact, grounded, and immediately usable. Experiments on two public benchmarks show that SkillRAE achieves a significant improvement over baselines for RAE. For example, on SkillsBench, it achieves an improvement of 11.7% over the SOTA method. Ablation studies further show that our context compilation is crucial, instead of a mere prompt addition.