FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion

Authors: Ziyi Yang, Fanqi Wan, Longguang Zhong, Canbin Huang, Guosheng Liang, Xiaojun Quan

Category: cs.CL

Published: 2025-03-06

Comments: Technical report

🔗 Code/Project: https://github.com/SLIT-AI/FuseChat-3.0

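💡 One-Sentence Summary

FuseChat-3.0 combines heterogeneous LLM fusion with preference optimization to improve the performance of compact target models.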

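🎯 Matched areas: Pillar 2: RL Algorithms and Architecture (RL & Architecture); Pillar 9: Embodied Foundation Models

Keywords: large language models, model fusion, preference optimization, knowledge transfer, instruction following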

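📋 Key Points

Existing small language models underperform on complex tasks and struggle to make full use of the knowledge and capabilities of large models.
FuseChat-3.0 fuses the strengths of heterogeneous large language models and applies preference optimization to improve small target models.
With Llama-3.1-8B-Instruct as the target model, experiments show an average gain of 6.8 points across 14 benchmarks, with the largest gains on instruction following (37.1 points on AlpacaEval-2 and 30.1 points on Arena-Hard).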

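📝 Abstract (Summary)

FuseChat-3.0 is a suite of large language models (LLMs) developed by integrating the strengths of heterogeneous source LLMs into more compact target LLMs. The source models are Gemma-2-27B-it, Mistral-Large-Instruct-2407, Qwen-2.5-72B-Instruct, and Llama-3.1-70B-Instruct. The target models are three widely used smaller variants (Llama-3.1-8B-Instruct, Gemma-2-9B-it, and Qwen-2.5-7B-Instruct) and two ultra-compact options (Llama-3.2-3B-Instruct and Llama-3.2-1B-Instruct). To exploit the diverse capabilities of the source models, the authors develop a specialized data construction protocol tailored to various tasks and domains. The training pipeline has two key stages: (1) supervised fine-tuning (SFT), which aligns the target and source model distributions, and (2) Direct Preference Optimization (DPO), which applies preferences from multiple source LLMs to fine-tune the target model. The resulting FuseChat-3.0 models show significant gains on instruction following, general knowledge, mathematics, and coding.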

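🔬 Method Details

Problem definition: Existing small language models lag behind large models on instruction following, general knowledge, mathematics, and coding. The challenge is to transfer the knowledge and capabilities of large models into small ones while keeping the small models compact and efficient. Existing methods may fail to exploit the strengths of different large models, or may introduce noise and bias during transfer.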

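Core idea: FuseChat-3.0 fuses the strengths of multiple heterogeneous large language models and transfers them to a small target model using Direct Preference Optimization (DPO). A specialized data construction protocol and a two-stage training pipeline align the target model with the source model distributions and teach it the source models' preferences.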

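Technical framework: The FuseChat-3.0 training pipeline has two main stages. 1) Supervised fine-tuning (SFT): the target model is fine-tuned on a specially constructed dataset so that its distribution aligns with that of the source models. 2) Direct Preference Optimization (DPO): preference information from multiple source models is used to fine-tune the target model further, so that it follows instructions better and produces higher-quality outputs.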
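The two stages above consume different kinds of data: SFT needs prompt/response examples, while DPO needs chosen/rejected pairs. The following is a minimal sketch of how candidate responses from several source models might be turned into both, not the authors' protocol. The `Candidate` type, the `score` field (for example, from some external quality scorer), and the best/worst selection rule are all assumptions for illustration; the summary above does not say how preferences are derived.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    source_model: str  # which source LLM produced the response
    response: str
    score: float       # hypothetical quality score; not specified in the source


def build_sft_example(prompt: str, candidates: List[Candidate]) -> Dict[str, str]:
    """Stage 1 data (assumed rule): pair the prompt with the top-scoring response."""
    best = max(candidates, key=lambda c: c.score)
    return {"prompt": prompt, "response": best.response}


def build_preference_pair(prompt: str, candidates: List[Candidate]) -> Dict[str, str]:
    """Stage 2 data (assumed rule): top-scoring response is chosen, lowest is rejected."""
    if len(candidates) < 2:
        raise ValueError("need at least two candidates to form a preference pair")
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    return {
        "prompt": prompt,
        "chosen": ranked[0].response,
        "rejected": ranked[-1].response,
    }


if __name__ == "__main__":
    # Placeholder model names and scores, purely illustrative.
    cands = [
        Candidate("source_a", "ok answer", 0.4),
        Candidate("source_b", "best answer", 0.9),
        Candidate("source_c", "weak answer", 0.1),
    ]
    print(build_sft_example("example prompt", cands))
    print(build_preference_pair("example prompt", cands))
```

Other selection rules (for instance, restricting both responses in a pair to the same source model) fit the same interface; only the body of `build_preference_pair` would change.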

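Key innovation: FuseChat-3.0 combines heterogeneous model fusion with preference optimization. Fusing large models that differ in architecture and training data yields broader knowledge and capabilities. DPO optimizes the model directly on preferences, avoiding the complexity and instability of traditional reinforcement learning methods.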

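Key design: The data construction protocol is customized per task and domain to ensure the quality and diversity of the training data. The SFT stage uses a cross-entropy loss, and the DPO stage uses the standard DPO loss. The target models are widely used small variants (Llama-3.1-8B-Instruct, Gemma-2-9B-it, and Qwen-2.5-7B-Instruct) plus the ultra-compact options Llama-3.2-3B-Instruct and Llama-3.2-1B-Instruct.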
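The two losses named above can be written down compactly. The sketch below is a framework-free illustration on scalar log-probabilities, assuming the sequence log-probabilities under the policy and a frozen reference model have already been computed. The function names and the value `beta=0.1` are illustrative choices, not settings taken from the report.

```python
import math
from typing import Sequence


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def sft_cross_entropy(target_token_logps: Sequence[float]) -> float:
    """SFT loss: mean negative log-likelihood of the target tokens."""
    return -sum(target_token_logps) / len(target_token_logps)


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value, for illustration only
) -> float:
    """Standard DPO loss for one preference pair.

    loss = -log sigmoid(beta * [(log pi(y_w|x) - log pi_ref(y_w|x))
                                - (log pi(y_l|x) - log pi_ref(y_l|x))])
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    return -log_sigmoid(beta * (chosen_logratio - rejected_logratio))


if __name__ == "__main__":
    # When the policy equals the reference, the margin is zero and the loss is log(2).
    print(dpo_loss(-5.0, -7.0, -5.0, -7.0))
    # Raising the chosen response's log-probability relative to the reference lowers the loss.
    print(dpo_loss(-4.0, -7.0, -5.0, -7.0))
```

In practice both losses would be computed over batches of token-level log-probabilities in a deep learning framework; the scalar form here only makes the structure of each objective explicit.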

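📊 Experimental Highlights

FuseChat-3.0 shows significant gains on multiple benchmarks. With Llama-3.1-8B-Instruct as the target model, the fusion approach improves the average score by 6.8 points across 14 benchmarks. On the instruction-following benchmarks AlpacaEval-2 and Arena-Hard, it gains 37.1 and 30.1 points, respectively. These results indicate that FuseChat-3.0 effectively fuses the strengths of heterogeneous models and improves small-model performance.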

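🎯 Application Scenarios

FuseChat-3.0 could be used to build more efficient and capable dialogue systems, intelligent assistants, and educational tools. Transferring knowledge from large models into small ones makes it possible to deploy high-performing language models in resource-constrained environments such as mobile devices and embedded systems. The work also contributes to research on model compression and knowledge transfer.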

📄 Abstract (Original)

We introduce FuseChat-3.0, a suite of large language models (LLMs) developed by integrating the strengths of heterogeneous source LLMs into more compact target LLMs. Our source models include the powerful Gemma-2-27B-it, Mistral-Large-Instruct-2407, Qwen-2.5-72B-Instruct, and Llama-3.1-70B-Instruct. For target models, we focus on three widely-used smaller variants (Llama-3.1-8B-Instruct, Gemma-2-9B-it, and Qwen-2.5-7B-Instruct) along with two ultra-compact options, Llama-3.2-3B-Instruct and Llama-3.2-1B-Instruct. To leverage the diverse capabilities of these source models, we develop a specialized data construction protocol tailored to various tasks and domains. The FuseChat-3.0 training pipeline consists of two key stages: (1) supervised fine-tuning (SFT) to align the target and source model distributions, and (2) Direct Preference Optimization (DPO) to apply preferences from multiple source LLMs to fine-tune the target model. The resulting FuseChat-3.0 models exhibit significant performance gains across tasks such as instruction following, general knowledge, mathematics, and coding. As illustrated in Figure 1, using Llama-3.1-8B-Instruct as the target model, our fusion approach achieves an average improvement of 6.8 points across 14 benchmarks. Moreover, it demonstrates remarkable gains of 37.1 points and 30.1 points on the instruction-following benchmarks AlpacaEval-2 and Arena-Hard, respectively. Our code, models, and datasets are available at https://github.com/SLIT-AI/FuseChat-3.0.
