Kimi K2.5: Visual Agentic Intelligence

📄 arXiv: 2602.02276v1

Authors: Kimi Team, Tongtong Bai, Yifan Bai, Yiping Bao, S. H. Cai, Yuan Cao, Y. Charles, H. S. Che, Cheng Chen, Guanduo Chen, Huarong Chen, Jia Chen, Jiahao Chen, Jianlong Chen, Jun Chen, Kefan Chen, Liang Chen, Ruijue Chen, Xinhao Chen, Yanru Chen, Yanxu Chen, Yicun Chen, Yimin Chen, Yingjiang Chen, Yuankun Chen, Yujie Chen, Yutian Chen, Zhirong Chen, Ziwei Chen, Dazhi Cheng, Minghan Chu, Jialei Cui, Jiaqi Deng, Muxi Diao, Hao Ding, Mengfan Dong, Mengnan Dong, Yuxin Dong, Yuhao Dong, Angang Du, Chenzhuang Du, Dikang Du, Lingxiao Du, Yulun Du, Yu Fan, Shengjun Fang, Qiulin Feng, Yichen Feng, Garimugai Fu, Kelin Fu, Hongcheng Gao, Tong Gao, Yuyao Ge, Shangyi Geng, Chengyang Gong, Xiaochen Gong, Zhuoma Gongque, Qizheng Gu, Xinran Gu, Yicheng Gu, Longyu Guan, Yuanying Guo, Xiaoru Hao, Weiran He, Wenyang He, Yunjia He, Chao Hong, Hao Hu, Jiaxi Hu, Yangyang Hu, Zhenxing Hu, Ke Huang, Ruiyuan Huang, Weixiao Huang, Zhiqi Huang, Tao Jiang, Zhejun Jiang, Xinyi Jin, Yu Jing, Guokun Lai, Aidi Li, C. Li, Cheng Li, Fang Li, Guanghe Li, Guanyu Li, Haitao Li, Haoyang Li, Jia Li, Jingwei Li, Junxiong Li, Lincan Li, Mo Li, Weihong Li, Wentao Li, Xinhang Li, Xinhao Li, Yang Li, Yanhao Li, Yiwei Li, Yuxiao Li, Zhaowei Li, Zheming Li, Weilong Liao, Jiawei Lin, Xiaohan Lin, Zhishan Lin, Zichao Lin, Cheng Liu, Chenyu Liu, Hongzhang Liu, Liang Liu, Shaowei Liu, Shudong Liu, Shuran Liu, Tianwei Liu, Tianyu Liu, Weizhou Liu, Xiangyan Liu, Yangyang Liu, Yanming Liu, Yibo Liu, Yuanxin Liu, Yue Liu, Zhengying Liu, Zhongnuo Liu, Enzhe Lu, Haoyu Lu, Zhiyuan Lu, Junyu Luo, Tongxu Luo, Yashuo Luo, Long Ma, Yingwei Ma, Shaoguang Mao, Yuan Mei, Xin Men, Fanqing Meng, Zhiyong Meng, Yibo Miao, Minqing Ni, Kun Ouyang, Siyuan Pan, Bo Pang, Yuchao Qian, Ruoyu Qin, Zeyu Qin, Jiezhong Qiu, Bowen Qu, Zeyu Shang, Youbo Shao, Tianxiao Shen, Zhennan Shen, Juanfeng Shi, Lidong Shi, Shengyuan Shi, Feifan Song, Pengwei Song, Tianhui Song, Xiaoxi Song, Hongjin Su, Jianlin Su, Zhaochen Su, Lin Sui, Jinsong Sun, Junyao Sun, Tongyu Sun, Flood Sung, Yunpeng Tai, Chuning Tang, Heyi Tang, Xiaojuan Tang, Zhengyang Tang, Jiawen Tao, Shiyuan Teng, Chaoran Tian, Pengfei Tian, Ao Wang, Bowen Wang, Chensi Wang, Chuang Wang, Congcong Wang, Dingkun Wang, Dinglu Wang, Dongliang Wang, Feng Wang, Hailong Wang, Haiming Wang, Hengzhi Wang, Huaqing Wang, Hui Wang, Jiahao Wang, Jinhong Wang, Jiuzheng Wang, Kaixin Wang, Linian Wang, Qibin Wang, Shengjie Wang, Shuyi Wang, Si Wang, Wei Wang, Xiaochen Wang, Xinyuan Wang, Yao Wang, Yejie Wang, Yipu Wang, Yiqin Wang, Yucheng Wang, Yuzhi Wang, Zhaoji Wang, Zhaowei Wang, Zhengtao Wang, Zhexu Wang, Zihan Wang, Zizhe Wang, Chu Wei, Ming Wei, Chuan Wen, Zichen Wen, Chengjie Wu, Haoning Wu, Junyan Wu, Rucong Wu, Wenhao Wu, Yuefeng Wu, Yuhao Wu, Yuxin Wu, Zijian Wu, Chenjun Xiao, Jin Xie, Xiaotong Xie, Yuchong Xie, Yifei Xin, Bowei Xing, Boyu Xu, Jianfan Xu, Jing Xu, Jinjing Xu, L. H. Xu, Lin Xu, Suting Xu, Weixin Xu, Xinbo Xu, Xinran Xu, Yangchuan Xu, Yichang Xu, Yuemeng Xu, Zelai Xu, Ziyao Xu, Junjie Yan, Yuzi Yan, Guangyao Yang, Hao Yang, Junwei Yang, Kai Yang, Ningyuan Yang, Ruihan Yang, Xiaofei Yang, Xinlong Yang, Ying Yang, Yi Yang, Yi Yang, Zhen Yang, Zhilin Yang, Zonghan Yang, Haotian Yao, Dan Ye, Wenjie Ye, Zhuorui Ye, Bohong Yin, Chengzhen Yu, Longhui Yu, Tao Yu, Tianxiang Yu, Enming Yuan, Mengjie Yuan, Xiaokun Yuan, Yang Yue, Weihao Zeng, Dunyuan Zha, Haobing Zhan, Dehao Zhang, Hao Zhang, Jin Zhang, Puqi Zhang, Qiao Zhang, Rui Zhang, Xiaobin Zhang, Y. Zhang, Yadong Zhang, Yangkun Zhang, Yichi Zhang, Yizhi Zhang, Yongting Zhang, Yu Zhang, Yushun Zhang, Yutao Zhang, Yutong Zhang, Zheng Zhang, Chenguang Zhao, Feifan Zhao, Jinxiang Zhao, Shuai Zhao, Xiangyu Zhao, Yikai Zhao, Zijia Zhao, Huabin Zheng, Ruihan Zheng, Shaojie Zheng, Tengyang Zheng, Junfeng Zhong, Longguang Zhong, Weiming Zhong, M. Zhou, Runjie Zhou, Xinyu Zhou, Zaida Zhou, Jinguo Zhu, Liya Zhu, Xinhao Zhu, Yuxuan Zhu, Zhen Zhu, Jingze Zhuang, Weiyu Zhuang, Ying Zou, Xinxing Zu

Categories: cs.CL

Published: 2026-02-02

Comments: Kimi K2.5 tech report


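💡 One-sentence takeaway

Kimi K2.5: a visual agentic model for general-purpose agents, paired with the Agent Swarm framework

🎯 Matched areas: Pillar 2: RL Algorithms & Architecture; Pillar 9: Embodied Foundation Models

Keywords: multimodal learning, agents, visual agents, Agent Swarm, parallel computing, joint text-vision optimization, reinforcement learning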

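📋 Key points

  1. Existing agent models fuse and exploit multimodal information poorly on complex tasks, which limits their generality and performance.
  2. Kimi K2.5 jointly optimizes the text and vision modalities and introduces the Agent Swarm framework, which dynamically decomposes complex tasks and executes the parts in parallel.
  3. Experiments show that Kimi K2.5 achieves state-of-the-art results across multiple domains, and that Agent Swarm reduces latency by up to 4.5× over single-agent baselines.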

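📝 Abstract (translated from Chinese)

This paper introduces Kimi K2.5, an open-source multimodal agentic model that aims to advance general agentic intelligence. K2.5 emphasizes joint optimization of text and vision so that the two modalities reinforce each other. This involves a series of techniques, including joint text-vision pre-training, zero-vision SFT, and joint text-vision reinforcement learning. On top of this multimodal foundation, K2.5 introduces Agent Swarm, a self-directed parallel agent orchestration framework that dynamically decomposes complex tasks into heterogeneous sub-problems and executes them concurrently. Extensive evaluations show that Kimi K2.5 achieves state-of-the-art results across domains including coding, vision, reasoning, and agentic tasks. Compared with single-agent baselines, Agent Swarm also reduces latency by up to 4.5×. The post-trained Kimi K2.5 model checkpoint is released to support future research and real-world applications of agentic intelligence.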

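🔬 Method in detail

Problem definition: Existing agent models often struggle to fuse text and visual information effectively, so they perform poorly on tasks that require multimodal understanding and reasoning. In addition, a single agent working through a complex task step by step is slow, which makes it hard to meet the demands of practical applications.

Core idea: Kimi K2.5 jointly optimizes the text and vision modalities so that each strengthens the other, improving the model's multimodal understanding. It also introduces the Agent Swarm framework, which breaks a complex task into multiple sub-tasks and assigns them to different agents that run in parallel, improving task-processing efficiency.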

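Technical framework: The training recipe of Kimi K2.5 has three main parts: joint text-vision pre-training, zero-vision SFT, and joint text-vision reinforcement learning. First, joint text-vision pre-training gives the model its initial multimodal understanding. Next, zero-vision SFT is applied as the supervised fine-tuning stage, which this summary credits with further improving performance on visual tasks. Finally, joint text-vision reinforcement learning optimizes the model's decision-making. The Agent Swarm framework, in turn, decomposes a complex task into multiple sub-tasks and selects a suitable agent for each one based on the sub-task's characteristics.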
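The decompose-then-run-concurrently pattern described for Agent Swarm can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Kimi K2.5 implementation: the names `SubTask`, `decompose`, `run_agent`, `run_swarm`, and `run_single`, the agent labels, and the simulated durations are all hypothetical, and the decomposition is hard-coded rather than produced by a model.

```python
import asyncio
import time
from dataclasses import dataclass


@dataclass
class SubTask:
    name: str      # hypothetical sub-problem label
    agent: str     # hypothetical agent type chosen for this sub-problem
    cost_s: float  # simulated work time in seconds


def decompose(task: str) -> list[SubTask]:
    """Stand-in for dynamic decomposition (hard-coded here, an assumption)."""
    return [
        SubTask(f"{task}: search", "browser-agent", 0.03),
        SubTask(f"{task}: read chart", "vision-agent", 0.02),
        SubTask(f"{task}: write code", "coding-agent", 0.04),
    ]


async def run_agent(sub: SubTask) -> str:
    """Simulate one sub-agent; a real one would call a model and tools."""
    await asyncio.sleep(sub.cost_s)
    return f"[{sub.agent}] done: {sub.name}"


async def run_swarm(task: str) -> list[str]:
    """Run all sub-tasks concurrently; results come back in input order."""
    subs = decompose(task)
    return await asyncio.gather(*(run_agent(s) for s in subs))


async def run_single(task: str) -> list[str]:
    """Baseline: one agent handles the sub-tasks one after another."""
    return [await run_agent(s) for s in decompose(task)]


def timed(coro_fn, task: str) -> tuple[list[str], float]:
    """Run a coroutine function and return (results, wall-clock seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn(task))
    return results, time.perf_counter() - start
```

Calling `timed(run_swarm, "report")` and `timed(run_single, "report")` returns the same results, but the concurrent run takes roughly as long as the slowest sub-task, while the sequential run takes roughly the sum of all of them. Real sub-tasks are rarely fully independent, so the gain in practice depends on how cleanly the task decomposes.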

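Key innovations: Kimi K2.5 contributes two main ideas: 1) joint optimization of the text and vision modalities so that they reinforce each other; 2) the Agent Swarm framework, which dynamically decomposes complex tasks and executes the parts in parallel. Compared with existing approaches, Kimi K2.5 makes more effective use of multimodal information and markedly improves task-processing efficiency.

Key design: The abstract summarized here does not detail key parameter settings, loss functions, network architecture, or similar technical specifics, so that information is not covered in this summary.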

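📊 Experimental highlights

Kimi K2.5 achieves state-of-the-art results across coding, vision, reasoning, and agentic tasks. Compared with single-agent baselines, Agent Swarm reduces latency by up to 4.5×. Together, these results indicate strengths in both multimodal understanding and task-processing efficiency.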
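To make the 4.5× figure concrete, here is an idealized latency model. It is an illustration, not a model taken from the paper: it assumes $n$ fully independent sub-tasks with durations $t_i$ and a hypothetical orchestration overhead $c \ge 0$.

$$
T_{\text{single}} = \sum_{i=1}^{n} t_i, \qquad
T_{\text{swarm}} = \max_{1 \le i \le n} t_i + c, \qquad
\text{speedup} = \frac{T_{\text{single}}}{T_{\text{swarm}}}
$$

With made-up durations $t = (10, 8, 9, 10, 8)$ seconds and $c = 0$, the sequential time is $45$ s and the parallel time is $10$ s, giving a speedup of $45 / 10 = 4.5$. Because $\sum_i t_i \le n \cdot \max_i t_i$, the speedup under this model can never exceed $n$, so a 4.5× reduction would require at least five sub-tasks running concurrently.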

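🎯 Application scenarios

Kimi K2.5 has potential applications in areas such as intelligent customer service, autonomous driving, smart homes, and robotics. It could be used for complex tasks that require multimodal understanding and reasoning, such as visual question answering, image captioning, and object detection. By running sub-tasks in parallel, the Agent Swarm framework can reduce latency, which in turn improves the user experience.

📄 Abstract (original)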

We introduce Kimi K2.5, an open-source multimodal agentic model designed to advance general agentic intelligence. K2.5 emphasizes the joint optimization of text and vision so that two modalities enhance each other. This includes a series of techniques such as joint text-vision pre-training, zero-vision SFT, and joint text-vision reinforcement learning. Building on this multimodal foundation, K2.5 introduces Agent Swarm, a self-directed parallel agent orchestration framework that dynamically decomposes complex tasks into heterogeneous sub-problems and executes them concurrently. Extensive evaluations show that Kimi K2.5 achieves state-of-the-art results across various domains including coding, vision, reasoning, and agentic tasks. Agent Swarm also reduces latency by up to $4.5\times$ over single-agent baselines. We release the post-trained Kimi K2.5 model checkpoint to facilitate future research and real-world applications of agentic intelligence.