RoboNaldo: Accurate, Stable and Powerful Humanoid Soccer Shooting via Motion-Guided Curriculum Reinforcement Learning

Authors: Yichao Zhong, Yidan Lu, Yuhang Lu, Tianyang Tang, Haoguang Mai, Yixuan Pan, Tianyu Li, Li Chen, Jingbo Wang, Zhongyu Li, Peng Lu, Hongyang Li

Categories: cs.RO, cs.AI

Published: 2026-06-09

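💡 One-Sentence Summary

Proposes RoboNaldo, a motion-guided curriculum RL framework that targets stable and accurate humanoid soccer shooting.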

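🎯 Matched areas: Pillar 1: Robot Control; Pillar 2: RL & Architecture; Pillar 8: Physics-based Animation

Keywords: humanoid robot, soccer shooting, reinforcement learning, motion guidance, curriculum learning, dynamic adaptation, high-impulse interaction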

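📋 Key Points

Existing methods struggle to adapt to varied ball positions and strike timings in soccer shooting, which limits stability and accuracy.
The paper proposes RoboNaldo, a three-stage motion-guided curriculum reinforcement learning framework that progressively shifts optimization toward shooting performance.
In simulation, RoboNaldo's free-kick shot error is 48.6% lower and its shot velocity is 2.96x that of prior baselines; the policy is also deployed on a real Unitree G1.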

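📝 Summary (translated from Chinese)

Humanoid soccer shooting requires whole-body stability, high-impulse whole-body interaction, and accuracy to the target. Motion-tracking-based reinforcement learning (RL) provides stable whole-body coordination, but its fixed reference makes it hard to adapt to varied ball positions and strike timings; task-reward-based RL, in turn, struggles to discover valid kicks from scratch. The paper therefore proposes RoboNaldo, a three-stage motion-guided curriculum RL framework for high-impulse humanoid interaction. The framework uses a single human-kick reference as a scaffold and progressively shifts optimization toward shooting performance. In simulation, RoboNaldo's free-kick shot error is 48.6% lower and its shot velocity is 2.96x that of prior baselines.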

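🔬 Method Details

Problem definition: The paper addresses stability and accuracy in humanoid soccer shooting. Existing methods adapt poorly to varied ball positions and strike timings, which degrades shot quality.

Core idea: RoboNaldo uses a three-stage motion-guided curriculum RL framework. It first learns a stable whole-body kicking prior, then adapts the kick to free-kick settings with a stationary ball, and finally extends it to moving-ball shooting. Across the stages, optimization shifts progressively from following the reference toward shooting performance.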
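The staged shift from reference tracking to shooting performance can be sketched as a per-stage blend of two reward terms. This is a minimal illustration, not the paper's implementation: the `StageConfig` type, the `blended_reward` function, and every weight value below are hypothetical, chosen only to show the tracking weight shrinking while the task weight grows across the three stages named in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageConfig:
    """One curriculum stage. All fields are illustrative assumptions."""
    name: str
    tracking_weight: float  # weight on imitating the human-kick reference
    task_weight: float      # weight on shooting performance (accuracy, ball speed)


# Hypothetical weights: the source states only that optimization shifts
# progressively toward shooting performance, not the actual values.
CURRICULUM = (
    StageConfig("stable_kick_prior", tracking_weight=1.0, task_weight=0.0),
    StageConfig("free_kick_stationary_ball", tracking_weight=0.5, task_weight=0.5),
    StageConfig("moving_ball", tracking_weight=0.25, task_weight=0.75),
)


def blended_reward(stage: StageConfig, tracking_reward: float, task_reward: float) -> float:
    """Weighted sum of the tracking term and the task term for one stage."""
    return stage.tracking_weight * tracking_reward + stage.task_weight * task_reward


if __name__ == "__main__":
    # Same raw reward terms, evaluated under each stage's weighting.
    for stage in CURRICULUM:
        print(stage.name, blended_reward(stage, tracking_reward=0.8, task_reward=0.3))
```

A linear blend is only one way to express the idea; the paper may gate, anneal, or restructure the reward terms differently.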

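Technical framework: RoboNaldo trains in three stages. Stage one learns a stable kicking motion, stage two trains on a stationary ball placed at random positions, and stage three trains moving-ball shooting through a locomotion-command and kick-trigger interface. A high-level heuristic planner controls this interface during training, while alternative high-level controllers can drive the same low-level policy at inference.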
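To make the locomotion-command and kick-trigger interface concrete, here is a hedged sketch of what a heuristic high-level planner could look like. The source does not describe the planner's rules, so everything here is an assumption: the `Command` fields, the `heuristic_planner` function, the robot-frame convention (x forward, y left), and the `kick_range`, `lookahead`, `max_speed`, and `yaw_gain` parameters are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Command:
    """Hypothetical interface message sent to the low-level policy."""
    vx: float        # forward velocity command (m/s)
    vy: float        # lateral velocity command (m/s)
    yaw_rate: float  # turning rate command (rad/s)
    kick: bool       # kick trigger


def heuristic_planner(ball_xy, ball_vel_xy, kick_range=0.6, lookahead=0.3,
                      max_speed=1.0, yaw_gain=1.5):
    """Walk toward the predicted ball position; trigger a kick when close.

    Inputs are in the robot frame. All thresholds are illustrative guesses.
    """
    # Predict where the ball will be after a short lookahead time.
    px = ball_xy[0] + ball_vel_xy[0] * lookahead
    py = ball_xy[1] + ball_vel_xy[1] * lookahead
    dist = math.hypot(px, py)

    # Close enough: stop walking and raise the kick trigger.
    if dist <= kick_range:
        return Command(0.0, 0.0, 0.0, True)

    # Otherwise head toward the predicted position at a capped speed.
    scale = min(max_speed, dist) / dist
    yaw_rate = yaw_gain * math.atan2(py, px)
    return Command(px * scale, py * scale, yaw_rate, False)


if __name__ == "__main__":
    print(heuristic_planner((3.0, 0.5), (-1.0, 0.0)))  # far ball: walk, no kick
    print(heuristic_planner((0.4, 0.0), (0.0, 0.0)))   # near ball: kick
```

Because the low-level policy only sees the command and trigger, a different high-level controller could replace this function at inference, which matches the decoupling the abstract describes.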

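Key innovation: RoboNaldo introduces a motion-guided curriculum strategy that lets the robot adapt step by step to increasingly demanding shooting settings. This differs fundamentally from RL methods that track a fixed reference throughout training.

Key design: During training, a high-level planner issues the locomotion commands, and the loss functions and network structure are tailored to the different shooting scenarios so that the robot can learn and execute complex shooting motions.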

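📊 Experimental Highlights

In simulation, RoboNaldo's free-kick shot error is 48.6% lower and its shot velocity is 2.96x that of prior baselines. On a real Unitree G1, it achieves average target shooting errors of 0.73 m in the free-kick case and 0.86 m in the moving-ball case from 3 m away, and the post-contact ball velocity reaches 13.10 m/s.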

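🎯 Application Scenarios

RoboNaldo's results have potential applications in robot soccer, sports training, and humanoid robot interaction. Its shooting capability could support more capable robot athletes, improve human-robot collaboration, and advance robot adaptation in dynamic environments.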

📄 Abstract (Original)

Elite humanoid soccer shooting requires whole-body stability, high-impulse whole-body interactions, and accuracy to targets. Motion tracking-driven reinforcement learning (RL) provides stability in whole-body movement coordination, but a fixed reference makes it hard to adapt to varied ball positions and strike timings; in contrast, task reward-driven RL struggles to explore and discover valid kicks from scratch. We therefore introduce RoboNaldo, a three-stage motion-guided curriculum RL framework for high-impulse humanoid interaction. A single human-kick reference is used as a scaffold and progressively shifts optimization towards shooting performance. The curriculum first learns a stable whole-body kicking prior, then adapts the kick to free-kick settings where the ball is stationary at random positions, and finally extends it to moving-ball shooting through a locomotion-command and kick-trigger interface. A high-level heuristic planner controls this interface during training, while alternative high-level controllers can drive the same low-level policy at inference. In simulation, RoboNaldo demonstrates free-kick shot error 48.6% lower and shoot velocity 2.96x than prior work baselines. In real world on a Unitree G1 with onboard perception, RoboNaldo attains 0.73 m and 0.86 m average target shooting error from 3 m away in free-kick and moving-ball cases, accordingly. And the post-contact ball velocity reaches 13.10 m/s, which is 59-71% of reported professional open-play shot speed. Project page: https://opendrivelab.com/RoboNaldo.
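The headline numbers in the abstract can be unpacked with a little arithmetic. The sketch below uses only figures quoted above; the function and variable names are illustrative. It derives the professional shot-speed range implied by "13.10 m/s is 59-71%", and restates "48.6% lower" as the fraction of baseline error that remains.

```python
# Figures quoted in the abstract.
ROBOT_BALL_SPEED_MPS = 13.10        # post-contact ball velocity on the real robot
PRO_FRACTION_RANGE = (0.59, 0.71)   # share of professional open-play shot speed
ERROR_REDUCTION = 0.486             # free-kick shot error reduction in simulation


def implied_professional_range(robot_speed, fractions):
    """Return the (low, high) professional speed implied by the quoted share."""
    low_fraction, high_fraction = fractions
    # A larger share implies a slower professional reference, and vice versa.
    return robot_speed / high_fraction, robot_speed / low_fraction


pro_low, pro_high = implied_professional_range(ROBOT_BALL_SPEED_MPS, PRO_FRACTION_RANGE)

# "48.6% lower" means the remaining error is this fraction of the baseline's.
remaining_error_fraction = 1.0 - ERROR_REDUCTION

if __name__ == "__main__":
    print(f"implied professional range: {pro_low:.2f} to {pro_high:.2f} m/s")
    print(f"remaining error vs. baseline: {remaining_error_fraction:.3f}")
```

Under these figures, the implied professional reference range is roughly 18.45 to 22.20 m/s, and RoboNaldo's simulated free-kick error is about 0.514 of the baseline's.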
