| 1 |
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments |
Qwen-VLA: unified vision-language-action modeling for general embodied intelligence across tasks, environments, and robot embodiments |
manipulation egocentric vision-language-action |
|
|
| 2 |
BORA: Bridging Offline Reinforcement Learning and Online Residual Adaptation for Real-World Dexterous VLA Models |
BORA: bridges offline reinforcement learning and online residual adaptation for real-world dexterous VLA models |
manipulation dexterous hand dexterous manipulation |
|
|
| 3 |
Gaze2Act: Gaze-Conditioned Vision-Language-Action Policies for Interactive Robot Manipulation |
Gaze2Act: gaze-guided vision-language-action policies for interactive robot manipulation |
humanoid manipulation Unitree |
|
|
| 4 |
VLA-Pro: Cross-Task Procedural Memory Transfer for Vision-Language-Action Models |
VLA-Pro: a cross-task procedural memory transfer framework for vision-language-action models |
manipulation vision-language-action VLA |
|
|
| 5 |
VE2VF: Vision-Enabled to Vision-Free Distillation via Real-world Reinforcement Learning for Robust Contact-Rich Manipulation |
Proposes VE2VF, which achieves robust contact-rich manipulation through distillation with real-world reinforcement learning |
manipulation domain randomization reinforcement learning |
|
|
| 6 |
3DVLA: Enhancing Vision-Language-Action Models via 3D Spatial and Instance Understanding |
Proposes the 3DVLA framework, which enhances vision-language-action models through 3D spatial and instance understanding |
manipulation scene understanding vision-language-action |
|
|
| 7 |
Phase-Conditioned Imitation Learning with Autonomous Failure Recovery for Robust Deformable Object Manipulation |
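Proposes a deformable object manipulation framework based on phase-conditioned imitation learning and autonomous failure recovery |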
manipulation dual-arm teleoperation |
✅ |
|
| 8 |
PhAIL: A Real-Robot VLA Benchmark and Distributional Methodology |
PhAIL: a VLA benchmark and distributional evaluation methodology on a real Franka FR3 robot |
teleoperation vision-language-action VLA |
|
|
| 9 |
VLAConf: Calibrated Task-Success Confidence for Vision-Language-Action Models |
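Proposes VLAConf, which calibrates the task-success confidence of vision-language-action models to improve the reliability of robot manipulation. |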
manipulation vision-language-action VLA |
|
|
| 10 |
ElegantVLA: Learning When to Think for Efficient Vision-Language-Action Models |
ElegantVLA: speeds up vision-language-action models by learning when to think |
manipulation vision-language-action VLA |
|
|
| 11 |
MARS Policy: Multimodality Only When It Matters |
Proposes the MARS policy, which adaptively introduces multimodality in robot manipulation to improve efficiency. |
manipulation imitation learning multimodal |
|
|
| 12 |
Sample-Efficient Diffusion-based Reinforcement Learning with Critic Guidance |
Proposes CGPO, a critic-guided diffusion reinforcement learning method that improves sample efficiency. |
locomotion reinforcement learning diffusion policy |
|
|
| 13 |
MonoDuo: Using One Robot Arm to Learn Bimanual Policies |
MonoDuo: learns bimanual manipulation policies with a single-arm robot, addressing the scarcity of bimanual robot data |
manipulation bi-manual bimanual manipulation |
|
|
| 14 |
Decentralized LLM-Driven Coordination of Acoustic Robots for Contactless Object Manipulation |
Proposes an LLM-based decentralized coordination framework for acoustic robots, enabling contactless object manipulation |
manipulation large language model |
|
|
| 15 |
DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation |
Proposes DynaFLIP to address dynamics understanding in robot perception |
manipulation VLA multimodal |
|
|
| 16 |
LLM-Guided Future Hypotheses for Horizon-Aware Exploration in Multi-Step Robot Manipulation |
Proposes an LLM-guided future-hypothesis method for horizon-aware exploration in multi-step robot manipulation. |
manipulation reinforcement learning |
|
|
| 17 |
RoboWits: Unexpected Challenges for Robotic Creative Problem Solving |
RoboWits: proposes a benchmark for creative problem solving with bimanual robots facing unexpected challenges. |
manipulation bi-manual VLA |
✅ |
|
| 18 |
A Progress-Aware Leader-Follower Midair Docking System for Dual-Drone Aerial Manipulation |
Proposes a progress-aware dual-drone midair docking system for aerial manipulation |
manipulation |
|
|
| 19 |
The Open Motion Planning Library 2.0 |
OMPL 2.0: a hardware-accelerated open-source library for real-time motion planning |
motion planning |
|
|
| 20 |
Extreme dynamic symmetry enables omnidirectional and multifunctional robots |
Proposes the concept of dynamic symmetry and designs Argus, an omnidirectional multifunctional robot, improving agility and robustness. |
locomotion |
|
|
| 21 |
ELAN4D: Embodiment-Centric 4D Supervision for Vision-Language-Action Models via Plug-and-Play Adaptation |
ELAN4D: embodiment-centric 4D supervision for VLA models via plug-and-play adaptation |
manipulation vision-language-action VLA |
|
|
| 22 |
Physics-informed Goal-Conditioned Reinforcement Learning under Hybrid Contact Dynamics |
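Proposes contact-aware hierarchical physics-informed reinforcement learning to solve goal-conditioned reinforcement learning under contact dynamics |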
manipulation reinforcement learning contact-aware |
|
|
| 23 |
Any-ttach: Quick End-effector Swapping Enables Manipulation Dexterity with Simplicity |
Any-ttach: achieves dexterous manipulation through quick end-effector swapping, reducing the complexity of robot manipulation |
manipulation |
✅ |
|
| 24 |
ARISTO Hand: Sensing-Driven Distal Hyperextension for Fine-Grained Manipulation |
ARISTO Hand: fine-grained manipulation through force-sensing-driven distal hyperextension |
manipulation |
|
|
| 25 |
VLM-GLoc: Vision-Language Model Enhanced Monte Carlo Localization for Robust Semantic Global Localization in Cluttered Quasi-Static Environments |
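VLM-GLoc: enhances Monte Carlo localization with vision-language models for robust semantic global localization in cluttered quasi-static environments |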
quadruped open-vocabulary open vocabulary |
|
|