cs.RO (2025-12-17)

📊 18 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 1: Robot Control (6) · Pillar 3: Perception & Semantics (4) · Pillar 9: Embodied Foundation Models (3) · Pillar 2: RL & Architecture (2) · Pillar 7: Motion Retargeting (1) · Pillar 6: Video Extraction (1) · Pillar 8: Physics-based Animation (1 🔗1)

🔬 Pillar 1: Robot Control (6 papers)

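| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | mimic-video: Video-Action Models for Generalizable Robot Control Beyond VLAs | Proposes mimic-video, a video-based action model that improves the generalization and sample efficiency of robot control. | manipulation, flow matching, vision-language-action | |
| 2 | ISS Policy: Scalable Diffusion Policy with Implicit Scene Supervision | Proposes a scalable diffusion policy with implicit scene supervision that improves generalization and training efficiency on robot manipulation tasks. | manipulation, dexterous hand, imitation learning | |
| 3 | GuangMing-Explorer: A Four-Legged Robot Platform for Autonomous Exploration in General Environments | GuangMing-Explorer: a quadruped robot platform for autonomous exploration in general environments. | legged robot | |
| 4 | SORS: A Modular, High-Fidelity Simulator for Soft Robots | SORS: a modular, high-fidelity simulator for soft robots that improves sim-to-real transfer. | sim-to-real | |
| 5 | QuantGraph: A Receding-Horizon Quantum Graph Solver | Proposes QuantGraph, a receding-horizon quantum graph solver that improves graph optimization efficiency. | model predictive control | |
| 6 | Load-Based Variable Transmission Mechanism for Robotic Applications | Proposes a load-based variable transmission mechanism that dynamically adjusts robot joint torque without additional actuators. | legged robot | |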

🔬 Pillar 3: Perception & Semantics (4 papers)

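| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 7 | VLA-AN: An Efficient and Onboard Vision-Language-Action Framework for Aerial Navigation in Complex Environments | VLA-AN: an efficient, onboard vision-language-action framework for autonomous drone navigation in complex environments. | 3D gaussian splatting, gaussian splatting, splatting | |
| 8 | OMCL: Open-vocabulary Monte Carlo Localization | Proposes an open-vocabulary Monte Carlo localization method based on vision-language features, improving the robustness of robot localization across cross-modal maps. | open-vocabulary, open vocabulary | |
| 9 | NAP3D: NeRF Assisted 3D-3D Pose Alignment for Autonomous Vehicles | NAP3D: NeRF-assisted 3D-3D pose alignment for improving the localization accuracy of autonomous vehicles. | NeRF, neural radiance field, geometric consistency | |
| 10 | HERO: Hierarchical Traversable 3D Scene Graphs for Embodied Navigation Among Movable Obstacles | HERO: hierarchical traversable 3D scene graphs for embodied navigation in environments with movable obstacles. | traversability | |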

🔬 Pillar 9: Embodied Foundation Models (3 papers)

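| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 11 | Large Video Planner Enables Generalizable Robot Control | Proposes a general robot control framework based on large-scale video pretraining, achieving zero-shot generalization. | vision-language-action, VLA, large language model | |
| 12 | MiVLA: Towards Generalizable Vision-Language-Action Model with Human-Robot Mutual Imitation Pre-training | MiVLA: a generalizable vision-language-action model built through human-robot mutual imitation pre-training. | vision-language-action, VLA | |
| 13 | Trust in LLM-controlled Robotics: a Survey of Security Threats, Defenses and Challenges | Surveys security threats and defenses in LLM-controlled robotics, offering a blueprint for safe and reliable robotic systems. | large language model | |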

🔬 Pillar 2: RL & Architecture (2 papers)

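| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 14 | Modeling the Mental World for Embodied AI: A Comprehensive Review | Building mental models for embodied AI: proposes a complete theoretical framework to promote human-robot collaboration. | world model, embodied AI | |
| 15 | SWIFT-Nav: Stability-Aware Waypoint-Level TD3 with Fuzzy Arbitration for UAV Navigation in Cluttered Environments | Proposes SWIFT-Nav, which combines fuzzy arbitration with TD3 to improve the stability and efficiency of UAV navigation in cluttered environments. | TD3, reward shaping | |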

🔬 Pillar 7: Motion Retargeting (1 paper)

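| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 16 | Few-Shot Inference of Human Perceptions of Robot Performance in Social Navigation Scenarios | Uses large language models with only a few samples to predict human perceptions of robot performance in social navigation. | human motion, large language model | |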

🔬 Pillar 6: Video Extraction (1 paper)

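| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 17 | BEV-Patch-PF: Particle Filtering with BEV-Aerial Feature Matching for Off-Road Geo-Localization | Proposes BEV-Patch-PF, a particle filter with BEV feature matching for GPS-free localization in off-road environments. | feature matching | |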

🔬 Pillar 8: Physics-based Animation (1 paper)

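| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 18 | dLITE: Differentiable Lighting-Informed Trajectory Evaluation for On-Orbit Inspection | Proposes dLITE to address image quality optimization in on-orbit inspection. | differentiable simulation | |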
