cs.CV (2025-07-23)

📊 7 papers in total

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (4) · Pillar 9: Embodied Foundation Models (2) · Pillar 3: Spatial Perception & Semantics (1)

🔬 Pillar 2: RL Algorithms & Architecture (4 papers)

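| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 1 | URPO: A Unified Reward & Policy Optimization Framework for Large Language Models | URPO: a unified reward and policy optimization framework that improves large language model alignment | reinforcement learning, large language model, instruction following |
| 2 | From Scan to Action: Leveraging Realistic Scans for Embodied Scene Understanding | Proposes a scene understanding method based on realistic scans that improves LLM scene editing and robot policy learning | policy learning, scene understanding |
| 3 | Eyes Will Shut: A Vision-Based Next GPS Location Prediction Model by Reinforcement Learning from Visual Map Feed Back | Proposes VLMLocPredictor, a next GPS location prediction model trained with reinforcement learning from visual map feedback | reinforcement learning |
| 4 | PIG-Nav: Key Insights for Pretrained Image Goal Navigation Models | PIG-Nav: key technical insights for pretrained image-goal navigation models | representation learning, foundation model |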

🔬 Pillar 9: Embodied Foundation Models (2 papers)

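| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 5 | Dual-branch Prompting for Multimodal Machine Translation | Proposes D2P-MMT, which uses dual-branch prompting and a diffusion model to improve the robustness of multimodal machine translation | multimodal |
| 6 | Talk2Event: Grounded Understanding of Dynamic Scenes from Event Cameras | Proposes the Talk2Event benchmark and the EventRefer framework for language understanding of dynamic scenes captured by event cameras | multimodal |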

🔬 Pillar 3: Spatial Perception & Semantics (1 paper)

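| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 7 | Monocular Semantic Scene Completion via Masked Recurrent Networks | Proposes a monocular semantic scene completion method based on masked recurrent networks that improves completion of complex scenes | depth estimation |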
