cs.CV (2025-03-11)

📊 8 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (3 🔗1) · Pillar 3: Perception & Semantics (2) · Pillar 2: RL & Architecture (1) · Pillar 6: Video Extraction (1) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (3 papers)

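| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | SpurLens: Automatic Detection of Spurious Cues in Multimodal LLMs | SpurLens automatically detects spurious cues in multimodal LLMs to improve model reliability. | large language model, multimodal | |
| 2 | Enhancing Sentiment Analysis through Multimodal Fusion: A BERT-DINOv2 Approach | Proposes a multimodal sentiment analysis framework based on BERT and DINOv2 that fuses text and image information to improve sentiment understanding. | multimodal | |
| 3 | Open-World Skill Discovery from Unsegmented Demonstrations | Proposes a self-supervised skill-boundary detection method that discovers skills from unsegmented demonstration videos. | instruction following | |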

🔬 Pillar 3: Perception & Semantics (2 papers)

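| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 4 | NeRF-VIO: Map-Based Visual-Inertial Odometry with Initialization Leveraging Neural Radiance Fields | Proposes NeRF-VIO to address map-based visual-inertial localization. | VIO, NeRF, neural radiance field | |
| 5 | Acoustic Neural 3D Reconstruction Under Pose Drift | Proposes an acoustic neural 3D reconstruction algorithm that jointly optimizes the scene representation and the sensor poses to address pose drift. | implicit representation | |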

🔬 Pillar 2: RL & Architecture (1 paper)

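| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 6 | GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training | Proposes the GTR framework to address the thought collapse that occurs when training VLM agents with RL. | reinforcement learning, large language model, chain-of-thought | |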

🔬 Pillar 6: Video Extraction (1 paper)

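| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 7 | Keypoint Semantic Integration for Improved Feature Matching in Outdoor Agricultural Environments | Proposes a keypoint semantic integration method that improves the robustness of feature matching in outdoor agricultural environments. | feature matching | |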

🔬 Pillar 1: Robot Control (1 paper)

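| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 8 | DexGrasp Anything: Towards Universal Robotic Dexterous Grasping with Physics Awareness | DexGrasp Anything proposes a physics-constraint-aware diffusion model for universal dexterous grasping. | dexterous hand | |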
