cs.CV (2025-03-11)

📊 8 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (3 🔗1) · Pillar 3: Perception & Semantics (2) · Pillar 2: RL & Architecture (1) · Pillar 6: Video Extraction (1) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (3 papers)

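#	Title	One-sentence summary	Tags	🔗	⭐
1	SpurLens: Automatic Detection of Spurious Cues in Multimodal LLMs	SpurLens: automatically detects spurious cues in multimodal LLMs to improve model reliability	large language model multimodal
2	Enhancing Sentiment Analysis through Multimodal Fusion: A BERT-DINOv2 Approach	Proposes a multimodal sentiment analysis framework based on BERT and DINOv2 that fuses text and image information to improve sentiment understanding	multimodal
3	Open-World Skill Discovery from Unsegmented Demonstrations	Proposes a self-supervised skill boundary detection method that discovers skills from unsegmented demonstration videos	instruction following	✅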

🔬 Pillar 3: Perception & Semantics (2 papers)

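#	Title	One-sentence summary	Tags	🔗	⭐
4	NeRF-VIO: Map-Based Visual-Inertial Odometry with Initialization Leveraging Neural Radiance Fields	Proposes NeRF-VIO to address map-based visual-inertial localization	VIO NeRF neural radiance field
5	Acoustic Neural 3D Reconstruction Under Pose Drift	Proposes an acoustic neural 3D reconstruction algorithm that jointly optimizes the scene representation and sensor poses to address pose drift	implicit representation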

🔬 Pillar 2: RL & Architecture (1 paper)

#	Title	One-sentence summary	Tags	🔗	⭐
6	GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training	Proposes the GTR framework to address thought collapse when training VLM agents with RL	reinforcement learning large language model chain-of-thought

🔬 Pillar 6: Video Extraction (1 paper)

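#	Title	One-sentence summary	Tags	🔗	⭐
7	Keypoint Semantic Integration for Improved Feature Matching in Outdoor Agricultural Environments	Proposes a keypoint semantic integration method that improves the robustness of feature matching in outdoor agricultural environments	feature matching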

🔬 Pillar 1: Robot Control (1 paper)

#	Title	One-sentence summary	Tags	🔗	⭐
8	DexGrasp Anything: Towards Universal Robotic Dexterous Grasping with Physics Awareness	DexGrasp Anything: proposes a physics-constraint-aware diffusion model for universal dexterous grasping	dexterous hand
