cs.CV (2025-12-28)

📊 24 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 3: Spatial Perception & Semantics (11 🔗1) · Pillar 9: Embodied Foundation Models (9 🔗3) · Pillar 2: RL Algorithms & Architecture (2) · Pillar 1: Robot Control (1) · Pillar 4: Generative Motion (1)

🔬 Pillar 3: Spatial Perception & Semantics (11 papers)

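| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Next Best View Selections for Semantic and Dynamic 3D Gaussian Splatting | Proposes a Fisher-information-based active learning algorithm for view selection in semantic and dynamic 3D Gaussian splatting. | 3D gaussian splatting, gaussian splatting, splatting | |
| 2 | RGS-SLAM: Robust Gaussian Splatting SLAM with One-Shot Dense Initialization | RGS-SLAM: robust SLAM based on Gaussian splatting with one-shot dense initialization. | gaussian splatting, splatting | |
| 3 | Evaluating the Performance of Open-Vocabulary Object Detection in Low-quality Image | Evaluates the performance of open-vocabulary object detection models on low-quality images. | open-vocabulary, open vocabulary | |
| 4 | With Great Context Comes Great Prediction Power: Classifying Objects via Geo-Semantic Scene Graphs | Proposes a context-aware object classification framework based on geo-semantic scene graphs that significantly improves recognition accuracy. | metric depth, spatial relationship, large language model | |
| 5 | ColaVLA: Leveraging Cognitive Latent Reasoning for Hierarchical Parallel Trajectory Planning in Autonomous Driving | ColaVLA: uses cognitive latent reasoning for hierarchical parallel trajectory planning in autonomous driving. | scene understanding, vision-language-action, multimodal | |
| 6 | Medical Scene Reconstruction and Segmentation based on 3D Gaussian Representation | Proposes a medical scene reconstruction and segmentation method based on a 3D Gaussian representation, addressing structural discontinuity under sparse slices. | scene reconstruction, multimodal | |
| 7 | Depth Anything in $360^\circ$: Towards Scale Invariance in the Wild | Proposes DA360 to address scale invariance in 360-degree panoramic depth estimation. | depth estimation, Depth Anything | |
| 8 | EgoReAct: Egocentric Video-Driven 3D Human Reaction Generation | EgoReAct: proposes an egocentric-video-driven framework for 3D human reaction generation, addressing spatial alignment and causality. | metric depth, egocentric | |
| 9 | Split4D: Decomposed 4D Scene Reconstruction Without Video Segmentation | Proposes Split4D, which achieves decomposed 4D scene reconstruction without video segmentation. | scene reconstruction | |
| 10 | Hash Grid Feature Pruning | Proposes a hash grid feature pruning method that reduces redundant storage in the implicit neural fields used for Gaussian splatting. | gaussian splatting, splatting | |
| 11 | 3D Scene Change Modeling With Consistent Multi-View Aggregation | Proposes the SCaR-3D framework, addressing spatial inconsistency and state separation in 3D scene change detection. | 3DGS, scene reconstruction | |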

🔬 Pillar 9: Embodied Foundation Models (9 papers)

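| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 12 | MUSON: A Reasoning-oriented Multimodal Dataset for Socially Compliant Navigation in Urban Environments | MUSON: a reasoning-oriented multimodal dataset for socially compliant navigation in urban environments. | multimodal, chain-of-thought | |
| 13 | OpenGround: Active Cognition-based Reasoning for Open-World 3D Visual Grounding | Proposes OpenGround, which addresses open-world 3D visual grounding through active cognition-based reasoning. | visual grounding | |
| 14 | SwinTF3D: A Lightweight Multimodal Fusion Approach for Text-Guided 3D Medical Image Segmentation | SwinTF3D: a lightweight multimodal fusion approach for text-guided 3D medical image segmentation. | multimodal | |
| 15 | M-ErasureBench: A Comprehensive Multimodal Evaluation Benchmark for Concept Erasure in Diffusion Models | M-ErasureBench: a comprehensive multimodal benchmark for evaluating concept erasure in diffusion models. | multimodal | |
| 16 | TrimTokenator-LC: Towards Adaptive Visual Token Pruning for Large Multimodal Models with Long Contexts | TrimTokenator-LC: proposes an adaptive visual token pruning method for long-context large models. | multimodal | |
| 17 | An Architecture-Led Hybrid Report on Body Language Detection Project | A hybrid report, grounded in architectural analysis, on body language detection using vision-language models. | multimodal, instruction following | |
| 18 | JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation | JavisGPT: a unified multimodal large language model for audio-video comprehension and generation. | large language model, multimodal | |
| 19 | VPTracker: Global Vision-Language Tracking via Visual Prompt and MLLM | VPTracker: global vision-language tracking via visual prompts and an MLLM. | large language model, multimodal | |
| 20 | Plug In, Grade Right: Psychology-Inspired AGIQA | Proposes a psychometrics-based AGIQA model that addresses semantic drift and improves the accuracy of image quality assessment. | multimodal | |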

🔬 Pillar 2: RL Algorithms & Architecture (2 papers)

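| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 21 | Improved cystic hygroma detection from prenatal imaging using ultrasound-specific self-supervised representation learning | Uses ultrasound-specific self-supervised learning to improve detection of cystic hygroma in prenatal imaging. | representation learning, MAE, foundation model | |
| 22 | YOLO-IOD: Towards Real Time Incremental Object Detection | Proposes YOLO-IOD, which addresses catastrophic forgetting in incremental object detection under the YOLO framework, enabling real-time incremental learning. | world model, distillation | |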

🔬 Pillar 1: Robot Control (1 paper)

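| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 23 | ByteLoom: Weaving Geometry-Consistent Human-Object Interactions through Progressive Curriculum Learning | ByteLoom: weaves geometry-consistent human-object interaction videos through progressive curriculum learning. | manipulation, imitation learning, curriculum learning | |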

🔬 Pillar 4: Generative Motion (1 paper)

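| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 24 | Guided Path Sampling: Steering Diffusion Models Back on Track with Principled Path Guidance | Proposes Guided Path Sampling (GPS), which constrains the sampling path to address instability in the iterative refinement of diffusion models. | classifier-free guidance | |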
