cs.CV (2026-02-19)

📊 21 papers in total | 🔗 5 with code

🎯 Navigation by interest area

Pillar 3: Perception & Semantics (8 🔗1) · Pillar 9: Embodied Foundation Models (7 🔗2) · Pillar 2: RL & Architecture (3 🔗1) · Pillar 1: Robot Control (1) · Pillar 4: Generative Motion (1 🔗1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 3: Perception & Semantics (8 papers)

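# | Title | One-line summary | Tags | 🔗
1 | 3D Scene Rendering with Multimodal Gaussian Splatting | Proposes a 3D scene rendering method based on multimodal Gaussian splatting that improves reconstruction quality in adverse environments. | 3D gaussian splatting, gaussian splatting, splatting
2 | NRGS-SLAM: Monocular Non-Rigid SLAM for Endoscopy via Deformation-Aware 3D Gaussian Splatting | NRGS-SLAM: monocular non-rigid SLAM for endoscopy based on deformation-aware 3D Gaussian splatting. | 3D gaussian splatting, gaussian splatting, splatting
3 | B$^3$-Seg: Camera-Free, Training-Free 3DGS Segmentation via Analytic EIG and Beta-Bernoulli Bayesian Updates | B$^3$-Seg: camera-free, training-free 3DGS segmentation based on analytic EIG and Beta-Bernoulli Bayesian updates. | 3D gaussian splatting, 3DGS, gaussian splatting
4 | Cholec80-port: A Geometrically Consistent Trocar Port Segmentation Dataset for Robust Surgical Scene Understanding | Proposes the geometrically consistent Cholec80-port dataset to improve the robustness of surgical scene understanding. | visual SLAM, scene understanding, geometric consistency
5 | 4D Monocular Surgical Reconstruction under Arbitrary Camera Motions | Proposes Local-EndoGS for 4D reconstruction of monocular endoscopic surgical scenes under arbitrary camera motion. | monocular depth, stereo depth, 3D gaussian splatting
6 | Neural Implicit Representations for 3D Synthetic Aperture Radar Imaging | Proposes a 3D synthetic aperture radar imaging method based on neural implicit representations that addresses reconstruction artifacts under sparse sampling. | implicit representation
7 | Inferring Height from Earth Embeddings: First insights using Google AlphaEarth | Uses AlphaEarth embeddings with deep learning regression models to map regional surface height accurately. | height map, multimodal
8 | IntRec: Intent-based Retrieval with Contrastive Refinement | Proposes IntRec, an interactive object retrieval framework that improves retrieval accuracy in complex scenes by contrastively refining user intent. | open-vocabulary, open vocabulary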

🔬 Pillar 9: Embodied Foundation Models (7 papers)

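# | Title | One-line summary | Tags | 🔗
9 | EAGLE: Expert-Augmented Attention Guidance for Tuning-Free Industrial Anomaly Detection in Multimodal Large Language Models | EAGLE: expert-augmented attention guidance for tuning-free industrial anomaly detection in multimodal large language models. | large language model, multimodal
10 | EntropyPrune: Matrix Entropy Guided Visual Token Pruning for Multimodal Large Language Models | EntropyPrune: matrix-entropy-guided visual token pruning for multimodal large language models. | large language model, multimodal
11 | When Vision Overrides Language: Evaluating and Mitigating Counterfactual Failures in VLAs | Proposes Counterfactual Action Guidance to improve how well VLAs follow language instructions in robot control. | vision-language-action, VLA
12 | GraphThinker: Reinforcing Video Reasoning with Event Graph Thinking | GraphThinker: strengthens video understanding through event-graph reasoning and reduces hallucinations in video reasoning. | large language model, multimodal, visual grounding
13 | QuPAINT: Physics-Aware Instruction Tuning Approach to Quantum Material Discovery | Proposes QuPAINT, a physics-aware instruction tuning approach for quantum material discovery. | large language model, multimodal
14 | Art2Mus: Artwork-to-Music Generation via Visual Conditioning and Large-Scale Cross-Modal Alignment | Art2Mus: proposes an artwork-to-music generation framework based on visual conditioning and large-scale cross-modal alignment. | multimodal
15 | Selective Training for Large Vision Language Models via Visual Information Gain | Proposes a selective training method based on visual information gain that improves the visual grounding of large vision-language models. | visual grounding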

🔬 Pillar 2: RL & Architecture (3 papers)

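# | Title | One-line summary | Tags | 🔗
16 | BadCLIP++: Stealthy and Persistent Backdoors in Multimodal Contrastive Learning | BadCLIP++: proposes a stealthy and persistent backdoor attack framework for multimodal contrastive learning. | contrastive learning, multimodal
17 | SpectralGCD: Spectral Concept Selection and Cross-modal Representation Learning for Generalized Category Discovery | Proposes SpectralGCD, which uses spectral concept selection and cross-modal representation learning for generalized category discovery. | representation learning, distillation, multimodal
18 | RetouchIQ: MLLM Agents for Instruction-Based Image Retouching with Generalist Reward | RetouchIQ: MLLM agents with a generalist reward for instruction-based image retouching. | reinforcement learning, large language model, multimodal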

🔬 Pillar 1: Robot Control (1 paper)

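# | Title | One-line summary | Tags | 🔗
19 | Leveraging Contrastive Learning for a Similarity-Guided Tampered Document Data Generation Pipeline | Proposes a tampered-document data generation pipeline based on contrastive learning and similarity guidance that improves the performance of tampering detection models. | manipulation, contrastive learning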

🔬 Pillar 4: Generative Motion (1 paper)

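# | Title | One-line summary | Tags | 🔗
20 | PartRAG: Retrieval-Augmented Part-Level 3D Generation and Editing | PartRAG: proposes a retrieval-augmented part-level 3D generation and editing framework that improves generation quality and editing capability. | physically plausible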

🔬 Pillar 8: Physics-based Animation (1 paper)

# | Title | One-line summary | Tags | 🔗
21 | EA-Swin: An Embedding-Agnostic Swin Transformer for AI-Generated Video Detection | Proposes EA-Swin to improve the generalization and accuracy of AI-generated video detection. | spatiotemporal
