cs.CV (2025-04-27)

📊 12 papers in total | 🔗 3 with code

🎯 Navigation by interest area

Pillar 3: Perception & Semantics (4) · Pillar 9: Embodied Foundation Models (3 🔗2) · Pillar 2: RL & Architecture (3) · Pillar 1: Robot Control (1) · Pillar 4: Generative Motion (1 🔗1)

🔬 Pillar 3: Perception & Semantics (4 papers)

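| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | OpenFusion++: An Open-vocabulary Real-time Scene Understanding System | Proposes OpenFusion++, an open-vocabulary real-time scene understanding system that improves the accuracy and responsiveness of 3D perception. | scene understanding, open-vocabulary, open vocabulary | |
| 2 | Rendering Anywhere You See: Renderability Field-guided Gaussian Splatting | Proposes a renderability-field-guided Gaussian splatting method that improves rendering stability in scene view synthesis. | gaussian splatting, splatting | |
| 3 | IM-Portrait: Learning 3D-aware Video Diffusion for Photorealistic Talking Heads from Monocular Videos | Proposes IM-Portrait, a 3D-aware video diffusion method based on monocular videos for generating photorealistic talking-head videos. | NeRF, geometric consistency | |
| 4 | Leveraging Multi-Modal Saliency and Fusion for Gaze Target Detection | Proposes a gaze target detection method that fuses multi-modal saliency with monocular depth information. | depth estimation, monocular depth | |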

🔬 Pillar 9: Embodied Foundation Models (3 papers)

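| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 5 | HoloDx: Knowledge- and Data-Driven Multimodal Diagnosis of Alzheimer's Disease | HoloDx: a multimodal Alzheimer's disease diagnosis framework that combines knowledge and data. | large language model, multimodal | |
| 6 | MERA: Multimodal and Multiscale Self-Explanatory Model with Considerably Reduced Annotation for Lung Nodule Diagnosis | MERA: a multimodal, multiscale, self-explanatory lung nodule diagnosis model with low annotation requirements. | multimodal | |
| 7 | DeepSPG: Exploring Deep Semantic Prior Guidance for Low-light Image Enhancement with Multimodal Learning | Proposes DeepSPG to address missing semantic information in low-light image enhancement. | multimodal | |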

🔬 Pillar 2: RL & Architecture (3 papers)

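| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 8 | DeepInsert: Early Layer Bypass for Efficient and Performant Multimodal Understanding | DeepInsert: improves the efficiency and performance of multimodal understanding through early-layer bypass. | representation learning, multimodal | |
| 9 | CARL: Camera-Agnostic Representation Learning for Spectral Image Analysis | Proposes CARL, a camera-agnostic representation learning approach for spectral images that improves cross-camera generalization. | representation learning, scene understanding, foundation model | |
| 10 | Learning to Drive from a World Model | Proposes a world-model-based end-to-end learning framework for autonomous driving that requires no hand-crafted rules. | world model | |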

🔬 Pillar 1: Robot Control (1 paper)

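| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 11 | CapsFake: A Multimodal Capsule Network for Detecting Instruction-Guided Deepfakes | CapsFake: proposes a multimodal capsule network for detecting instruction-guided deepfake images. | manipulation, multimodal | |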

🔬 Pillar 4: Generative Motion (1 paper)

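| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 12 | Generative AI for Character Animation: A Comprehensive Survey of Techniques, Applications, and Future Directions | Generative AI for character animation: a comprehensive survey of techniques, applications, and future directions. | motion synthesis, character animation | |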
