cs.CV (2024-06-06)

📊 24 papers in total | 🔗 7 with code

🎯 Interest areas

Pillar 3: Perception & Semantics (9 🔗2) · Pillar 9: Embodied Foundation Models (9 🔗2) · Pillar 2: RL & Architecture (3 🔗2) · Pillar 1: Robot Control (2 🔗1) · Pillar 7: Motion Retargeting (1)

🔬 Pillar 3: Perception & Semantics (9 papers)

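| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Improving Gaussian Splatting with Localized Points Management | Proposes a localized point management strategy that improves Gaussian Splatting rendering quality in complex regions | 3D gaussian splatting, 3DGS, gaussian splatting | |
| 2 | Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image | Flash3D: feed-forward, generalizable 3D scene reconstruction from a single image | depth estimation, monocular depth, gaussian splatting | |
| 3 | A Survey on 3D Human Avatar Modeling -- From Reconstruction to Generation | Surveys 3D human avatar modeling from reconstruction to generation, covering recent progress and open challenges | 3D gaussian splatting, gaussian splatting, splatting | |
| 4 | How Far Can We Compress Instant-NGP-Based NeRF? | Proposes CNC, a NeRF compression framework based on context modeling that significantly reduces Instant-NGP model storage | NeRF, neural radiance field, occupancy grid | |
| 5 | Improving Physics-Augmented Continuum Neural Radiance Field-Based Geometry-Agnostic System Identification with Lagrangian Particle Optimization | Proposes a PAC-NeRF variant based on Lagrangian particle optimization that improves identification of geometry and physical properties from few views | NeRF, neural radiance field | |
| 6 | Gear-NeRF: Free-Viewpoint Rendering and Tracking with Motion-aware Spatio-Temporal Sampling | Gear-NeRF: free-viewpoint rendering and tracking with motion-aware spatio-temporal sampling | NeRF, neural radiance field | |
| 7 | DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data | DIRECT-3D: a diffusion-based method for direct text-to-3D generation from massive noisy 3D data | neural radiance field | |
| 8 | Neural Surface Reconstruction from Sparse Views Using Epipolar Geometry | EpiS: uses epipolar geometry for neural surface reconstruction from sparse views, significantly improving reconstruction accuracy | monocular depth | |
| 9 | GeoGen: Geometry-Aware Generative Modeling via Signed Distance Functions | GeoGen: an SDF-based geometry-aware generative model that improves the quality of generated 3D geometry and images | neural radiance field | |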

🔬 Pillar 9: Embodied Foundation Models (9 papers)

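| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 10 | Conv-INR: Convolutional Implicit Neural Representation for Multimodal Visual Signals | Proposes Conv-INR, a convolution-based implicit neural representation that improves the representation of multimodal visual signals | multimodal | |
| 11 | Understanding Information Storage and Transfer in Multi-modal Large Language Models | Proposes an information-tracing method for multimodal large language models, revealing how information is stored and transferred in visual question answering | large language model | |
| 12 | Evaluating Durability: Benchmark Insights into Multimodal Watermarking | Evaluates the robustness of watermarking techniques against multimodal challenges | multimodal | |
| 13 | DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs | DeepStack: a simple and effective method for deeply stacking visual tokens that improves large multimodal model performance | large language model, multimodal | |
| 14 | Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment | Proposes the SDA framework, which improves diffusion-driven test-time adaptation through synthetic-domain alignment | large language model, multimodal | |
| 15 | Jailbreak Vision Language Models via Bi-Modal Adversarial Prompt | Proposes a bi-modal adversarial prompt attack targeting the safety of vision-language models | large language model, chain-of-thought | |
| 16 | OCCAM: Towards Cost-Efficient and Accuracy-Aware Classification Inference | OCCAM: a cost-efficient, accuracy-aware model selection method for classification inference | foundation model | |
| 17 | Nomic Embed Vision: Expanding the Latent Space | Nomic Embed Vision: a high-performance image embedding model that shares a latent space with text | multimodal | |
| 18 | Parameter-Inverted Image Pyramid Networks | Proposes Parameter-Inverted Image Pyramid Networks (PIIP), which reduce the computational cost of image pyramids while maintaining performance | foundation model | |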

🔬 Pillar 2: RL & Architecture (3 papers)

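| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 19 | MambaDepth: Enhancing Long-range Dependency for Self-Supervised Fine-Structured Monocular Depth Estimation | MambaDepth: uses the Mamba architecture to strengthen long-range dependencies in self-supervised monocular depth estimation | Mamba, SSM, state space model | |
| 20 | CDMamba: Incorporating Local Clues into Mamba for Remote Sensing Image Binary Change Detection | CDMamba: a Mamba model that incorporates local information for binary change detection in remote sensing images | Mamba, state space model | |
| 21 | ReDistill: Residual Encoded Distillation for Peak Memory Reduction of CNNs | Proposes Residual Encoded Distillation (ReDistill), which significantly reduces CNN peak memory consumption while maintaining performance | teacher-student distillation | |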

🔬 Pillar 1: Robot Control (2 papers)

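| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 22 | RoboMamba: Efficient Vision-Language-Action Model for Robotic Reasoning and Manipulation | RoboMamba: an efficient vision-language-action model for robotic reasoning and manipulation | manipulation, Mamba, SSM | |
| 23 | Superpoint Gaussian Splatting for Real-Time High-Fidelity Dynamic Scene Reconstruction | Proposes Superpoint Gaussian Splatting for real-time, high-fidelity dynamic scene reconstruction | manipulation, 3D gaussian splatting, gaussian splatting | |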

🔬 Pillar 7: Motion Retargeting (1 paper)

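| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 24 | LLplace: The 3D Indoor Scene Layout Generation and Editing via Large Language Model | LLplace: 3D indoor scene layout generation and editing based on a large language model | spatial relationship, large language model | |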
