cs.CV (2024-09-19)

📊 23 papers in total | 🔗 8 with code

🎯 Interest Area Navigation

Pillar 3: Spatial Perception & Semantics (9 🔗1) · Pillar 9: Embodied Foundation Models (8 🔗4) · Pillar 2: RL Algorithms & Architecture (4 🔗2) · Pillar 5: Interaction & Reaction (1 🔗1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 3: Spatial Perception & Semantics (9 papers)

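# | Title | One-sentence summary | Tags
1 | DrivingForward: Feed-forward 3D Gaussian Splatting for Driving Scene Reconstruction from Flexible Surround-view Input | DrivingForward: driving scene reconstruction from flexible surround-view input using feed-forward 3D Gaussian Splatting. | 3D gaussian splatting, gaussian splatting, splatting
2 | 3DGS-LM: Faster Gaussian-Splatting Optimization with Levenberg-Marquardt | Proposes 3DGS-LM, which uses the Levenberg-Marquardt algorithm to accelerate 3D Gaussian Splatting optimization. | 3D gaussian splatting, 3DGS, gaussian splatting
3 | Spectral-GS: Taming 3D Gaussian Splatting with Spectral Entropy | Spectral-GS: uses spectral entropy to improve 3D Gaussian Splatting, addressing artifacts and blur. | 3D gaussian splatting, gaussian splatting, splatting
4 | GaRField++: Reinforced Gaussian Radiance Fields for Large-Scale 3D Scene Reconstruction | GaRField++: large-scale 3D scene reconstruction based on reinforced Gaussian radiance fields. | 3D gaussian splatting, 3DGS, gaussian splatting
5 | GStex: Per-Primitive Texturing of 2D Gaussian Splatting for Decoupled Appearance and Geometry Modeling | GStex: textures each Gaussian primitive to decouple appearance from geometry modeling, improving novel view synthesis quality. | gaussian splatting, splatting, scene reconstruction
6 | EdgeGaussians -- 3D Edge Mapping via Gaussian Splatting | EdgeGaussians: proposes an explicit 3D edge mapping method based on Gaussian Splatting, improving edge reconstruction efficiency. | gaussian splatting, splatting, implicit representation
7 | End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting | Proposes an end-to-end open-vocabulary video visual relationship detection framework based on multi-modal prompting. | open-vocabulary, open vocabulary
8 | UL-VIO: Ultra-lightweight Visual-Inertial Odometry with Noise Robust Test-time Adaptation | Proposes UL-VIO, an ultra-lightweight, noise-robust visual-inertial odometry method that supports test-time adaptation. | VIO
9 | Interpretable Action Recognition on Hard to Classify Actions | Proposes an interpretable, 3D-aware action recognition model for easily confused actions. | depth estimation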

🔬 Pillar 9: Embodied Foundation Models (8 papers)

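# | Title | One-sentence summary | Tags
10 | InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning | Proposes InfiMM-WebMath-40B, a multimodal pre-training dataset for mathematical reasoning, to improve the math ability of large language models. | large language model, multimodal
11 | JourneyBench: A Challenging One-Stop Vision-Language Understanding Benchmark of Generated Images | Proposes JourneyBench, a comprehensive benchmark for evaluating vision-language understanding of generated images. | large language model, multimodal, chain-of-thought
12 | Leveraging Retrieval Augment Approach for Multimodal Emotion Recognition Under Missing Modalities | Proposes the RAMER framework, which uses retrieval augmentation for multimodal emotion recognition under missing modalities. | multimodal
13 | MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines | Proposes MMSearch to evaluate the potential of large models as multimodal search engines. | large language model, multimodal
14 | Enhancing Perception of Key Changes in Remote Sensing Image Change Captioning | Proposes the KCFI framework, which uses key change feature guidance and instruction tuning to improve the accuracy of remote sensing image change captioning. | large language model, multimodal
15 | Interpolating Video-LLMs: Toward Longer-sequence LMMs in a Training-free Manner | Proposes INTP-Video-LLMs, which extends the ability of Video-LLMs to handle long videos without training. | large language model
16 | Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution | Oryx: proposes an MLLM for on-demand spatial-temporal understanding that handles visual data at arbitrary resolution. | multimodal
17 | Frequency-Guided Spatial Adaptation for Camouflaged Object Detection | Proposes a frequency-guided spatial adaptation method to improve camouflaged object detection performance. | foundation model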

🔬 Pillar 2: RL Algorithms & Architecture (4 papers)

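# | Title | One-sentence summary | Tags
18 | MambaRecon: MRI Reconstruction with Structured State Space Models | MambaRecon: uses structured state space models to accelerate MRI reconstruction. | Mamba, state space model
19 | MambaClinix: Hierarchical Gated Convolution and Mamba-Based U-Net for Enhanced 3D Medical Image Segmentation | MambaClinix: a U-Net combining hierarchical gated convolution with Mamba to improve 3D medical image segmentation. | Mamba, SSM, state space model
20 | Bayesian-Optimized One-Step Diffusion Model with Knowledge Distillation for Real-Time 3D Human Motion Prediction | Proposes a one-step diffusion model based on Bayesian optimization and knowledge distillation for real-time 3D human motion prediction. | distillation
21 | Towards Low-latency Event-based Visual Recognition with Hybrid Step-wise Distillation Spiking Neural Networks | Proposes a hybrid step-wise distillation SNN to address the trade-off between low latency and high accuracy in event-camera visual recognition. | distillation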

🔬 Pillar 5: Interaction & Reaction (1 paper)

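# | Title | One-sentence summary | Tags
22 | HSIGene: A Foundation Model For Hyperspectral Image Generation | Proposes HSIGene, a foundation model with multi-condition control for hyperspectral image generation. | HSI, foundation model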

🔬 Pillar 8: Physics-based Animation (1 paper)

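# | Title | One-sentence summary | Tags
23 | EventDance++: Language-guided Unsupervised Source-free Cross-modal Adaptation for Event-based Object Recognition | EventDance++: proposes a language-guided, unsupervised, source-free cross-modal adaptation method for event-based object recognition. | spatiotemporal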
