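cs.CV (2024-11-12)
📊 20 papers in total | 🔗 4 with code
🎯 Interest Area Navigation
Pillar 3: Spatial Perception and Semantics (Perception & Semantics) (8 🔗1)
Pillar 9: Embodied Large Models (Embodied Foundation Models) (5 🔗2)
Pillar 2: RL Algorithms and Architecture (RL & Architecture) (5 🔗1)
Pillar 8: Physical Animation (Physics-based Animation) (1)
Pillar 6: Video Extraction and Matching (Video Extraction) (1)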
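🔬 Pillar 3: Spatial Perception and Semantics (Perception & Semantics) (8 papers)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | GaussianCut: Interactive segmentation via graph cut for 3D Gaussian Splatting | GaussianCut: interactive segmentation of 3D Gaussian Splatting scenes via graph cut | 3D gaussian splatting, 3DGS, gaussian splatting | | |
| 2 | HiCoM: Hierarchical Coherent Motion for Streamable Dynamic Scene with 3D Gaussian Splatting | HiCoM: a hierarchical coherent motion method based on 3D Gaussian Splatting for streamable dynamic scenes | 3D gaussian splatting, 3DGS, gaussian splatting | | |
| 3 | GUS-IR: Gaussian Splatting with Unified Shading for Inverse Rendering | GUS-IR: an inverse rendering framework that combines unified shading with Gaussian Splatting, suited to complex scenes | 3D gaussian splatting, 3DGS, gaussian splatting | | |
| 4 | DPU: Dynamic Prototype Updating for Multimodal Out-of-Distribution Detection | Proposes the Dynamic Prototype Updating (DPU) framework to address intra-class variation in multimodal OOD detection | optical flow, multimodal | | |
| 5 | Material Transforms from Disentangled NeRF Representations | Proposes a material transformation method based on disentangled NeRF representations, enabling cross-scene material editing | NeRF, neural radiance field | ✅ | |
| 6 | Projecting Gaussian Ellipsoids While Avoiding Affine Projection Approximation | Proposes a 3D Gaussian Splatting method based on ellipsoid projection, improving rendering quality in novel view synthesis | 3D gaussian splatting, gaussian splatting, splatting | | |
| 7 | Scaling Properties of Diffusion Models for Perceptual Tasks | Exploits the scalability of diffusion models to address perceptual tasks such as depth estimation, optical flow, and amodal segmentation in a unified way | depth estimation, optical flow | | |
| 8 | ALOcc: Adaptive Lifting-Based 3D Semantic Occupancy and Cost Volume-Based Flow Predictions | Proposes adaptive lifting-based 3D semantic occupancy prediction and cost volume-based flow prediction | scene understanding, spatiotemporal | | |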
🔬 Pillar 9: Embodied Large Models (Embodied Foundation Models) (5 papers)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 9 | MSEG-VCUQ: Multimodal SEGmentation with Enhanced Vision Foundation Models, Convolutional Neural Networks, and Uncertainty Quantification for High-Speed Video Phase Detection Data | Proposes MSEG-VCUQ, which combines vision foundation models with CNNs to tackle segmentation of high-speed video phase detection data | foundation model, multimodal | | |
| 10 | JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation | JanusFlow: harmonizes autoregression and rectified flow for unified multimodal understanding and generation | large language model, multimodal | | |
| 11 | ImageRAG: Enhancing Ultra High Resolution Remote Sensing Imagery Analysis with ImageRAG | ImageRAG: improves analysis of ultra-high-resolution remote sensing imagery through image retrieval-augmented generation | large language model, multimodal | ✅ | |
| 12 | SimBase: A Simple Baseline for Temporal Video Grounding | SimBase: a simple and effective baseline for temporal video grounding | multimodal | | |
| 13 | BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions | BLIP3-KALE: proposes a knowledge-augmented, large-scale dense image caption dataset that improves vision-language model performance | multimodal | ✅ | |
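🔬 Pillar 2: RL Algorithms and Architecture (RL & Architecture) (5 papers)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 14 | Aligning Visual Contrastive learning models via Preference Optimization | Proposes a preference optimization-based alignment method for contrastive learning models, improving robustness and fairness | reinforcement learning, RLHF, DPO | | |
| 15 | GaussianAnything: Interactive Point Cloud Flow Matching For 3D Object Generation | GaussianAnything: interactive point cloud flow matching for 3D object generation | flow matching | | |
| 16 | Breaking the Low-Rank Dilemma of Linear Attention | Proposes the Rank-Augmented Linear Attention (RALA) mechanism to break the low-rank dilemma of linear attention | linear attention | ✅ | |
| 17 | Flow Matching Posterior Sampling: A Training-free Conditional Generation for Flow Matching | Proposes a training-free conditional generation method based on flow matching posterior sampling, broadening the applicability of flow matching models | flow matching | | |
| 18 | Quantifying Knowledge Distillation Using Partial Information Decomposition | Proposes the Redundant Information Distillation (RID) framework, improving the robustness and effectiveness of knowledge distillation under noisy teacher models | distillation | | |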
🔬 Pillar 8: Physical Animation (Physics-based Animation) (1 paper)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 19 | A Novel Automatic Real-time Motion Tracking Method in MRI-guided Radiotherapy Using Enhanced Tracking-Learning-Detection Framework with Automatic Segmentation | Proposes the ETLD+ICV framework for automatic, real-time, markerless motion tracking and segmentation in MRI-guided radiotherapy | motion tracking | | |
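🔬 Pillar 6: Video Extraction and Matching (Video Extraction) (1 paper)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 20 | CameraHMR: Aligning People with Perspective | CameraHMR: improves the accuracy of human pose and shape estimation from monocular images through perspective alignment | SMPL | | |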