cs.CV(2024-12-06)
📊 27 papers in total | 🔗 5 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (9 🔗1)
Pillar 2: RL & Architecture (7 🔗3)
Pillar 3: Perception & Semantics (6 🔗1)
Pillar 1: Robot Control (3)
Pillar 6: Video Extraction (1)
Pillar 4: Generative Motion (1)
🔬 Pillar 9: Embodied Foundation Models (9 papers)
🔬 Pillar 2: RL & Architecture (7 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 10 | SoPo: Text-to-Motion Generation Using Semi-Online Preference Optimization | Proposes SoPo, a semi-online preference optimization method for text-to-motion generation. | DPO MDM text-to-motion | ✅ | |
| 11 | Momentum-GS: Momentum Gaussian Self-Distillation for High-Quality Large Scene Reconstruction | Proposes momentum Gaussian self-distillation for high-quality large-scale scene reconstruction. | distillation 3D gaussian splatting gaussian splatting | ✅ | |
| 12 | EACO: Enhancing Alignment in Multimodal LLMs via Critical Observation | EACO: enhances alignment in multimodal LLMs via critical observation. | DPO direct preference optimization large language model | | |
| 13 | PanoDreamer: Optimization-Based Single Image to 360 3D Scene With Diffusion | PanoDreamer: an optimization-based method that uses diffusion models to turn a single image into a 360° 3D scene. | dreamer depth estimation scene reconstruction | | |
| 14 | Birth and Death of a Rose | Uses pretrained 2D diffusion models to generate temporally evolving object intrinsics, such as a rose blooming. | distillation foundation model | | |
| 15 | Salvaging the Overlooked: Leveraging Class-Aware Contrastive Learning for Multi-Class Anomaly Detection | Proposes class-aware contrastive learning to resolve inter-class confusion in multi-class anomaly detection. | contrastive learning | | |
| 16 | SimC3D: A Simple Contrastive 3D Pretraining Framework Using RGB Images | SimC3D: a simple contrastive 3D pretraining framework based on RGB images that improves downstream task performance. | contrastive learning depth estimation | ✅ | |
🔬 Pillar 3: Perception & Semantics (6 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 17 | Pushing Rendering Boundaries: Hard Gaussian Splatting | Proposes Hard Gaussian Splatting to address artifacts in 3DGS and improve novel view synthesis quality. | 3D gaussian splatting 3DGS gaussian splatting | | |
| 18 | MixedGaussianAvatar: Realistically and Geometrically Accurate Head Avatar via Mixed 2D-3D Gaussians | Proposes MixedGaussianAvatar, which uses mixed 2D-3D Gaussians for realistic and geometrically accurate head avatar reconstruction. | 3D gaussian splatting 3DGS gaussian splatting | | |
| 19 | $S^3$: Synonymous Semantic Space for Improving Zero-Shot Generalization of Vision-Language Models | Proposes the Synonymous Semantic Space ($S^3$) to improve zero-shot generalization of vision-language models. | open-vocabulary open vocabulary large language model | | |
| 20 | Extrapolated Urban View Synthesis Benchmark | Proposes the EUVS benchmark for evaluating how well extrapolated view synthesis algorithms generalize in urban scenes. | 3D gaussian splatting gaussian splatting splatting | | |
| 21 | Perturb-and-Revise: Flexible 3D Editing with Generative Trajectories | Perturb-and-Revise: a flexible NeRF-based 3D editing method built on generative trajectories. | NeRF | ✅ | |
| 22 | Spatially-Adaptive Hash Encodings For Neural Surface Reconstruction | Proposes spatially adaptive hash encodings for neural surface reconstruction, enabling more accurate geometry recovery. | scene reconstruction | | |
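🔬 Pillar 1: Robot Control (3 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 23 | BimArt: A Unified Approach for the Synthesis of 3D Bimanual Interaction with Articulated Objects | BimArt: a unified approach for synthesizing 3D bimanual interaction with articulated objects. | manipulation bi-manual | | |
| 24 | DreamColour: Controllable Video Colour Editing without Training | DreamColour: a training-free framework for controllable video colour editing that improves editing quality and efficiency. | manipulation | | |
| 25 | How to Squeeze An Explanation Out of Your Model | Proposes a model-agnostic explainability method based on SE modules, applicable to image and video/multimodal biometric recognition. | manipulation | | |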
🔬 Pillar 6: Video Extraction (1 paper)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 26 | GS-Matching: Reconsidering Feature Matching task in Point Cloud Registration | Proposes the GS-Matching strategy to address non-optimal feature matching in point cloud registration. | feature matching | | |
🔬 Pillar 4: Generative Motion (1 paper)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 27 | CigTime: Corrective Instruction Generation Through Inverse Motion Editing | CigTime: generates corrective instructions through inverse motion editing, for motor skill learning. | motion generation large language model | | |