cs.CV (2025-01-07)
📊 23 papers in total | 🔗 5 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (8 🔗2)
Pillar 2: RL & Architecture (7 🔗3)
Pillar 3: Perception & Semantics (4)
Pillar 1: Robot Control (2)
Pillar 6: Video Extraction (1)
Pillar 7: Motion Retargeting (1)
🔬 Pillar 9: Embodied Foundation Models (8 papers)
🔬 Pillar 2: RL & Architecture (7 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 9 | ConcealGS: Concealing Invisible Copyright Information in 3D Gaussian Splatting | ConcealGS: proposes a method for concealing invisible copyright information in 3D Gaussian Splatting | distillation, 3D gaussian splatting, gaussian splatting | | |
| 10 | CL3DOR: Contrastive Learning for 3D Large Multimodal Models via Odds Ratio on High-Resolution Point Clouds | CL3DOR: improves 3D large multimodal models through odds-ratio contrastive learning on high-resolution point clouds | contrastive learning, scene understanding, large language model | | |
| 11 | NeuralSVG: An Implicit Representation for Text-to-Vector Generation | NeuralSVG: proposes a text-to-vector generation method based on an implicit representation, producing more structured and flexible SVGs | distillation, NeRF, neural radiance field | | |
| 12 | LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving | LargeAD: a large-scale cross-sensor data pretraining framework for autonomous driving | representation learning, contrastive learning, scene understanding | | |
| 13 | Cosmos World Foundation Model Platform for Physical AI | NVIDIA presents the Cosmos World Foundation Model Platform to help Physical AI developers build customized world models | world model, foundation model | ✅ | |
| 14 | Information-Maximized Soft Variable Discretization for Self-Supervised Image Representation Learning | Proposes Information-Maximized Soft Variable Discretization (IMSVD), a self-supervised image representation learning method | representation learning, contrastive learning, foundation model | ✅ | |
| 15 | An Empirical Study of Accuracy-Robustness Tradeoff and Training Efficiency in Self-Supervised Learning | Proposes CF-AMC-SSL, which accelerates convergence in self-supervised learning and improves the balance between accuracy and robustness | representation learning, contrastive learning | ✅ | |
🔬 Pillar 3: Perception & Semantics (4 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 16 | DehazeGS: Seeing Through Fog with 3D Gaussian Splatting | Proposes DehazeGS, which uses 3D Gaussian Splatting to dehaze foggy images and synthesize high-quality novel views | 3D gaussian splatting, gaussian splatting, splatting | | |
| 17 | MoDec-GS: Global-to-Local Motion Decomposition and Temporal Interval Adjustment for Compact Dynamic 3D Gaussian Splatting | MoDec-GS: a compact dynamic 3D Gaussian Splatting method with global-to-local motion decomposition for complex dynamic scenes | 3D gaussian splatting, 3DGS, gaussian splatting | | |
| 18 | NeRFs are Mirror Detectors: Using Structural Similarity for Multi-View Mirror Scene Reconstruction with 3D Surface Primitives | NeRF-MD: uses structural similarity for 3D surface reconstruction of multi-view mirror scenes | NeRF, neural radiance field, scene reconstruction | | |
| 19 | ZDySS -- Zero-Shot Dynamic Scene Stylization using Gaussian Splatting | Proposes ZDySS, which uses Gaussian Splatting for zero-shot style transfer of dynamic scenes | gaussian splatting, splatting | | |
🔬 Pillar 1: Robot Control (2 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 20 | Advancing the Understanding of Fine-Grained 3D Forest Structures using Digital Cousins and Simulation-to-Reality: Methods and Datasets | Proposes a framework based on Digital Cousins and Sim2Real for synthesizing 3D forest structures, and builds Boreal3D, a large-scale forest point cloud dataset | sim2real, scene understanding | | |
| 21 | Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control | DaS: uses a 3D-aware video diffusion model for versatile video generation control | manipulation | | |
🔬 Pillar 6: Video Extraction (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 22 | Graph-Based Multimodal and Multi-view Alignment for Keystep Recognition | Proposes a graph-learning-based multimodal, multi-view alignment framework that improves keystep recognition accuracy in egocentric videos | egocentric, multimodal | | |
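🔬 Pillar 7: Motion Retargeting (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 23 | Extraction Of Cumulative Blobs From Dynamic Gestures | Proposes a dynamic gesture recognition method based on a night-vision camera, addressing gesture interaction in low-light environments | human motion | | |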