cs.CV (2024-05-07)
📊 19 papers in total | 🔗 5 with code
🎯 Interest Area Navigation
Pillar 3: Perception & Semantics (6 🔗1)
Pillar 2: RL & Architecture (5 🔗1)
Pillar 9: Embodied Foundation Models (3)
Pillar 1: Robot Control (2 🔗2)
Pillar 8: Physics-based Animation (1 🔗1)
Pillar 5: Interaction & Reaction (1)
Pillar 7: Motion Retargeting (1)
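🔬 Pillar 3: Perception & Semantics (6 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Novel View Synthesis with Neural Radiance Fields for Industrial Robot Applications | Proposes a NeRF novel view synthesis method based on robot kinematics for industrial robot applications. | NeRF, neural radiance field, scene reconstruction | | |
| 2 | DistGrid: Scalable Scene Reconstruction with Distributed Multi-resolution Hash Grid | Proposes DistGrid, which uses a distributed multi-resolution hash grid for large-scale scene reconstruction. | NeRF, neural radiance field, scene reconstruction | | |
| 3 | Edit-Your-Motion: Space-Time Diffusion Decoupling Learning for Video Motion Editing | Edit-Your-Motion: space-time decoupled diffusion learning for video motion editing, addressing poor generalization. | implicit representation, human motion | | |
| 4 | Radar Fields: Frequency-Space Neural Scene Representations for FMCW Radar | Proposes Radar Fields, a frequency-space neural scene representation for FMCW radar that enables scene reconstruction in adverse weather. | scene reconstruction | | |
| 5 | Tactile-Augmented Radiance Fields | Proposes Tactile-Augmented Radiance Fields (TaRF), which fuse visual and tactile information for 3D scene reconstruction and perception. | neural radiance field | ✅ | |
| 6 | Light Field Compression Based on Implicit Neural Representation | Proposes a light field compression scheme based on implicit neural representation that effectively reduces inter-view redundancy. | implicit representation | | |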
🔬 Pillar 2: RL & Architecture (5 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 7 | DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving | DriveWorld: 4D pre-trained scene understanding via world models for autonomous driving. | world model, latent dynamics, representation learning | | |
| 8 | VMambaCC: A Visual State Space Model for Crowd Counting | Proposes VMambaCC, which applies a visual state space model to crowd counting. | Mamba, state space model | | |
| 9 | Vision Mamba: A Comprehensive Survey and Taxonomy | A comprehensive survey and taxonomy of Mamba models in vision, intended to promote their application to vision tasks. | Mamba, SSM, state space model | ✅ | |
| 10 | ELiTe: Efficient Image-to-LiDAR Knowledge Transfer for Semantic Segmentation | Proposes ELiTe, which improves semantic segmentation through efficient image-to-LiDAR knowledge transfer. | representation learning, distillation, foundation model | | |
| 11 | Choose What You Need: Disentangled Representation Learning for Scene Text Recognition, Removal and Editing | Proposes the DARLING framework, which disentangles the style and content features of scene text images to improve recognition, removal, and editing. | representation learning | | |
🔬 Pillar 9: Embodied Foundation Models (3 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 12 | Leveraging Medical Foundation Model Features in Graph Neural Network-Based Retrieval of Breast Histopathology Images | Proposes a graph neural network-based retrieval method for breast histopathology images that uses features from pre-trained medical models. | foundation model | | |
| 13 | Sign2GPT: Leveraging Large Language Models for Gloss-Free Sign Language Translation | Sign2GPT: leverages large language models for gloss-free sign language translation. | large language model | | |
| 14 | Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks | Visual instruction tuning makes LLMs more prone to jailbreak attacks, compromising their safety. | large language model | | |
🔬 Pillar 1: Robot Control (2 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 15 | Diff-IP2D: Diffusion-Based Hand-Object Interaction Prediction on Egocentric Videos | Proposes Diff-IP2D, which uses a diffusion model to predict hand-object interaction in egocentric videos, addressing error accumulation in unidirectional prediction. | manipulation, affordance, egocentric | ✅ | |
| 16 | SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing | SEED-Data-Edit: a hybrid dataset for instructional image editing that improves the flexibility of image manipulation. | manipulation, large language model, multimodal | ✅ | |
🔬 Pillar 8: Physics-based Animation (1 paper)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 17 | ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers | ViewFormer: explores spatiotemporal modeling for multi-view 3D occupancy perception using view-guided Transformers. | spatiotemporal | ✅ | |
🔬 Pillar 5: Interaction & Reaction (1 paper)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 18 | ChatHuman: Chatting about 3D Humans with Tools | Proposes ChatHuman to address the complexity of 3D human analysis tasks. | human-object interaction, large language model | | |
🔬 Pillar 7: Motion Retargeting (1 paper)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 19 | Non-rigid Structure-from-Motion: Temporally-smooth Procrustean Alignment and Spatially-variant Deformation Modeling | Proposes temporally smooth Procrustean alignment and spatially variant deformation modeling to address non-rigid SfM. | motion recovery | | |