cs.CV(2024-08-17)
📊 12 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
Pillar 3: Spatial Perception & Semantics (5)
Pillar 2: RL Algorithms & Architecture (4)
Pillar 9: Embodied Foundation Models (2 🔗1)
Pillar 8: Physics-based Animation (1)
🔬 Pillar 3: Spatial Perception & Semantics (5 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Gaussian in the Dark: Real-Time View Synthesis From Inconsistent Dark Images Using Gaussian Splatting | Gaussian-DK: real-time view synthesis from inconsistent dark images using Gaussian splatting | 3D gaussian splatting, gaussian splatting, splatting | | |
| 2 | Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community | Proposes the LAE-DINO model to address domain generalization in open-vocabulary object detection for remote sensing imagery | open-vocabulary, open vocabulary | | |
| 3 | HybridOcc: NeRF Enhanced Transformer-based Multi-Camera 3D Occupancy Prediction | HybridOcc: a NeRF-enhanced Transformer for multi-camera 3D occupancy prediction | NeRF | | |
| 4 | GSLAMOT: A Tracklet and Query Graph-based Simultaneous Locating, Mapping, and Multiple Object Tracking System | GSLAMOT: a tracklet and query graph-based system for simultaneous localization, mapping, and multiple object tracking | semantic map, multimodal | | |
| 5 | GoodSAM++: Bridging Domain and Capacity Gaps via Segment Anything Model for Panoramic Semantic Segmentation | GoodSAM++: uses SAM to bridge domain and capacity gaps for panoramic semantic segmentation | semantic map | | |
🔬 Pillar 2: RL Algorithms & Architecture (4 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 6 | MambaTrack: A Simple Baseline for Multiple Object Tracking with State Space Model | Proposes MambaTrack, a simple multiple object tracking baseline built on the Mamba state space model, to handle nonlinear motion | Mamba, SSM, state space model | | |
| 7 | Zero-Shot Object-Centric Representation Learning | Proposes a zero-shot object-centric representation learning framework that improves object discovery on unseen datasets | representation learning, foundation model, zero-shot transfer | | |
| 8 | SSNeRF: Sparse View Semi-supervised Neural Radiance Fields with Augmentation | SSNeRF: an augmentation-based semi-supervised neural radiance field for sparse views, improving NeRF reconstruction quality from few input views | teacher-student, NeRF, neural radiance field | | |
| 9 | DRL-Based Resource Allocation for Motion Blur Resistant Federated Self-Supervised Learning in IoV | Proposes a DRL-based resource allocation scheme for motion-blur-resistant federated self-supervised learning in the IoV | reinforcement learning, deep reinforcement learning, DRL | | |
🔬 Pillar 9: Embodied Foundation Models (2 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 10 | Are CLIP features all you need for Universal Synthetic Image Origin Attribution? | Uses CLIP features for universal synthetic image origin attribution, addressing model attribution in open-set scenarios | foundation model | ✅ | |
| 11 | Segment Anything with Multiple Modalities | MM-SAM: extends SAM to segment multi-modal data, improving robustness across different sensors | foundation model | | |
🔬 Pillar 8: Physics-based Animation (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 12 | Flatten: Video Action Recognition is an Image Classification task | Flatten: recasts video action recognition as an image classification task, improving efficiency and performance | spatiotemporal | | |