cs.CV (2024-11-13)
📊 24 papers in total | 🔗 6 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (8 🔗2)
Pillar 3: Perception & Semantics (6 🔗2)
Pillar 2: RL & Architecture (6 🔗1)
Pillar 5: Interaction & Reaction (1)
Pillar 1: Robot Control (1)
Pillar 4: Generative Motion (1 🔗1)
Pillar 8: Physics-based Animation (1)
🔬 Pillar 9: Embodied Foundation Models (8 papers)
🔬 Pillar 3: Perception & Semantics (6 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 9 | MBA-SLAM: Motion Blur Aware Gaussian Splatting SLAM | Proposes MBA-SLAM for high-accuracy SLAM in motion-blurred scenes, improving camera localization and map reconstruction quality. | 3D gaussian splatting, 3DGS, gaussian splatting | ✅ | |
| 10 | 4D Gaussian Splatting in the Wild with Uncertainty-Aware Regularization | Proposes 4D Gaussian Splatting with uncertainty-aware regularization for dynamic scene reconstruction from in-the-wild monocular videos. | gaussian splatting, splatting, scene reconstruction | | |
| 11 | Biomass phenotyping of oilseed rape through UAV multi-view oblique imaging with 3DGS and SAM model | Combines 3DGS with the SAM model for high-accuracy 3D reconstruction and biomass phenotyping of oilseed rape. | 3D gaussian splatting, 3DGS, gaussian splatting | | |
| 12 | BillBoard Splatting (BBSplat): Learnable Textured Primitives for Novel View Synthesis | BBSplat: a novel view synthesis method based on learnable textured primitives. | 3D gaussian splatting, 3DGS, gaussian splatting | | |
| 13 | Towards More Accurate Fake Detection on Images Generated from Advanced Generative and Neural Rendering Models | Proposes an unsupervised method based on Fourier spectrum magnitude feature extraction to improve the accuracy of fake detection on neurally rendered images. | 3D gaussian splatting, gaussian splatting, splatting | | |
| 14 | OSMLoc: Single Image-Based Visual Localization in OpenStreetMap with Fused Geometric and Semantic Guidance | OSMLoc: single-image visual localization in OpenStreetMap with fused geometric and semantic guidance. | depth estimation, monocular depth, scene understanding | ✅ | |
🔬 Pillar 2: RL & Architecture (6 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 15 | Multimodal Instruction Tuning with Hybrid State Space Models | Proposes a hybrid Transformer-Mamba model that efficiently handles long multimodal context inputs. | Mamba, state space model, large language model | | |
| 16 | MambaXCTrack: Mamba-based Tracker with SSM Cross-correlation and Motion Prompt for Ultrasound Needle Tracking | Proposes MambaXCTrack to address visibility problems in ultrasound needle tracking. | Mamba, SSM, state space model | | |
| 17 | EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation | Proposes EgoVid-5M, a large-scale first-person video dataset, to improve egocentric video generation. | dreamer, egocentric | | |
| 18 | Scale Contrastive Learning with Selective Attentions for Blind Image Quality Assessment | Proposes the CSFIQA framework, which uses selective attention and contrastive learning to improve blind image quality assessment. | contrastive learning | | |
| 19 | Dual-Head Knowledge Distillation: Enhancing Logits Utilization with an Auxiliary Head | Proposes Dual-Head Knowledge Distillation (DHKD) to address under-utilization of logit information and classification head collapse. | distillation | ✅ | |
| 20 | A survey on Graph Deep Representation Learning for Facial Expression Recognition | Survey: applications of graph deep representation learning to facial expression recognition. | representation learning | | |
🔬 Pillar 5: Interaction & Reaction (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 21 | CoMiX: Cross-Modal Fusion with Deformable Convolutions for HSI-X Semantic Segmentation | Proposes CoMiX, which uses deformable convolutions for cross-modal fusion to improve hyperspectral image semantic segmentation. | HSI, multimodal | | |
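🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 22 | A Survey on Vision Autoregressive Model | Surveys vision autoregressive models, covering image generation, video generation, and unified multimodal generation tasks. | manipulation, motion generation, multimodal | | |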
🔬 Pillar 4: Generative Motion (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 23 | Motion Control for Enhanced Complex Action Video Generation | MVideo: a complex-action video generation framework based on mask-sequence motion control. | motion generation | ✅ | |
🔬 Pillar 8: Physics-based Animation (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 24 | MikuDance: Animating Character Art with Mixed Motion Dynamics | MikuDance: a diffusion model for animating character art that incorporates mixed motion dynamics. | motion tracking | | |