cs.CV (2024-10-16)
📊 10 papers total | 🔗 2 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (4 🔗2)
Pillar 2: RL & Architecture (3)
Pillar 3: Perception & Semantics (2)
Pillar 8: Physics-based Animation (1)
🔬 Pillar 9: Embodied Foundation Models (4 papers)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio | Proposes the CMM benchmark to systematically evaluate hallucinations of large multimodal models across language, visual, and audio modalities. | multimodal | | |
| 2 | VividMed: Vision Language Model with Versatile Visual Grounding for Medicine | VividMed: a vision language model for the medical domain with versatile visual grounding. | visual grounding | ✅ | |
| 3 | Automatic Mapping of Anatomical Landmarks from Free-Text Using Large Language Models: Insights from Llama-2 | Uses Llama-2 to automatically map anatomical landmarks in medical imaging reports, improving medical imaging workflow efficiency. | large language model | | |
| 4 | DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception | DocLayout-YOLO: enhances document layout analysis through diverse synthetic data and adaptive receptive fields. | multimodal | ✅ | |
🔬 Pillar 2: RL & Architecture (3 papers)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 5 | MambaBEV: An efficient 3D detection model with Mamba2 | MambaBEV: uses Mamba2 to improve the efficiency and accuracy of BEV-based 3D object detection. | Mamba, SSM, state space model | | |
| 6 | GAN Based Top-Down View Synthesis in Reinforcement Learning Environments | Proposes a GAN-based top-down view synthesis method to enhance agent perception in reinforcement learning environments. | reinforcement learning, first-person view | | |
| 7 | MuVi: Video-to-Music Generation with Semantic Alignment and Rhythmic Synchronization | MuVi: a video-to-music generation framework based on semantic alignment and rhythmic synchronization. | flow matching, visual pre-training | | |
🔬 Pillar 3: Perception & Semantics (2 papers)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
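| 8 | EG-HumanNeRF: Efficient Generalizable Human NeRF Utilizing Human Prior for Sparse View | Proposes EG-HumanNeRF, which uses human prior knowledge to efficiently generate high-quality, generalizable human NeRF models from sparse views. | NeRF, neural radiance field | | |
| 9 | Radon Implicit Field Transform (RIFT): Learning Scenes from Radar Signals | Proposes the Radon Implicit Field Transform (RIFT), which learns scene representations from radar signals and reduces data acquisition cost. | scene reconstruction | | |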
🔬 Pillar 8: Physics-based Animation (1 paper)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 10 | UniCoN: Universal Conditional Networks for Multi-Age Embryonic Cartilage Segmentation with Sparsely Annotated Data | UniCoN: universal conditional networks for multi-age embryonic cartilage segmentation with sparsely annotated data. | UniCon | | |