cs.CV (2025-05-28)
📊 8 papers total | 🔗 2 with code
🎯 Interest Area Navigation
Pillar 2: RL & Architecture (3)
Pillar 3: Perception & Semantics (2 🔗2)
Pillar 9: Embodied Foundation Models (2)
Pillar 4: Generative Motion (1)
🔬 Pillar 2: RL & Architecture (3 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Research on Driving Scenario Technology Based on Multimodal Large Lauguage Model Optimization | Proposes a multimodal model optimization method for understanding complex driving scenarios | distillation, multimodal | | |
| 2 | CAST: Contrastive Adaptation and Distillation for Semi-Supervised Instance Segmentation | Proposes the CAST framework for semi-supervised instance segmentation | distillation, foundation model | | |
| 3 | InfoSAM: Fine-Tuning the Segment Anything Model from An Information-Theoretic Perspective | Proposes InfoSAM to improve SAM's performance in specialized domains | distillation, foundation model | | |
🔬 Pillar 3: Perception & Semantics (2 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
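| 4 | Diffusion-Denoised Hyperspectral Gaussian Splatting | Proposes a diffusion-denoised hyperspectral Gaussian splatting method for hyperspectral imaging reconstruction | 3D gaussian splatting, 3DGS, gaussian splatting | ✅ | |
| 5 | Test-Time Adaptation of Vision-Language Models for Open-Vocabulary Semantic Segmentation | Proposes multi-level, multi-prompt entropy minimization for open-vocabulary semantic segmentation | open-vocabulary, open vocabulary | ✅ | |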
🔬 Pillar 9: Embodied Foundation Models (2 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 6 | OSPO: Object-centric Self-improving Preference Optimization for Text-to-Image Generation | Proposes the OSPO framework to address object alignment in text-to-image generation | large language model, multimodal | | |
| 7 | Beyond Perception: Evaluating Abstract Visual Reasoning through Multi-Stage Task | Proposes the MultiStAR benchmark for evaluating abstract visual reasoning | large language model, multimodal | | |
🔬 Pillar 4: Generative Motion (1 paper)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 8 | UniMoGen: Universal Motion Generation | Proposes UniMoGen to address skeleton dependence in motion generation | motion generation, character animation | | |