cs.CV (2025-09-21)
📊 20 papers in total | 🔗 6 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (9 🔗3)
Pillar 3: Spatial Perception & Semantics (6 🔗2)
Pillar 2: RL Algorithms & Architecture (5 🔗1)
🔬 Pillar 9: Embodied Foundation Models (9 papers)
🔬 Pillar 3: Spatial Perception & Semantics (6 papers)
🔬 Pillar 2: RL Algorithms & Architecture (5 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 16 | ME-Mamba: Multi-Expert Mamba with Efficient Knowledge Capture and Fusion for Multimodal Survival Analysis | Proposes ME-Mamba for multimodal survival analysis that efficiently fuses pathology images and genomic data. | Mamba multimodal | | |
| 17 | VaseVQA: Multimodal Agent and Benchmark for Ancient Greek Pottery | Proposes the VaseVQA benchmark and the VaseVL model to improve the expert-level understanding of ancient Greek pottery in multimodal large models. | reinforcement learning multimodal | ✅ | |
| 18 | From Easy to Hard: The MIR Benchmark for Progressive Interleaved Multi-Image Reasoning | Proposes the MIR benchmark for evaluating multimodal large language models on interleaved multi-image reasoning. | curriculum learning large language model | | |
| 19 | PRISM: Precision-Recall Informed Data-Free Knowledge Distillation via Generative Diffusion | PRISM: precision-recall informed, data-free knowledge distillation via generative diffusion models. | distillation | | |
| 20 | Learning from Gene Names, Expression Values and Images: Contrastive Masked Text-Image Pretraining for Spatial Transcriptomics Representation Learning | Proposes the CoMTIP framework for representation learning in spatial transcriptomics, based on contrastive masked text-image pretraining. | representation learning | | |