cs.CV (2026-05-04)

📊 11 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (6) · Pillar 9: Embodied Foundation Models (5 🔗1)

🔬 Pillar 2: RL Algorithms & Architecture (6 papers)

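| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | OphMAE: Bridging Volumetric and Planar Imaging with a Foundation Model for Adaptive Ophthalmological Diagnosis | OphMAE: adaptive diagnosis with a multimodal ophthalmic imaging foundation model | masked autoencoder, metric depth, foundation model | |
| 2 | Enhancing Multimodal In-Context Learning via Inductive-Deductive Reasoning | Proposes a multimodal in-context learning framework based on inductive-deductive reasoning to improve vision-language model performance | reinforcement learning, multimodal, chain-of-thought | |
| 3 | Representation learning from OCT images | Survey of representation learning methods for OCT images, from deep learning to vision-language models | representation learning, foundation model, multimodal | |
| 4 | Ultrasound Vision-Language Alignment via Contrastive Learning | Proposes EchoCare-CLIP, which aligns ultrasound images with clinical text via contrastive learning | contrastive learning, foundation model | |
| 5 | Mixture Prototype Flow Matching for Open-Set Supervised Anomaly Detection | Proposes the Mixture Prototype Flow Matching (MPFM) framework to address multimodal modeling in open-set supervised anomaly detection | flow matching | |
| 6 | FLoRA: Fusion-Latent for Optical Reconstruction and Flood Area Segmentation via Cross-Modal Multi-Task Distillation Network | FLoRA: a cross-modal distillation network with a fused latent space for optical reconstruction and flood area segmentation | distillation | |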

🔬 Pillar 9: Embodied Foundation Models (5 papers)

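| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 7 | MultiSense-Pneumo: A Multimodal Learning Framework for Pneumonia Screening in Resource-Constrained Settings | MultiSense-Pneumo: a multimodal pneumonia screening framework for resource-constrained settings | multimodal | |
| 8 | Metric Unreliability in Multimodal Machine Unlearning: A Systematic Analysis and Principled Unified Score | Proposes the Unified Quality Score (UQS) to address inconsistent evaluation metrics in multimodal machine unlearning | multimodal | |
| 9 | Rethinking Electro-Optical Vision Foundation Models for Remote Sensing Retrieval: A Controlled Comparison with Generalist VFM | Evaluates the effectiveness of specialized electro-optical vision foundation models for remote sensing retrieval against generalist vision models | foundation model | |
| 10 | ViewSAM: Learning View-aware Cross-modal Semantics for Weakly Supervised Cross-view Referring Multi-Object Tracking | Proposes ViewSAM, which uses weakly supervised cross-view semantic learning for cross-view referring multi-object tracking | foundation model | |
| 11 | FEAT: Fashion Editing and Try-On from Any Design | FEAT: garment editing and try-on from any design, broadening design sources and supporting complete outfits | multimodal | |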
