| 1 |
OphMAE: Bridging Volumetric and Planar Imaging with a Foundation Model for Adaptive Ophthalmological Diagnosis |
OphMAE: Adaptive diagnosis using a multimodal ophthalmic imaging foundation model |
masked autoencoder, metric depth, foundation model |
|
|
| 2 |
Enhancing Multimodal In-Context Learning via Inductive-Deductive Reasoning |
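Proposes a multimodal in-context learning framework based on inductive-deductive reasoning that improves vision-language model performance |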
reinforcement learning, multimodal, chain-of-thought |
|
|
| 3 |
Representation learning from OCT images |
Survey: representation learning methods for OCT images, ranging from deep learning to vision-language models |
representation learning, foundation model, multimodal |
|
|
| 4 |
Ultrasound Vision-Language Alignment via Contrastive Learning |
Proposes EchoCare-CLIP, which aligns ultrasound images with clinical text through contrastive learning. |
contrastive learning, foundation model |
|
|
| 5 |
Mixture Prototype Flow Matching for Open-Set Supervised Anomaly Detection |
Proposes the Mixture Prototype Flow Matching (MPFM) framework to address multimodal modeling in open-set supervised anomaly detection. |
flow matching |
|
|
| 6 |
FLoRA: Fusion-Latent for Optical Reconstruction and Flood Area Segmentation via Cross-Modal Multi-Task Distillation Network |
FLoRA: A cross-modal distillation network with a fused latent space for optical reconstruction and flood area segmentation |
distillation |
|
|