| 1 |
Multimodal Deep Learning for Stroke Prediction and Detection using Retinal Imaging and Clinical Data |
Proposes a multimodal deep learning method to improve stroke prediction and detection |
foundation model, multimodal |
|
|
| 2 |
AOR: Anatomical Ontology-Guided Reasoning for Medical Large Multimodal Model in Chest X-Ray Interpretation |
Proposes anatomical ontology-guided reasoning to improve chest X-ray interpretation |
multimodal |
|
|
| 3 |
GAME: Learning Multimodal Interactions via Graph Structures for Personality Trait Estimation |
Proposes GAME to address personality trait estimation from short videos |
multimodal |
|
|
| 4 |
DeepSparse: A Foundation Model for Sparse-View CBCT Reconstruction |
Proposes DeepSparse to address the high radiation exposure and computational challenges of sparse-view CBCT reconstruction |
foundation model |
|
|
| 5 |
Detect, Classify, Act: Categorizing Industrial Anomalies with Multi-Modal Large Language Models |
Proposes VELM to address industrial anomaly classification |
large language model |
|
|
| 6 |
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities |
Proposes unified multimodal understanding and generation models to address the independent evolution of the two fields |
multimodal |
✅ |
|
| 7 |
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction |
Proposes Ming-Lite-Uni to address the need for a unified architecture for multimodal interaction |
multimodal |
|
|
| 8 |
Timing Is Everything: Finding the Optimal Fusion Points in Multimodal Medical Imaging |
Proposes a sequential forward search algorithm to optimize fusion timing in multimodal medical imaging |
multimodal |
|
|
| 9 |
Uncertainty-Weighted Image-Event Multimodal Fusion for Video Anomaly Detection |
Proposes an image-event fusion method to address insufficient temporal information in video anomaly detection |
multimodal |
✅ |
|
| 10 |
Using Knowledge Graphs to harvest datasets for efficient CLIP model training |
Uses knowledge graphs to efficiently collect datasets for training CLIP models |
foundation model |
|
|
| 11 |
RGBX-DiffusionDet: A Framework for Multi-Modal RGB-X Object Detection Using DiffusionDet |
Proposes RGBX-DiffusionDet to address multimodal object detection |
multimodal |
|
|
| 12 |
Recent Advances in Out-of-Distribution Detection with CLIP-Like Models: A Survey |
Proposes a new CLIP-based multimodal OOD detection framework to address the limitations of existing methods |
multimodal |
|
|
| 13 |
TeDA: Boosting Vision-Lanuage Models for Zero-Shot 3D Object Retrieval via Testing-time Distribution Alignment |
Proposes TeDA to address 3D object retrieval for unknown categories |
multimodal |
✅ |
|