| 1 |
FaceShield: Explainable Face Anti-Spoofing with Multimodal Large Language Models |
Proposes FaceShield to address face anti-spoofing and improve explainability |
large language model multimodal |
✅ |
|
| 2 |
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset |
BLIP3-o: a family of fully open unified multimodal models, with a comprehensive study of architecture, training, and datasets |
multimodal |
|
|
| 3 |
Denoising and Alignment: Rethinking Domain Generalization for Multimodal Face Anti-Spoofing |
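Proposes the MMDA framework, which improves cross-domain generalization in face anti-spoofing through multimodal denoising and alignment |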
multimodal |
|
|
| 4 |
BioVFM-21M: Benchmarking and Scaling Self-Supervised Vision Foundation Models for Biomedical Image Analysis |
BioVFM: builds and scales self-supervised vision foundation models for biomedical image analysis |
foundation model |
|
|
| 5 |
Zero-Shot Multi-modal Large Language Model v.s. Supervised Deep Learning: A Comparative Study on CT-Based Intracranial Hemorrhage Subtyping |
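Comparative study: zero-shot multimodal large language models underperform supervised deep learning on CT-based intracranial hemorrhage subtyping |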
large language model |
|
|
| 6 |
Bias and Generalizability of Foundation Models across Datasets in Breast Mammography |
Studies the bias and generalizability of pretrained models in breast mammography and proposes a fairness-aware method |
foundation model |
|
|
| 7 |
Relative Drawing Identification Complexity is Invariant to Modality in Vision-Language Models |
Shows that drawing identification complexity in vision-language models is invariant across modalities |
large language model multimodal |
|
|
| 8 |
AMSnet 2.0: A Large AMS Database with AI Segmentation for Net Detection |
Proposes an AI-segmentation-based method for circuit net detection and builds AMSnet 2.0, a large-scale AMS circuit database |
large language model multimodal |
|
|
| 9 |
MetaUAS: Universal Anomaly Segmentation with One-Prompt Meta-Learning |
MetaUAS: universal anomaly segmentation based on one-prompt meta-learning, with no vision-language model required |
foundation model |
✅ |
|
| 10 |
Few-Shot Anomaly-Driven Generation for Anomaly Classification and Segmentation |
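Proposes a few-shot anomaly-driven generation method for anomaly detection and segmentation, improving industrial quality inspection performance |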
foundation model |
✅ |
|
| 11 |
Beyond General Prompts: Automated Prompt Refinement using Contrastive Class Alignment Scores for Disambiguating Objects in Vision-Language Models |
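Proposes an automated prompt refinement method based on contrastive class alignment scores, improving object detection accuracy in vision-language models |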
large language model |
|
|