| 1 |
LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models |
Proposes LongHalQA, a benchmark for evaluating hallucination in multimodal large language models in long-context settings |
large language model multimodal |
✅ |
|
| 2 |
MIRAGE: Multimodal Identification and Recognition of Annotations in Indian General Prescriptions |
MIRAGE: uses multimodal large models to recognize handwritten annotations in Indian general prescriptions |
large language model multimodal |
|
|
| 3 |
Data Adaptive Few-shot Multi Label Segmentation with Foundation Model |
Proposes a data-adaptive few-shot multi-label segmentation method based on a foundation model, improving medical image segmentation performance. |
foundation model |
|
|
| 4 |
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models |
Proposes LOKI, a comprehensive synthetic data detection benchmark that uses large multimodal models. |
multimodal |
✅ |
|
| 5 |
Text4Seg: Reimagining Image Segmentation as Text Generation |
Text4Seg: recasts image segmentation as a text generation task, simplifying the segmentation pipeline. |
large language model multimodal |
|
|
| 6 |
Surgical-LLaVA: Toward Surgical Scenario Understanding via Large Language and Vision Models |
Surgical-LLaVA: surgical scenario understanding via large language and vision models |
large language model instruction following |
|
|
| 7 |
UnSeg: One Universal Unlearnable Example Generator is Enough against All Image Segmentation |
Proposes UnSeg, which uses a universal unlearnable-noise generator to counter image segmentation models |
foundation model |
|
|
| 8 |
Robust 3D Point Clouds Classification based on Declarative Defenders |
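Proposes a robust 3D point cloud classification method based on declarative defenders, improving performance under adversarial attacks. |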
foundation model |
✅ |
|