| 1 |
A Survey on Evaluation of Multimodal Large Language Models |
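Surveys evaluation methods for multimodal large language models to support the development of more reliable artificial general intelligence |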
large language model multimodal |
|
|
| 2 |
Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Large Language Models |
Proposes the DC$^2$ framework, which improves MLLM perception of high-resolution images without any training. |
large language model multimodal |
|
|
| 3 |
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders |
Eagle: explores the design space of mixture-of-encoders in multimodal large language models |
large language model multimodal |
|
|
| 4 |
Does Data-Efficient Generalization Exacerbate Bias in Foundation Models? |
Shows that data-efficient generalization may exacerbate bias in foundation models |
foundation model |
|
|
| 5 |
Using Backbone Foundation Model for Evaluating Fairness in Chest Radiography Without Demographic Data |
Uses a backbone foundation model to evaluate fairness in chest radiography without demographic data |
foundation model |
|
|
| 6 |
Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models |
Leverages open knowledge to improve the task-specific expertise of large language models |
large language model |
✅ |
|
| 7 |
SITransformer: Shared Information-Guided Transformer for Extreme Multimodal Summarization |
SITransformer: proposes a shared-information-guided Transformer for extreme multimodal summarization |
multimodal |
✅ |
|
| 8 |
Benchmarking foundation models as feature extractors for weakly-supervised computational pathology |
Benchmarks pathology foundation models as feature extractors for weakly supervised computational pathology. |
foundation model |
|
|
| 9 |
More Text, Less Point: Towards 3D Data-Efficient Point-Language Understanding |
Proposes GreenPLM, which uses more text data to improve point-language understanding when 3D data is scarce |
large language model |
✅ |
|
| 10 |
CSAD: Unsupervised Component Segmentation for Logical Anomaly Detection |
Proposes CSAD, an unsupervised component segmentation method that improves logical anomaly detection. |
foundation model |
|
|
| 11 |
TagOOD: A Novel Approach to Out-of-Distribution Detection via Vision-Language Representations and Class Center Learning |
TagOOD: a novel out-of-distribution detection method based on vision-language representations and class center learning |
multimodal |
|
|
| 12 |
Kangaroo: A Powerful Video-Language Model Supporting Long-context Video Input |
Kangaroo: a powerful video-language model that supports long-context video input |
large language model |
|
|