| 1 |
ScImage: How Good Are Multimodal Large Language Models at Scientific Text-to-Image Generation? |
ScImage: evaluates the performance of multimodal large language models on scientific text-to-image generation. |
large language model, multimodal |
|
|
| 2 |
Large Multimodal Agents for Accurate Phishing Detection with Enhanced Token Optimization and Cost Reduction |
Proposes a two-stage agent architecture built on large multimodal models for low-cost, high-accuracy phishing website detection |
multimodal |
|
|
| 3 |
Does Few-Shot Learning Help LLM Performance in Code Synthesis? |
Proposes two few-shot example selection methods that improve LLM performance on code generation |
large language model, chain-of-thought |
|
|
| 4 |
Cosmos-LLaVA: Chatting with the Visual Cosmos-LLaVA: Görselle Sohbet Etmek |
Cosmos-LLaVA: builds a Turkish visual instruction model to improve multimodal dialogue capability |
large language model |
|
|
| 5 |
The Asymptotic Behavior of Attention in Transformers |
Characterizes the asymptotic behavior of attention as Transformer depth increases, proving that tokens converge to a single cluster |
large language model |
|
|