| 1 |
TeleTables: A Benchmark for Large Language Models in Telecom Table Interpretation |
TeleTables: a benchmark dataset for evaluating large language models' ability to interpret telecom tables |
large language model multimodal |
|
|
| 2 |
M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG |
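Proposes M4-RAG, a large-scale multilingual, multicultural, multimodal RAG benchmark for evaluating retrieval-augmented VQA across languages and modalities. |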
multimodal |
|
|
| 3 |
Optimizing Medical Question-Answering Systems: A Comparative Study of Fine-Tuned and Zero-Shot Large Language Models with RAG Framework |
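Proposes a RAG-based medical question-answering system that incorporates fine-tuned LLMs to improve accuracy and reduce hallucinations |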
large language model |
|
|
| 4 |
Interleaved Latent Visual Reasoning with Selective Perceptual Modeling |
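Proposes the ILVR framework, which improves multimodal large language model performance through interleaved latent visual reasoning. |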
large language model multimodal |
✅ |
|
| 5 |
ArtistMus: A Globally Diverse, Artist-Centric Benchmark for Retrieval-Augmented Music Question Answering |
Proposes ArtistMus, an artist-centric, globally diverse benchmark for retrieval-augmented music question answering. |
large language model multimodal |
|
|
| 6 |
Structured Reasoning with Tree-of-Thoughts for Bengali Math Word Problems |
Proposes a Tree-of-Thoughts reasoning structure for solving Bengali math word problems |
large language model chain-of-thought |
|
|
| 7 |
Exposing Pink Slime Journalism: Linguistic Signatures and Robust Detection Against LLM-Generated Threats |
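Proposes a robust detection framework targeting the new threat of LLM-generated pink slime journalism, improving detection performance. |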
large language model |
|
|
| 8 |
Efficient Text Classification with Conformal In-Context Learning |
Proposes the Conformal In-Context Learning (CICLe) framework, which improves the efficiency and generalization of LLM text classification. |
large language model |
|
|
| 9 |
Attribute-Aware Controlled Product Generation with LLMs for E-commerce |
Proposes an LLM-based, attribute-aware controlled product generation framework for e-commerce data augmentation. |
large language model |
|
|
| 10 |
Dynamic Alignment for Collective Agency: Toward a Scalable Self-Improving Framework for Open-Ended LLM Alignment |
Proposes a dynamic alignment framework that enables scalable, self-improving LLM alignment toward collective agency |
large language model |
|
|
| 11 |
A Greek Government Decisions Dataset for Public-Sector Analysis and Insight |
Builds a dataset of Greek government decisions and explores its use in public-sector information retrieval and reasoning. |
large language model |
|
|
| 12 |
SEA-SafeguardBench: Evaluating AI Safety in SEA Languages and Cultures |
SEA-SafeguardBench: evaluating AI safety in Southeast Asian languages and cultural contexts |
large language model |
|
|
| 13 |
LMSpell: Neural Spell Checking for Low-Resource Languages |
Proposes LMSpell, a neural spell-checking toolkit for low-resource languages |
large language model |
|
|
| 14 |
SQ-format: A Unified Sparse-Quantized Hardware-friendly Data Format for LLMs |
Proposes SQ-format, a unified sparse-quantized, hardware-friendly data format for LLMs |
large language model |
|
|