| # | Title | Summary | Keywords | ✅ |
|---|-------|---------|----------|----|
| 1 | Question Rephrasing for Quantifying Uncertainty in Large Language Models: Applications in Molecular Chemistry Tasks | Proposes a question-rephrasing-based method for quantifying LLM uncertainty, applied to molecular chemistry tasks | large language model | |
| 2 | Improving Large Language Model (LLM) fidelity through context-aware grounding: A systematic approach to reliability and veracity | Proposes a context-aware grounding framework to improve the reliability and veracity of large language models | large language model | |
| 3 | Large Language Models for Biomedical Text Simplification: Promising But Not There Yet | Studies biomedical text simplification with large language models: promising but not there yet | large language model | |
| 4 | WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models | WalledEval: a comprehensive safety evaluation toolkit for large language models | large language model | ✅ |
| 5 | StructuredRAG: JSON Response Formatting with Large Language Models | StructuredRAG: a benchmark for evaluating LLMs' ability to generate structured JSON output | large language model | |
| 6 | Human Speech Perception in Noise: Can Large Language Models Paraphrase to Improve It? | Proposes a Prompt-and-Select method that uses LLMs to generate speech paraphrases that are easier to understand in noisy conditions | large language model | |
| 7 | EgyBERT: A Large Language Model Pretrained on Egyptian Dialect Corpora | Proposes EgyBERT to address Egyptian dialect processing | large language model | ✅ |
| 8 | SLIM-RAFT: A Novel Fine-Tuning Approach to Improve Cross-Linguistic Performance for Mercosur Common Nomenclature | Proposes SLIM-RAFT to improve LLM performance on cross-linguistic tasks for the Mercosur Common Nomenclature | large language model, chain-of-thought | |
| 9 | EXAONE 3.0 7.8B Instruction Tuned Language Model | LG AI Research releases the EXAONE 3.0 7.8B instruction-tuned language model | large language model, instruction following | ✅ |
| 10 | Optimus: Accelerating Large-Scale Multi-Modal LLM Training by Bubble Exploitation | Optimus accelerates large-scale multimodal LLM training by exploiting GPU bubbles that arise during training | large language model, multimodal | |
| 11 | Speech-MASSIVE: A Multilingual Speech Dataset for SLU and Beyond | Introduces Speech-MASSIVE, a large-scale multilingual speech dataset for spoken language understanding and beyond | foundation model, multimodal | ✅ |
| 12 | Empirical Analysis of Large Vision-Language Models against Goal Hijacking via Visual Prompt Injection | Proposes goal-hijacking attacks via visual prompt injection, revealing security risks in large vision-language models | instruction following | |
| 13 | ConfReady: A RAG based Assistant and Dataset for Conference Checklist Responses | ConfReady: a RAG-based assistant and dataset for generating conference checklist responses | large language model | |
| 14 | Identifying and Mitigating Social Bias Knowledge in Language Models | Proposes the Fairness Stamp (FAST) method to identify and mitigate social bias knowledge in language models | large language model | |
| 15 | Decoding Biases: Automated Methods and LLM Judges for Gender Bias Detection in Language Models | Proposes automated methods based on adversarial prompting and LLM judges to detect gender bias in language models | large language model | |
| 16 | Prompt and Prejudice | Reveals demographic biases in LLMs/VLMs by adding personal names to ethical decision-making tasks | large language model | |
| 17 | NatLan: Native Language Prompting Facilitates Knowledge Elicitation Through Language Trigger Provision and Domain Trigger Retention | Proposes NatLan, which uses native-language prompting to enhance knowledge elicitation from multilingual LLMs in non-dominant languages | large language model | ✅ |
| 18 | Forecasting Live Chat Intent from Browsing History | Proposes a two-stage method for predicting live-chat user intent from browsing history | large language model | |
| 19 | NACL: A General and Effective KV Cache Eviction Framework for LLMs at Inference Time | NACL: a general and effective KV cache eviction framework for LLMs at inference time | large language model | ✅ |
| 20 | PAGED: A Benchmark for Procedural Graphs Extraction from Documents | Proposes the PAGED benchmark to evaluate and improve automatic extraction of procedural graphs from documents | large language model | |
| 21 | 1.5-Pints Technical Report: Pretraining in Days, Not Months -- Your Language Model Thrives on Quality Data | 1.5-Pints: pretrains a language model in days on high-quality data, outperforming existing models | instruction following | |