| 1 |
Do What I Say: A Spoken Prompt Dataset for Instruction-Following |
Proposes the DOWIS dataset for evaluating speech large language models under spoken prompts. |
large language model instruction following |
|
|
| 2 |
Investigating Gender Stereotypes in Large Language Models via Social Determinants of Health |
Investigates gender stereotypes in large language models through social determinants of health. |
large language model |
|
|
| 3 |
Model Merging in the Era of Large Language Models: Methods, Applications, and Future Directions |
Proposes the FUSE framework and surveys model merging methods, applications, and future directions for large language models. |
large language model |
|
|
| 4 |
Benchmarking Political Persuasion Risks Across Frontier Large Language Models |
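Evaluates the political persuasion risks of frontier large language models and finds significant differences in persuasiveness across models. |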
large language model |
|
|
| 5 |
Tracking Cancer Through Text: Longitudinal Extraction From Radiology Reports Using Open-Source Large Language Models |
Proposes using open-source models to extract tumor information from radiology reports. |
large language model |
|
|
| 6 |
CyberThreat-Eval: Can Large Language Models Automate Real-World Threat Research? |
Proposes CyberThreat-Eval to address the insufficient automation of existing CTI reporting. |
large language model |
✅ |
|
| 7 |
DEO: Training-Free Direct Embedding Optimization for Negation-Aware Retrieval |
Proposes DEO, a training-free direct embedding optimization method for retrieval tasks that involve negation. |
large language model multimodal |
|
|
| 8 |
You Didn't Have to Say It like That: Subliminal Learning from Faithful Paraphrases |
Shows that language models can implicitly learn a teacher model's preferences from semantically invariant paraphrases. |
chain-of-thought |
|
|
| 9 |
RbtAct: Rebuttal as Supervision for Actionable Review Feedback Generation |
RbtAct: uses peer-review rebuttals to generate more actionable review feedback. |
large language model |
|
|
| 10 |
ESAinsTOD: A Unified End-to-End Schema-Aware Instruction-Tuning Framework for Task-Oriented Dialog Modeling |
Proposes ESAinsTOD, a unified, end-to-end, schema-aware instruction-tuning framework for task-oriented dialog modeling. |
large language model |
|
|
| 11 |
Common Sense vs. Morality: The Curious Case of Narrative Focus Bias in LLMs |
Reveals narrative focus bias in LLMs: moral reasoning takes precedence over commonsense understanding. |
large language model |
|
|
| 12 |
One-Eval: An Agentic System for Automated and Traceable LLM Evaluation |
One-Eval: an automated, traceable agentic system for LLM evaluation. |
large language model |
✅ |
|
| 13 |
Beyond Fine-Tuning: Robust Food Entity Linking under Ontology Drift with FoodOntoRAG |
FoodOntoRAG: a robust, fine-tuning-free food entity linking method that handles ontology drift. |
large language model |
|
|
| 14 |
Evaluation of LLMs in retrieving food and nutritional context for RAG systems |
Uses an LLM-driven RAG system to retrieve food and nutrition data efficiently, lowering the barrier to use for domain experts. |
large language model |
|
|
| 15 |
ALARM: Audio-Language Alignment for Reasoning Models |
ALARM: enhances the audio understanding of reasoning models through audio-language alignment. |
chain-of-thought |
|
|
| 16 |
Quantifying and extending the coverage of spatial categorization data sets |
Uses large language models to extend spatial categorization datasets and improve scene coverage. |
large language model |
|
|
| 17 |
TA-Mem: Tool-Augmented Autonomous Memory Retrieval for LLM in Long-Term Conversational QA |
Proposes the TA-Mem framework to strengthen autonomous memory retrieval by LLMs in long-term conversational QA. |
large language model |
|
|
| 18 |
Bioalignment: Measuring and Improving LLM Disposition Toward Biological Systems for AI Safety |
Bioalignment: improves LLM disposition toward biological systems through fine-tuning to strengthen AI safety. |
large language model |
|
|