| 1 |
Toward General Instruction-Following Alignment for Retrieval-Augmented Generation |
Proposes VIF-RAG to improve instruction-following alignment in retrieval-augmented generation systems |
large language model instruction following |
|
|
| 2 |
The Future of Learning in the Age of Generative AI: Automated Question Generation and Assessment with Large Language Models |
Explores applications of generative AI in education: automated question generation and assessment with large language models |
large language model chain-of-thought |
|
|
| 3 |
FedEx-LoRA: Exact Aggregation for Federated and Efficient Fine-Tuning of Foundation Models |
FedEx-LoRA: efficient fine-tuning of foundation models in federated learning via exact aggregation |
foundation model |
✅ |
|
| 4 |
Synthetic Knowledge Ingestion: Towards Knowledge Refinement and Injection for Enhancing Large Language Models |
Proposes the Ski method, which improves large language models' grasp of knowledge through synthetic knowledge injection |
large language model |
|
|
| 5 |
LINKED: Eliciting, Filtering and Integrating Knowledge in Large Language Model for Commonsense Reasoning |
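Proposes the LINKED method, which improves the commonsense reasoning of large language models through knowledge filtering and consistency-based reasoning |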
large language model |
|
|
| 6 |
Enhanced Electronic Health Records Text Summarization Using Large Language Models |
Uses the large language model Flan-T5 to enhance focused summarization of electronic health record text, improving clinical efficiency. |
large language model |
|
|
| 7 |
Beyond Exact Match: Semantically Reassessing Event Extraction by Large Language Models |
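Proposes the RAEE framework, which uses large language models for semantic-level reassessment of event extraction, addressing the limitations of traditional exact-match evaluation. |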
large language model |
|
|
| 8 |
LLM$\times$MapReduce: Simplified Long-Sequence Processing using Large Language Models |
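Proposes the LLM×MapReduce framework, which simplifies long-text processing through a divide-and-conquer strategy and improves long-text understanding. |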
large language model |
|
|
| 9 |
Are You Human? An Adversarial Benchmark to Expose LLMs |
Proposes an adversarial benchmark for detecting in real time whether a large language model (LLM) is posing as a human. |
large language model instruction following |
|
|
| 10 |
FlatQuant: Flatness Matters for LLM Quantization |
FlatQuant: significantly improves LLM quantization performance by optimizing the weight distribution. |
large language model |
✅ |
|
| 11 |
CAMPHOR: Collaborative Agents for Multi-input Planning and High-Order Reasoning On Device |
CAMPHOR: an on-device collaborative agent framework for multi-input planning and high-order reasoning |
large language model |
|
|
| 12 |
Transformer-based Language Models for Reasoning in the Description Logic ALCQ |
Proposes Transformer-based language models to improve reasoning in the description logic ALCQ |
large language model |
|
|
| 13 |
Extended Japanese Commonsense Morality Dataset with Masked Token and Label Enhancement |
Proposes the MTLE method to extend a Japanese commonsense morality dataset, improving AI moral reasoning |
large language model |
|
|
| 14 |
MIRAGE: Evaluating and Explaining Inductive Reasoning Process in Language Models |
Proposes the MIRAGE dataset for evaluating and explaining the inductive reasoning process in language models. |
large language model |
|
|
| 15 |
FB-Bench: A Fine-Grained Multi-Task Benchmark for Evaluating LLMs' Responsiveness to Human Feedback |
Proposes FB-Bench for evaluating LLMs' responsiveness to human feedback in Chinese multi-turn dialogue |
large language model |
✅ |
|
| 16 |
Rethinking Data Selection at Scale: Random Selection is Almost All You Need |
SFT data selection at scale: random selection performs close to optimal, and data diversity is crucial |
large language model |
|
|
| 17 |
COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement |
Proposes COrAL: an efficient, order-agnostic language model for iterative refinement of large language models. |
large language model |
✅ |
|
| 18 |
CollabEdit: Towards Non-destructive Collaborative Knowledge Editing |
Proposes the COLLABEDIT framework to address non-destructive collaborative knowledge editing for large language models |
large language model |
✅ |
|
| 19 |
AERA Chat: An Interactive Platform for Automated Explainable Student Answer Assessment |
AERA Chat: an interactive platform for automated explainable student answer assessment |
large language model |
|
|
| 20 |
Towards Efficient Visual-Language Alignment of the Q-Former for Visual Reasoning Tasks |
Proposes a Q-Former visual reasoning method based on parameter-efficient fine-tuning, significantly reducing training cost. |
large language model |
✅ |
|
| 21 |
ELICIT: LLM Augmentation via External In-Context Capability |
ELICIT: augments LLMs via external in-context capability, with no additional training or tokens. |
large language model |
✅ |
|
| 22 |
Impeding LLM-assisted Cheating in Introductory Programming Assignments via Adversarial Perturbation |
Reduces LLM-assisted cheating in introductory programming assignments via adversarial perturbation |
large language model |
|
|