| 1 |
Leveraging Large Language Models for Solving Rare MIP Challenges |
Using large language models to solve rare mixed-integer programming (MIP) problems |
large language model, chain-of-thought |
|
|
| 2 |
Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation |
Booster: countering harmful fine-tuning attacks on large language models by attenuating harmful perturbation |
large language model |
✅ |
|
| 3 |
AdaComp: Extractive Context Compression with Adaptive Predictor for Retrieval-Augmented Large Language Models |
AdaComp: extractive context compression with an adaptive predictor, improving the efficiency of retrieval-augmented large language models |
large language model |
|
|
| 4 |
S^3cMath: Spontaneous Step-level Self-correction Makes Large Language Models Better Mathematical Reasoners |
Proposes S^3c-Math, giving large language models spontaneous step-level self-correction in mathematical reasoning |
large language model |
|
|
| 5 |
FuzzCoder: Byte-level Fuzzing Test via Large Language Model |
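Proposes FuzzCoder, which uses a fine-tuned large language model to guide byte-level fuzz testing and improve vulnerability discovery efficiency |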
large language model |
|
|
| 6 |
Towards Leveraging Large Language Models for Automated Medical Q&A Evaluation |
Using large language models to automatically evaluate responses from medical Q&A systems |
large language model |
|
|
| 7 |
LLM-GAN: Construct Generative Adversarial Network Through Large Language Models For Explainable Fake News Detection |
Proposes LLM-GAN to address explainable fake news detection |
large language model |
|
|
| 8 |
Interpreting and Improving Large Language Models in Arithmetic Calculation |
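Reveals the mechanism behind arithmetic calculation in large language models and improves mathematical ability through selective fine-tuning |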
large language model |
|
|
| 9 |
From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning |
Proposes Pinpoint Tuning to address sycophancy in large language models and improve honesty |
large language model |
✅ |
|
| 10 |
LongGenBench: Benchmarking Long-Form Generation in Long Context LLMs |
LongGenBench: proposes a benchmark for long-form generation, revealing the shortcomings of current LLMs on long-context tasks |
large language model, instruction following |
|
|
| 11 |
MMLU-Pro+: Evaluating Higher-Order Reasoning and Shortcut Learning in LLMs |
Proposes MMLU-Pro+ to evaluate higher-order reasoning and shortcut learning in LLMs |
large language model |
✅ |
|
| 12 |
You Only Use Reactive Attention Slice For Long Context Retrieval |
Proposes YOURA, a long-context retrieval method based on reactive attention slices, improving LLM inference efficiency |
large language model |
|
|
| 13 |
Training on the Benchmark Is Not All You Need |
Proposes a data leakage detection method based on option shuffling to assess the degree of benchmark data leakage in LLMs |
large language model |
|
|
| 14 |
Multi-Source Knowledge Pruning for Retrieval-Augmented Generation: A Benchmark and Empirical Study |
Proposes the PruningRAG framework to address multi-source knowledge fusion and noise interference in RAG, and builds a benchmark dataset |
large language model |
✅ |
|
| 15 |
Benchmarking Cognitive Domains for LLMs: Insights from Taiwanese Hakka Culture |
Builds a Hakka culture cognitive benchmark to evaluate how well large language models understand and process cultural knowledge |
large language model |
|
|
| 16 |
Political DEBATE: Efficient Zero-shot and Few-shot Classifiers for Political Text |
Political DEBATE: efficient zero-shot and few-shot classifiers for political text |
large language model |
|
|
| 17 |
Investigating Expert-in-the-Loop LLM Discourse Patterns for Ancient Intertextual Analysis |
Using expert-in-the-loop large language models for intertextual analysis of ancient texts |
large language model |
|
|
| 18 |
AgentRE: An Agent-Based Framework for Navigating Complex Information Landscapes in Relation Extraction |
Proposes the AgentRE framework, which uses LLMs to tackle relation extraction in complex scenarios |
large language model |
✅ |
|
| 19 |
An Implementation of Werewolf Agent That does not Truly Trust LLMs |
Proposes a Werewolf agent that combines an LLM with rules to improve dialogue consistency and logical coherence |
large language model |
|
|