| 1 | Chain-of-Defensive-Thought: Structured Reasoning Elicits Robustness in Large Language Models against Reference Corruption | Proposes Chain-of-Defensive-Thought to improve the robustness of large language models under reference corruption | large language model chain-of-thought | | |
| 2 | LLM Enhancer: Merged Approach using Vector Embedding for Reducing Large Language Model Hallucinations with External Knowledge | LLM Enhancer: merges vector embeddings with external knowledge to reduce large language model hallucinations | large language model | | |
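| 3 | Improving Phishing Email Detection Performance of Small Large Language Models | Proposes prompt engineering, explanation-augmented fine-tuning, and model ensembling to improve the performance of small LLMs on phishing email detection | large language model | | |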
| 4 | Information Gravity: A Field-Theoretic Model for Token Selection in Large Language Models | Proposes an information gravity model that uses field theory to explain token selection in large language models | large language model | | |
| 5 | A Framework to Assess the Persuasion Risks Large Language Model Chatbots Pose to Democratic Societies | A framework for assessing the persuasion risks that large language model chatbots pose to democratic societies | large language model | | |
| 6 | Computational Reasoning of Large Language Models | Proposes Turing Machine Bench to evaluate the computational reasoning of LLMs in rule following and state management | large language model | ✅ | |
| 7 | WenyanGPT: A Large Language Model for Classical Chinese Tasks | WenyanGPT: a large language model for Classical Chinese tasks that significantly outperforms existing models | large language model | | |
| 8 | Fane at SemEval-2025 Task 10: Zero-Shot Entity Framing with Large Language Models | Uses large language models for zero-shot entity framing classification to improve news narrative understanding | large language model | | |
| 9 | Beyond the Last Answer: Your Reasoning Trace Uncovers More than You Think | Proposes an LLM reasoning method based on aggregating subthoughts, improving accuracy on complex math problems | large language model | ✅ | |
| 10 | TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models | Proposes TF1-EN-3M, a dataset of three million synthetic moral fables for training small, open language models | instruction following | | |
| 11 | Automatic Legal Writing Evaluation of LLMs | Proposes oab-bench, a new benchmark for automatically evaluating the legal writing ability of LLMs | large language model | | |
| 12 | OSVBench: Benchmarking LLMs on Specification Generation Tasks for Operating System Verification | OSVBench: a benchmark of LLM specification generation for operating system verification | large language model | ✅ | |
| 13 | HyPerAlign: Interpretable Personalized LLM Alignment via Hypothesis Generation | HyPerAlign: interpretable, personalized LLM alignment via hypothesis generation | large language model | | |
| 14 | A Generative-AI-Driven Claim Retrieval System Capable of Detecting and Retrieving Claims from Social Media Platforms in Multiple Languages | Proposes a generative-AI-based claim retrieval system for fact-checking on multilingual social media platforms | large language model | | |
| 15 | BrAIcht, a theatrical agent that speaks like Bertolt Brecht's characters | BrAIcht: an AI dialogue agent that imitates the style of Brecht's plays | large language model | | |
| 16 | Enhancing LLM Language Adaption through Cross-lingual In-Context Pre-training | Proposes CrossIC-PT, which strengthens LLM language adaptation through cross-lingual in-context pre-training | large language model | | |