| 1 |
HoT: Highlighted Chain of Thought for Referencing Supporting Facts from Inputs |
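Proposes Highlighted Chain-of-Thought (HoT) prompting, which improves LLMs' ability to trace answers back to supporting facts and aids human verification. |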
large language model chain-of-thought |
|
|
| 2 |
Linear Representations of Political Perspective Emerge in Large Language Models |
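Linear representations of political perspective emerge in large language models, and intervening on attention heads can steer model output. |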
large language model |
|
|
| 3 |
Analyzing the Safety of Japanese Large Language Models in Stereotype-Triggering Prompts |
Analyzes the safety of Japanese large language models under stereotype-triggering prompts. |
large language model |
|
|
| 4 |
EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test |
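EAGLE-3: scales up inference acceleration for large language models via training-time test, improving how well the method exploits larger data scale. |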
large language model |
✅ |
|
| 5 |
Rotary Offset Features in Large Language Models |
Reveals offset features in the rotary embeddings of LLMs and provides a method for predicting them. |
large language model |
|
|
| 6 |
Persuade Me if You Can: A Framework for Evaluating Persuasion Effectiveness and Susceptibility Among Large Language Models |
PMIYC: an automated framework for evaluating the persuasiveness of large language models and their susceptibility to persuasion. |
large language model |
|
|
| 7 |
Retrieval Models Aren't Tool-Savvy: Benchmarking Tool Retrieval for Large Language Models |
Proposes the ToolRet benchmark for evaluating tool retrieval models and builds a large-scale training dataset to improve LLM tool use. |
large language model |
|
|
| 8 |
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs |
Microsoft releases the Phi-4-Mini model family, which achieves compact yet powerful multimodal language capabilities through Mixture-of-LoRAs. |
multimodal |
|
|
| 9 |
Building Safe GenAI Applications: An End-to-End Overview of Red Teaming for Large Language Models |
A survey of red teaming for large language models, aimed at building safe GenAI applications. |
large language model |
|
|
| 10 |
Automated Annotation of Evolving Corpora for Augmenting Longitudinal Network Data: A Framework Integrating Large Language Models and Expert Knowledge |
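Proposes the EALA framework, which combines LLMs with expert knowledge to automatically annotate evolving corpora and augment longitudinal network data. |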
large language model |
|
|
| 11 |
Detecting Stylistic Fingerprints of Large Language Models |
Proposes an ensemble-learning-based method for detecting LLM stylistic fingerprints, used to identify the source of AI-generated text. |
large language model |
|
|
| 12 |
What do Large Language Models Say About Animals? Investigating Risks of Animal Harm in Generated Text |
Proposes the AnimalHarmBench benchmark for evaluating the risk of animal harm in text generated by large language models. |
large language model |
|
|
| 13 |
CrowdSelect: Synthetic Instruction Data Selection with Multi-LLM Wisdom |
CrowdSelect: selects synthetic instruction data using multi-LLM wisdom, improving the instruction-following ability of small models. |
large language model instruction following |
✅ |
|
| 14 |
Interview AI-ssistant: Designing for Real-Time Human-AI Collaboration in Interview Preparation and Execution |
Proposes Interview AI-ssistant for real-time human-AI collaboration in interview preparation and execution. |
large language model |
|
|
| 15 |
Persuasion at Play: Understanding Misinformation Dynamics in Demographic-Aware Human-LLM Interactions |
Studies misinformation dynamics in demographic-aware human-LLM interactions. |
large language model |
|
|
| 16 |
Comparative Analysis of OpenAI GPT-4o and DeepSeek R1 for Scientific Text Categorization Using Prompt Engineering |
Uses prompt engineering to compare the performance of OpenAI GPT-4o and DeepSeek R1 on scientific text categorization. |
large language model |
|
|
| 17 |
Mind the (Belief) Gap: Group Identity in the World of LLMs |
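Studies group identity bias in LLMs and proposes intervention strategies to reduce the spread of misinformation and improve learning outcomes. |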
large language model |
|
|
| 18 |
Can (A)I Change Your Mind? |
Shows that large language models can effectively change human opinions in a Hebrew-language setting. |
large language model |
|
|
| 19 |
From Language to Cognition: How LLMs Outgrow the Human Language Network |
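Shows that the development of LLMs' language ability aligns with the human brain's language network, but that the alignment weakens once the models surpass human performance. |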
large language model |
|
|
| 20 |
$\texttt{SEM-CTRL}$: Semantically Controlled Decoding |
Proposes SEM-CTRL, which guarantees the syntactic and semantic correctness of LLM output through semantically controlled decoding. |
large language model |
|
|
| 21 |
Word Form Matters: LLMs' Semantic Reconstruction under Typoglycemia |
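Finds that LLMs rely excessively on word form when understanding scrambled words, and proposes SemRecScore to evaluate semantic reconstruction ability. |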
large language model |
|
|
| 22 |
Evaluating LLMs' Assessment of Mixed-Context Hallucination Through the Lens of Summarization |
Evaluates LLMs' ability to detect mixed-context hallucination through summarization tasks. |
large language model |
|
|