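| 1 | Reasoning Capabilities and Invariability of Large Language Models | Proposes a geometric-figure reasoning benchmark to evaluate the logical reasoning ability and prompt dependence of large language models. | large language model chain-of-thought | | |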
| 2 | Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in Large Language Models | Proposes the SAGE framework for evaluating the higher-order social cognition of large language models. | large language model | | |
| 3 | Steering Large Language Models with Register Analysis for Arbitrary Style Transfer | Uses register analysis to steer large language models toward arbitrary style transfer. | large language model | | |
| 4 | Large Language Models Understanding: an Inherent Ambiguity Barrier | Argues that the understanding of large language models faces an inherent ambiguity barrier. | large language model | | |
| 5 | Block Circulant Adapter for Large Language Models | Proposes an LLM fine-tuning method based on block circulant matrix adapters that reduces storage and computation costs. | large language model | | |
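| 6 | A Comparative Study of Large Language Models and Human Personality Traits | Shows that the personality traits of large language models are dynamic and input-dependent, and proposes a distributed personality framework. | large language model | | |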
| 7 | Triggering Hallucinations in LLMs: A Quantitative Study of Prompt-Induced Hallucination in Large Language Models | Proposes a prompt-based framework for triggering and quantifying hallucinations in large language models. | large language model | | |
| 8 | Red Teaming Large Language Models for Healthcare | Uses red teaming to uncover potential harms of large language models in healthcare. | large language model | | |
| 9 | Rethinking Memory in LLM based Agents: Representations, Operations, and Emerging Topics | Proposes a systematic taxonomy of memory mechanisms in LLM agents, covering representations, operations, and emerging topics. | large language model | ✅ | |
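| 10 | On the generalization of language models from in-context learning and finetuning: a controlled study | Shows that in-context learning generalizes more flexibly than fine-tuning, and proposes adding reasoning traces to improve fine-tuning generalization. | large language model | | |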
| 11 | The Illusion of Role Separation: Hidden Shortcuts in LLM Role Learning (and How to Fix Them) | Reveals hidden shortcuts in LLM role learning and proposes a fix based on reinforcing invariant signals. | large language model | | |
| 12 | FreqKV: Frequency Domain Key-Value Compression for Efficient Context Window Extension | FreqKV: proposes a frequency-domain key-value compression method for efficiently extending the LLM context window. | large language model | | |
| 13 | MoxE: Mixture of xLSTM Experts with Entropy-Aware Routing for Efficient Language Modeling | MoxE: combines a mixture of xLSTM experts with entropy-aware routing to improve language modeling efficiency. | large language model | | |
| 14 | LLM-Based Human-Agent Collaboration and Interaction Systems: A Survey | The first survey of LLM-based human-agent collaboration and interaction systems, aimed at improving agent reliability and safety. | large language model | ✅ | |
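| 15 | KoACD: The First Korean Adolescent Dataset for Cognitive Distortion Analysis via Role-Switching Multi-LLM Negotiation | Proposes KoACD, the first large-scale dataset for cognitive distortion analysis in Korean adolescents, and uses multi-LLM negotiation to improve annotation quality. | large language model | | |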