| # | Title | Summary | Tags | ✅ |
|---|---|---|---|---|
| 1 | ReGuLaR: Variational Latent Reasoning Guided by Rendered Chain-of-Thought | Proposes ReGuLaR, which uses rendered chain-of-thought to guide variational latent reasoning, improving computational efficiency and reasoning performance. | large language model, chain-of-thought | ✅ |
| 2 | Towards Resiliency in Large Language Model Serving with KevlarFlow | KevlarFlow: improves the resilience of large language model serving systems under hardware failures. | large language model | |
| 3 | Character as a Latent Variable in Large Language Models: A Mechanistic Account of Emergent Misalignment and Conditional Safety Failures | Reveals latent risks induced by role-play in large language models, emphasizing behavioral tendencies rather than isolated errors. | large language model | |
| 4 | FNF: Functional Network Fingerprint for Large Language Models | Proposes the Functional Network Fingerprint (FNF) for detecting intellectual-property infringement of large language models. | large language model | ✅ |
| 5 | DIFFA-2: A Practical Diffusion Large Language Model for General Audio Understanding | DIFFA-2: a practical diffusion large language model for general audio understanding. | large language model | ✅ |
| 6 | Large Language Model Agents Are Not Always Faithful Self-Evolvers | Reveals the unfaithful reliance on experience in the self-evolution of LLM agents. | large language model | |
| 7 | Residual Context Diffusion Language Models | Proposes a Residual Context Diffusion (RCD) module to improve the inference accuracy and efficiency of diffusion language models (dLLMs). | large language model, instruction following | |
| 8 | Towards the Holographic Characteristic of LLMs for Efficient Short-text Generation | Reveals the holographic characteristic of LLMs for short-text generation and proposes HOLO, an efficient plug-in. | large language model, chain-of-thought | |
| 9 | InstructDiff: Domain-Adaptive Data Selection via Differential Entropy for Efficient LLM Fine-Tuning | InstructDiff: domain-adaptive data selection via differential entropy for efficient fine-tuning of large language models. | large language model, instruction following | |
| 10 | MM-THEBench: Do Reasoning MLLMs Think Reasonably? | Proposes MM-THEBench to evaluate hallucination in the intermediate CoT of reasoning multimodal large language models. | large language model, multimodal | |
| 11 | LLMs Explain't: A Post-Mortem on Semantic Interpretability in Transformer Models | Examines semantic interpretability in LLMs and its limitations. | large language model | |
| 12 | DART-ing Through the Drift: Dynamic Tracing of Knowledge Neurons for Adaptive Inference-Time Pruning | DART: adaptive inference-time pruning via dynamic tracing of knowledge neurons. | large language model | ✅ |
| 13 | Rethinking LLM-as-a-Judge: Representation-as-a-Judge with Small Language Models via Semantic Capacity Asymmetry | Proposes the representation-based INSPECTOR framework, using small language models for efficient, reliable, and interpretable LLM judging. | large language model | |
| 14 | SpanNorm: Reconciling Training Stability and Performance in Deep Transformers | SpanNorm: a new normalization method that balances training stability and performance in deep Transformers. | large language model | |
| 15 | Are LLM Evaluators Really Narcissists? Sanity Checking Self-Preference Evaluations | Proposes evaluator-quality baselines that remove noise from LLM self-preference evaluations, improving evaluation reliability. | large language model | |
| 16 | Bias Beyond Borders: Political Ideology Evaluation and Steering in Multilingual LLMs | Proposes the Cross-Lingual Alignment Steering (CLAS) framework to mitigate political bias in multilingual LLMs. | large language model | |
| 17 | UPA: Unsupervised Prompt Agent via Tree-Based Search and Selection | Proposes UPA, an unsupervised prompt agent based on tree search and selection for automatic prompt optimization. | large language model | |
| 18 | Deep Search with Hierarchical Meta-Cognitive Monitoring Inspired by Cognitive Neuroscience | Proposes the DS-MCM framework, using hierarchical meta-cognitive monitoring to improve the performance and robustness of deep-search agents. | large language model | |
| 19 | MiTa: A Hierarchical Multi-Agent Collaboration Framework with Memory-integrated and Task Allocation | MiTa: a hierarchical multi-agent collaboration framework that integrates memory and task allocation, improving efficiency on complex tasks. | large language model | |
| 20 | A Unified View of Attention and Residual Sinks: Outlier-Driven Rescaling is Essential for Transformer Training | Reveals the outlier-driven rescaling mechanism in Transformer training, improving model performance and quantization robustness. | large language model | |
| 21 | Leveraging LLMs For Turkish Skill Extraction | Uses large language models for Turkish skill extraction, filling the gap in skill extraction for low-resource languages. | large language model | |
| 22 | When Meanings Meet: Investigating the Emergence and Quality of Shared Concept Spaces during Multilingual Language Model Training | Studies the emergence and quality of shared concept spaces during multilingual model training, revealing the training dynamics of cross-lingual alignment. | large language model | |
| 23 | Sparse or Dense? A Mechanistic Estimation of Computation Density in Transformer-based LLMs | Proposes a mechanistic-interpretability-based method for quantifying computation density in Transformer LLMs. | large language model | |
| 24 | AR-BENCH: Benchmarking Legal Reasoning with Judgment Error Detection, Classification and Correction | AR-BENCH: a benchmark for the detection, classification, and correction of legal judgment errors. | large language model | |
| 25 | Models Know Models Best: Evaluation via Model-Preferred Formats | Proposes a dynamic evaluation method based on model-preferred formats, improving the zero-shot performance of large language models. | large language model | |
| 26 | Layer-wise Swapping for Generalizable Multilingual Safety | Proposes a layer-wise swapping method to improve LLM safety in low-resource languages. | large language model | |
| 27 | $ρ$-$\texttt{EOS}$: Training-free Bidirectional Variable-Length Control for Masked Diffusion LLMs | Proposes $ρ$-$\texttt{EOS}$, a training-free method for bidirectional variable-length control in masked diffusion LLMs. | large language model | |