| 1 |
Thinking with Comics: Enhancing Multimodal Reasoning through Structured Visual Storytelling |
Proposes a comic-based visual reasoning paradigm that improves multimodal temporal and causal reasoning |
large language model multimodal chain-of-thought |
|
|
| 2 |
Entropy-Guided Data-Efficient Training for Multimodal Reasoning Reward Models |
Proposes Entropy-Guided Training (EGT) to improve the training efficiency and performance of multimodal reasoning reward models |
large language model multimodal |
|
|
| 3 |
Avenir-Web: Human-Experience-Imitating Multimodal Web Agents with Mixture of Grounding Experts |
Avenir-Web: a multimodal web agent based on a mixture of experts and experience imitation, improving task execution in complex web environments |
large language model multimodal |
|
|
| 4 |
Hunt Instead of Wait: Evaluating Deep Data Research on Large Language Models |
Proposes DDR-Bench to evaluate the autonomous exploration ability of LLMs in open-ended data analysis |
large language model |
|
|
| 5 |
Evolving from Tool User to Creator via Training-Free Experience Reuse in Multimodal Reasoning |
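Proposes the UCT framework, which uses training-free experience reuse to evolve multimodal reasoning agents from tool users into tool creators |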
multimodal |
|
|
| 6 |
Large Language Model and Formal Concept Analysis: a comparative study for Topic Modeling |
A comparative study of large language models and Formal Concept Analysis for topic modeling |
large language model |
|
|
| 7 |
Optimizing Prompts for Large Language Models: A Causal Approach |
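Proposes the Causal Prompt Optimization (CPO) framework to address generalization and cost problems in prompt engineering for large language models |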
large language model |
|
|
| 8 |
MentisOculi: Revealing the Limits of Reasoning with Mental Imagery |
MentisOculi: reveals the limits of reasoning with mental imagery and evaluates how well multimodal models use visual information |
large language model multimodal |
|
|
| 9 |
Live-Evo: Online Evolution of Agentic Memory from Continuous Feedback |
Proposes Live-Evo to address the problem of online memory evolution |
large language model |
✅ |
|
| 10 |
Interpreting and Controlling LLM Reasoning through Integrated Policy Gradient |
Proposes the IPG method, which uses Integrated Policy Gradient to interpret and control the LLM reasoning process |
large language model |
|
|
| 11 |
Light Alignment Improves LLM Safety via Model Self-Reflection with a Single Neuron |
Proposes a lightweight alignment method based on a single-neuron gating mechanism to improve LLM safety |
large language model |
✅ |
|
| 12 |
Geometric Analysis of Token Selection in Multi-Head Attention |
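Proposes a geometric analysis framework for multi-head attention that reveals the token selection mechanism and the specialized behavior of heads |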
large language model |
|
|
| 13 |
RedVisor: Reasoning-Aware Prompt Injection Defense via Zero-Copy KV Cache Reuse |
RedVisor: reasoning-aware prompt injection defense via zero-copy KV cache reuse |
large language model |
|
|
| 14 |
PRISM: Parametrically Refactoring Inference for Speculative Sampling Draft Models |
PRISM: decouples model capacity from inference cost through parametrically refactored inference, accelerating speculative sampling |
large language model |
|
|
| 15 |
Breaking the Reversal Curse in Autoregressive Language Models via Identity Bridge |
Proposes the Identity Bridge method to address the reversal curse in autoregressive language models |
large language model |
|
|
| 16 |
Structure Enables Effective Self-Localization of Errors in LLMs |
Proposes the Thought-ICS framework, which uses structured reasoning to enable effective self-localization and correction of errors in LLMs |
chain-of-thought |
|
|
| 17 |
More Than a Quick Glance: Overcoming the Greedy Bias in KV-Cache Compression |
LASER-KV: overcomes the greedy bias in KV-cache compression through exact LSH-based recall |
large language model |
|
|
| 18 |
Reasoning in a Combinatorial and Constrained World: Benchmarking LLMs on Natural-Language Combinatorial Optimization |
Proposes the NLCO benchmark to evaluate LLM reasoning on natural-language combinatorial optimization problems |
large language model |
|
|
| 19 |
See2Refine: Vision-Language Feedback Improves LLM-Based eHMI Action Designers |
See2Refine: uses vision-language feedback to improve LLM-driven eHMI action design |
large language model |
|
|
| 20 |
Constrained Process Maps for Multi-Agent Generative AI Workflows |
Proposes constrained process maps for multi-agent generative AI workflows to address uncertainty |
large language model |
|
|
| 21 |
Do I Really Know? Learning Factual Self-Verification for Hallucination Reduction |
Proposes the VeriFY framework, which reduces factual hallucinations in large language models through learned self-verification |
large language model |
|
|
| 22 |
Human Society-Inspired Approaches to Agentic AI Security: The 4C Framework |
Proposes the 4C framework to address the security risks that emerge when agentic AI operates in open environments |
large language model |
|
|
| 23 |
GRAB: An LLM-Inspired Sequence-First Click-Through Rate Prediction Modeling Paradigm |
GRAB: an LLM-inspired, sequence-first click-through rate prediction paradigm that improves ad revenue and click-through rate |
large language model |
|
|
| 24 |
Meta Engine: A Unified Semantic Query Engine on Heterogeneous LLM-Based Query Systems |
Proposes Meta Engine, which unifies heterogeneous LLM-based semantic query systems to address the difficulty of querying multimodal data |
large language model |
|
|
| 25 |
Beyond Dense States: Elevating Sparse Transcoders to Active Operators for Latent Reasoning |
Proposes LSTR, which elevates sparse transcoders to active operators for latent-space reasoning |
chain-of-thought |
|
|
| 26 |
What LLMs Think When You Don't Tell Them What to Think About? |
Finds that, without topic guidance, large language models show significant and systematic topic preferences |
large language model |
|
|
| 27 |
The Strategic Foresight of LLMs: Evidence from a Fully Prospective Venture Tournament |
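Large language models outperform human experts in strategic forecasting, particularly in predicting the success of crowdfunding projects |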
large language model |
|
|