| 1 |
"Lost-in-the-Later": Framework for Quantifying Contextual Grounding in Large Language Models |
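Proposes the CoPE framework, revealing a "lost-in-the-later" phenomenon in which LLMs lose information that appears later in the context |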
large language model chain-of-thought |
|
|
| 2 |
MindFlow: Revolutionizing E-commerce Customer Support with Multimodal LLM Agents |
MindFlow: revolutionizing e-commerce customer support with multimodal LLM agents |
large language model multimodal |
|
|
| 3 |
Mechanistic Indicators of Understanding in Large Language Models |
Uses mechanistic interpretability to examine the emergence and levels of understanding in large language models |
large language model |
|
|
| 4 |
On the Semantics of Large Language Models |
Investigates the semantic understanding abilities of large language models at the word and sentence levels |
large language model |
|
|
| 5 |
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities |
Gemini 2.5: pushing the frontier with advanced reasoning, multimodality, long context, and next-generation agentic capabilities |
multimodal |
|
|
| 6 |
An Evaluation of Large Language Models on Text Summarization Tasks Using Prompt Engineering Techniques |
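Evaluates large language models on text summarization through prompt engineering and proposes a sentence-chunking strategy to improve long-document summarization. |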
large language model |
|
|
| 7 |
PRIME: Large Language Model Personalization with Cognitive Dual-Memory and Personalized Thought Process |
PRIME: personalizing large language models with cognitive dual memory and a personalized thought process |
large language model |
|
|
| 8 |
Grahak-Nyay: Consumer Grievance Redressal through Large Language Models |
Grahak-Nyay: using large language models to resolve consumer grievances in India. |
large language model |
|
|
| 9 |
Agentic Vehicles for Human-Centered Mobility |
Proposes the concept of Agentic Vehicles (AgVs) to bridge the gap between autonomous driving technology and human-centered mobility needs. |
large language model multimodal |
|
|
| 10 |
ArtifactsBench: Bridging the Visual-Interactive Gap in LLM Code Generation Evaluation |
ArtifactsBench: bridging the visual-interactive gap in LLM code generation evaluation |
large language model multimodal |
|
|
| 11 |
Why We Feel What We Feel: Joint Detection of Emotions and Their Opinion Triggers in E-commerce |
Proposes the EOT-DETECT framework for jointly detecting emotions and their triggers in e-commerce reviews. |
large language model chain-of-thought |
|
|
| 12 |
On the Bias of Next-Token Predictors Toward Systematically Inefficient Reasoning: A Shortest-Path Case Study |
Shows that, on shortest-path reasoning tasks, LLMs trained on inefficient reasoning traces generalize better. |
large language model |
|
|
| 13 |
Reason to Rote: Rethinking Memorization in Reasoning |
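Examines the relationship between memorizing noisy labels and reasoning ability in LLMs, revealing the mechanism behind benign memorization |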
large language model |
|
|
| 14 |
Controlling What You Share: Assessing Language Model Adherence to Privacy Preferences |
Proposes a privacy-profile-based LLM query rewriting framework to improve protection of user data privacy. |
large language model |
|
|
| 15 |
Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions |
Proposes MemoryAgentBench for evaluating the memory of LLM agents across multi-turn interactions |
large language model |
|
|
| 16 |
Interpretable Mnemonic Generation for Kanji Learning via Expectation-Maximization |
Proposes an interpretable, expectation-maximization-based mnemonic generation method to aid kanji learning |
large language model |
|
|
| 17 |
Emergent Semantics Beyond Token Embeddings: Transformer LMs with Frozen Visual Unicode Representations |
Proposes Transformer LMs built on frozen visual Unicode representations, moving beyond the semantic limits of conventional token embeddings. |
large language model |
|
|
| 18 |
Knowledge-Aware Self-Correction in Language Models via Structured Memory Graphs |
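Proposes a knowledge-aware self-correction framework based on structured memory graphs to improve the factual accuracy of language models |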
large language model |
|
|
| 19 |
PhoniTale: Phonologically Grounded Mnemonic Generation for Typologically Distant Language Pairs |
PhoniTale: phonologically grounded mnemonic generation, addressing vocabulary acquisition across language families |
large language model |
|
|
| 20 |
Co-DETECT: Collaborative Discovery of Edge Cases in Text Classification |
Co-DETECT: combining human expert knowledge with large language models to collaboratively discover edge cases in text classification |
large language model |
|
|
| 21 |
Dialogue-Based Multi-Dimensional Relationship Extraction from Novels |
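Proposes an LLM-based, dialogue-driven method for multi-dimensional relationship extraction, targeting character relationship extraction in novels. |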
large language model |
|
|
| 22 |
Spec-TOD: A Specialized Instruction-Tuned LLM Framework for Efficient Task-Oriented Dialogue Systems |
Spec-TOD: an efficient instruction-tuned LLM framework for task-oriented dialogue systems |
large language model |
|
|
| 23 |
LLMs as Architects and Critics for Multi-Source Opinion Summarization |
Proposes the M-OS-EVAL benchmark dataset and explores LLMs for multi-source opinion summarization, significantly improving user engagement. |
large language model |
|
|
| 24 |
"This Suits You the Best": Query Focused Comparative Explainable Summarization |
Proposes a query-focused comparative explainable summarization method and constructs the MS-Q2P dataset. |
large language model |
|
|
| 25 |
LOOM-Scope: a comprehensive and efficient LOng-cOntext Model evaluation framework |
LOOM-Scope: a comprehensive and efficient long-context model evaluation framework |
large language model |
|
|