| 1 | Recurrent Confidence Chain: Temporal-Aware Uncertainty Quantification in Large Language Models | Proposes a Recurrent Confidence Chain to address temporal-aware uncertainty quantification in large language models | large language model chain-of-thought | |
| 2 | PhysicsSolutionAgent: Towards Multimodal Explanations for Numerical Physics Problem Solving | Proposes PhysicsSolutionAgent, which generates explanatory videos for physics problems with Manim animations | large language model multimodal | |
| 3 | Trust Me, I'm an Expert: Decoding and Steering Authority Bias in Large Language Models | Reveals and steers authority bias in large language models to improve reliability on reasoning tasks | large language model | |
| 4 | Confidence over Time: Confidence Calibration with Temporal Logic for Large Language Model Reasoning | Calibrates LLM reasoning confidence with temporal logic to improve performance on complex tasks | large language model | |
| 5 | Stop Taking Tokenizers for Granted: They Are Core Design Decisions in Large Language Models | Revisits tokenizers as core design decisions in large language models | large language model | |
| 6 | A Component-Based Survey of Interactions between Large Language Models and Multi-Armed Bandits | First component-level survey of the bidirectional interaction between large language models and multi-armed bandits | large language model | ✅ |
| 7 | Race, Ethnicity and Their Implication on Bias in Large Language Models | Uses interpretability analysis to reveal the internal mechanisms of racial and ethnic bias in large language models | large language model | |
| 8 | Structured Insight from Unstructured Data: Large Language Models for SDOH-Driven Diabetes Risk Prediction | Uses large language models to extract information from unstructured SDOH data for diabetes risk prediction | large language model | |
| 9 | Adversarial Alignment: Ensuring Value Consistency in Large Language Models for Sensitive Domains | Proposes an adversarial alignment framework to improve the value consistency of large language models in sensitive domains | large language model | |
| 10 | Multimodal Multi-Agent Empowered Legal Judgment Prediction | Proposes the JurisMMA framework, which uses multi-agent collaboration to handle complex multimodal reasoning in legal judgment prediction | multimodal | |
| 11 | Who Does This Name Remind You of? Nationality Prediction via Large Language Model Associative Memory | Proposes LAMA, which uses the associative memory of large language models for nationality prediction | large language model | |
| 12 | Beyond Memorization: Testing LLM Reasoning on Unseen Theory of Computation Tasks | Proposes a DFA construction benchmark that reveals the limited generalization of LLMs in formal-language reasoning | large language model chain-of-thought | |
| 13 | Intelligent Documentation in Medical Education: Can AI Replace Manual Case Logging? | Uses large language models to automatically generate radiology case logs, reducing physicians' workload and improving consistency | large language model chain-of-thought | |
| 14 | Unlearning in LLMs: Methods, Evaluation, and Open Challenges | Surveys unlearning methods in LLMs: taxonomy, evaluation, and challenges | large language model multimodal | |
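| 15 | Tears or Cheers? Benchmarking LLMs via Culturally Elicited Distinct Affective Responses | Proposes the CEDAR benchmark to evaluate how well LLMs understand culturally elicited affective responses, revealing a gap between language consistency and cultural alignment | large language model multimodal | |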
| 16 | ChartAttack: Testing the Vulnerability of LLMs to Malicious Prompting in Chart Generation | ChartAttack: proposes an evaluation framework for malicious prompting in LLM chart generation, revealing the vulnerability of LLMs | large language model multimodal | |
| 17 | Probe and Skip: Self-Predictive Token Skipping for Efficient Long-Context LLM Inference | Proposes the SPTS framework, which accelerates long-context LLM inference through self-predictive token skipping | large language model | |
| 18 | VISPA: Pluralistic Alignment via Automatic Value Selection and Activation | VISPA: pluralistic alignment of large language models via automatic value selection and activation | large language model | |
| 19 | LLM-as-RNN: A Recurrent Language Model for Memory Updates and Sequence Prediction | LLM-as-RNN: a recurrent language model that uses language-based memory updates to improve sequence prediction | large language model | |
| 20 | From Prefix Cache to Fusion RAG Cache: Accelerating LLM Inference in Retrieval-Augmented Generation | Proposes FusionRAG, which accelerates LLM inference by fusing context information in retrieval-augmented generation | large language model | |
| 21 | A Shared Geometry of Difficulty in Multilingual Language Models | Reveals the geometry of difficulty in multilingual models: shallow layers generalize, deep layers specialize | large language model | |
| 22 | From Completion to Editing: Unlocking Context-Aware Code Infilling via Search-and-Replace Instruction Tuning | Proposes the Search-and-Replace Infilling framework to address the correction of context errors in code completion | instruction following | |
| 23 | Leveraging Lora Fine-Tuning and Knowledge Bases for Construction Identification | Uses LoRA fine-tuning and knowledge bases to identify the English double-object construction | large language model | |
| 24 | The Bitter Lesson of Diffusion Language Models for Agentic Workflows: A Comprehensive Reality Check | Diffusion language models perform poorly on agentic tasks and need causal reasoning mechanisms | large language model | |
| 25 | Injecting Knowledge from Social Science Journals to Improve Indonesian Cultural Understanding by LLMs | Proposes the IndoSoSci dataset and combines it with RAG to improve LLM understanding of Indonesian culture | large language model | |
| 26 | Rapport du Projet de Recherche TRAIMA | TRAIMA project: explores methods for automatically processing multimodal interaction in educational settings | multimodal | |
| 27 | Towards Robust Process Reward Modeling via Noise-aware Learning | Proposes a noise-aware learning framework to improve the robustness of process reward models in complex reasoning | large language model | |