| 1 |
H2HMem: A Multimodal Memory Benchmark for Agents in Human-Human Interactions |
Proposes H2HMem to address the evaluation of multimodal memory in human-human interactions |
large language model multimodal |
|
|
| 2 |
PsychoSafe: Eliciting Psychologically-Informed Refusals in Large Language Models |
Proposes the PsychoSafe framework to improve refusal responses in large language models |
large language model |
|
|
| 3 |
When Built-in Thinking Helps and Hurts: Constraint-Level Error Shifts in Instruction Following |
Studies the effect of thinking mode on instruction following and its limitations |
instruction following |
|
|
| 4 |
Civil Court Simulation with Large Language Models |
Proposes a multi-agent framework to simulate Chinese civil court trials |
large language model |
✅ |
|
| 5 |
Detecting Differences Is Not Understanding Structure: Large Language Models Fail at Graph Isomorphism |
Reveals the limitations of large language models in understanding graph isomorphism |
large language model |
|
|
| 6 |
In-Context Learning for the Imputation of Public Opinion Data with Large Language Models |
Proposes an in-context learning method for imputing missing public opinion data |
large language model |
|
|
| 7 |
Explicit Representation Alignment for Multimodal Sentiment Analysis |
Proposes explicit representation alignment to address representation inconsistency in multimodal sentiment analysis |
multimodal |
|
|
| 8 |
Where Does the Answer Come From? Benchmarking View-Level Visual Evidence Identification in Multi-View MLLMs for Autonomous Driving |
Proposes a multi-view visual question answering benchmark to address evidence identification in autonomous driving |
large language model multimodal |
|
|
| 9 |
SafeRun: Enabling Determinism in LLM Planning for Running |
Proposes SafeRun to address determinism in large language model planning |
large language model instruction following |
✅ |
|
| 10 |
IS-CoT: Breaking the Long-form Generation Collapse via Interleaved Structural Thinking |
Proposes the IS-CoT framework to address collapse in long-form generation |
large language model chain-of-thought |
|
|
| 11 |
Is Text All You Need? Text as a Universal Information Bottleneck for Speech LLMs |
Proposes Convex Gate to address the fusion of speech and language models |
large language model multimodal |
|
|
| 12 |
Code Is More Than Text: Uncertainty Estimation for Code Generation |
Proposes an uncertainty estimation method for code generation to improve safety |
large language model |
|
|
| 13 |
Unified Energy for Invariant and Independent Decoding in Diffusion Language Models |
Proposes a unified energy to address decoding invariance and independence in diffusion language models |
large language model |
|
|
| 14 |
SEF-CLGC at SemEval-2026 Task 11: Logical Notation Impact on Language Model Performance |
Proposes the SEF-CLGC framework to improve language model reasoning performance |
large language model |
|
|
| 15 |
Cross-Modal Masking for Robust Silent Speech Synthesis Using sEMG and Lipreading |
Proposes cross-modal masking to improve the robustness of silent speech synthesis |
multimodal |
|
|
| 16 |
Gradient-Guided Reward Optimization for Inference-time Alignment |
Proposes gradient-guided reward optimization to address inference-time alignment |
large language model |
✅ |
|
| 17 |
Interpretable Crisis Behavior Analysis Using Mobility and Social Media Data |
Proposes a unified pipeline for analyzing crisis behavior that integrates mobility and social media data |
multimodal |
|
|
| 18 |
MUDIDI: A Two-Stage Framework for Multilingual Dictionary Digitization with Language Models |
Proposes the MUDIDI framework to address multilingual dictionary digitization |
large language model |
✅ |
|
| 19 |
What Should a Skill Remember? Quality-Cost Trade-offs in Cost-Aware Skill Rewriting for Language Model Agents |
Proposes a cost-aware skill rewriting method to optimize the performance of language model agents |
large language model |
✅ |
|
| 20 |
LexRubric: A Rubric-Guided Diagnostic Benchmark for Open-Ended Legal Tasks |
Proposes LexRubric to address the evaluation of open-ended legal tasks |
large language model |
✅ |
|
| 21 |
Multi-Hop Knowledge Composition is Bound by Pretraining Exposure |
Proposes a multi-hop knowledge composition method to address insufficient reasoning in language models |
large language model |
|
|
| 22 |
How Far Can Prompting Go for Minimal-Edit Ukrainian Grammatical Error Correction? |
Evaluates multiple language models on Ukrainian grammatical error correction |
large language model |
|
|
| 23 |
TruthSplit: Operationalizing Conditional Validity in Arguments Through Multi-Perspective Reasoning |
Proposes TruthSplit to address conditional validity in multi-perspective argument analysis |
large language model |
|
|
| 24 |
Symbolic and Abstractive Reasoning with Complex Visual Queries |
Proposes complex visual queries to address reasoning challenges in multimodal large language models |
large language model |
|
|
| 25 |
Emergent Misalignment Can Be Induced by Sycophancy and Reversed via Alignment Gating |
Proposes Alignment Gating to address emergent misalignment in language models |
large language model |
|
|
| 26 |
Beyond Averages: Evaluating LLMs on Human Survey Replication at the Distributional Level |
Proposes a distribution-level evaluation method to improve LLM simulation of human surveys |
multimodal |
|
|
| 27 |
Language-Aware Token Boosting: LLM Language Confusion Reduction Without Tuning |
Proposes a tuning-free, language-aware token boosting method to reduce language confusion |
large language model |
✅ |
|