| 1 |
TCM-DiffRAG: Personalized Syndrome Differentiation Reasoning Method for Traditional Chinese Medicine based on Knowledge Graph and Chain of Thought |
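TCM-DiffRAG: a personalized syndrome differentiation and treatment method for Traditional Chinese Medicine based on knowledge graphs and chain of thought |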
large language model chain-of-thought |
|
|
| 2 |
A Mixture-of-Experts Model for Multimodal Emotion Recognition in Conversations |
Proposes the MiSTER-E model, which uses a mixture-of-experts mechanism to address multimodal fusion in conversational emotion recognition. |
large language model multimodal |
|
|
| 3 |
Parallel Continuous Chain-of-Thought with Jacobi Iteration |
Proposes PCCoT, a parallel continuous chain-of-thought method based on Jacobi iteration, to accelerate LLM inference. |
large language model chain-of-thought |
|
|
| 4 |
Modality Collapse as Mismatched Decoding: Information-Theoretic Limits of Multimodal LLMs |
Frames the mismatched decoding problem in multimodal LLMs, with the aim of improving information extraction. |
multimodal |
|
|
| 5 |
Inference-Cost-Aware Dynamic Tree Construction for Efficient Inference in Large Language Models |
Proposes CAST, an inference-cost-aware dynamic tree construction method, to accelerate LLM inference. |
large language model |
|
|
| 6 |
Under the Influence: Quantifying Persuasion and Vigilance in Large Language Models |
Quantifies persuasion and vigilance in large language models, offering a new perspective on AI safety. |
large language model |
|
|
| 7 |
Tokenization, Fusion and Decoupling: Bridging the Granularity Mismatch Between Large Language Models and Knowledge Graphs |
Proposes the KGT framework to address the granularity mismatch between large language models and knowledge graphs. |
large language model |
|
|
| 8 |
Probing for Knowledge Attribution in Large Language Models |
Proposes AttriWiki, a self-supervised data pipeline for probing knowledge attribution in large language models. |
large language model |
|
|
| 9 |
Mind the Gap in Cultural Alignment: Task-Aware Culture Management for Large Language Models |
Proposes CultureManager to address task-dependent cultural alignment in large language models. |
large language model |
|
|
| 10 |
When Large Multimodal Models Confront Evolving Knowledge: Challenges and Explorations |
Proposes the MMEVOKE benchmark to explore the challenges and methods of injecting evolving multimodal knowledge into large models. |
multimodal |
|
|
| 11 |
DeVisE: Behavioral Testing of Medical Large Language Models |
DeVisE: evaluates the robustness of medical large language models in clinical reasoning through behavioral testing. |
large language model |
|
|
| 12 |
Large Language Models are Algorithmically Blind |
Reveals a limitation of large language models in algorithmic reasoning: algorithmic blindness. |
large language model |
|
|
| 13 |
Not All Attention is Needed: Parameter and Computation Efficient Transfer Learning for Multi-modal Large Language Models |
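Proposes the Efficient Attention Skipping (EAS) method, which accelerates inference in multimodal large language models while remaining parameter-efficient. |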
large language model |
|
|
| 14 |
Natural Language Declarative Prompting (NLD-P): A Modular Governance Method for Prompt Design Under Model Drift |
Proposes Natural Language Declarative Prompting (NLD-P) to address prompt-engineering challenges under large language model drift. |
large language model instruction following |
|
|
| 15 |
Imagination Helps Visual Reasoning, But Not Yet in Latent Space |
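Questions the effectiveness of latent-space reasoning and proposes CapImagine, an explicit text-based imagination method, to improve visual reasoning performance. |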
large language model multimodal |
|
|
| 16 |
SQaLe: A Large Text-to-SQL Corpus Grounded in Real Schemas |
Proposes SQaLe, a large-scale Text-to-SQL dataset grounded in real schemas, to improve model generalization. |
large language model |
|
|
| 17 |
Causality $\neq$ Invariance: Function and Concept Vectors in LLMs |
Shows that function vectors in LLMs are not invariant and proposes concept vectors to improve cross-domain generalization. |
large language model |
|
|
| 18 |
TARAZ: Persian Short-Answer Question Benchmark for Cultural Evaluation of Language Models |
Proposes TARAZ, a Persian short-answer question benchmark for evaluating the cultural understanding of language models. |
large language model |
|
|
| 19 |
Assessing Deanonymization Risks with Stylometry-Assisted LLM Agent |
Proposes the SALA framework, which uses stylometric features to help an LLM agent assess and reduce deanonymization risks in text data. |
large language model |
|
|
| 20 |
Generative Value Conflicts Reveal LLM Priorities |
ConflictScope: reveals LLM priorities under value conflicts and proposes a system-prompt alignment method. |
large language model |
|
|
| 21 |
Scaling In, Not Up? Testing Thick Citation Context Analysis with GPT-5 and Fragile Prompts |
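Evaluates GPT-5 on citation context analysis through in-depth text analysis rather than scaling up type labels, and examines the effect of fragile prompts. |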
large language model |
|
|
| 22 |
Ruyi2 Technical Report |
Ruyi2: an adaptive-depth computation acceleration scheme based on the Familial Model. |
large language model |
|
|
| 23 |
Towards Simulating Social Media Users with LLMs: Evaluating the Operational Validity of Conditioned Comment Prediction |
Proposes the Conditioned Comment Prediction (CCP) task to evaluate the validity of LLMs in simulating social media user behavior. |
large language model |
|
|
| 24 |
Test-Time Scaling with Diffusion Language Models via Reward-Guided Stitching |
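Proposes a test-time scaling method for diffusion language models based on reward-guided stitching, improving performance on complex reasoning tasks. |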
large language model |
|
|
| 25 |
Bridging Latent Reasoning and Target-Language Generation via Retrieval-Transition Heads |
Bridges latent reasoning and target-language generation via retrieval-transition heads. |
chain-of-thought |
|
|
| 26 |
Rejection Mixing: Fast Semantic Propagation of Mask Tokens for Efficient DLLM Inference |
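Proposes ReMix, which accelerates DLLM inference through semantic propagation in continuous space and addresses the combinatorial contradiction problem. |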
large language model |
|
|
| 27 |
MTRAG-UN: A Benchmark for Open Challenges in Multi-Turn RAG Conversations |
MTRAG-UN: a benchmark dataset for open challenges in multi-turn RAG conversations. |
large language model |
|
|
| 28 |
Can LLMs Simulate Human Behavioral Variability? A Case Study in the Phonemic Fluency Task |
Evaluates the ability of large language models to simulate human behavioral variability in the phonemic fluency task. |
large language model |
|
|
| 29 |
A Third Paradigm for LLM Evaluation: Dialogue Game-Based Evaluation using clembench |
Proposes clembench, a dialogue game-based LLM evaluation framework that is easy to extend and reuse. |
large language model |
|
|
| 30 |
UPDESH: Synthesizing Grounded Instruction Tuning Data for 13 Indic Languages |
UPDESH: synthesizes instruction-tuning data for 13 Indic languages to improve multilingual AI performance. |
instruction following |
|
|
| 31 |
Fine-tuning Done Right in Model Editing |
Rethinking fine-tuning: proposes LocFT-BF, which significantly improves model editing performance and scales to larger models. |
large language model |
|
|