| 1 | Learning Chain Of Thoughts Prompts for Predicting Entities, Relations, and even Literals on Knowledge Graphs | Proposes RALP to address chain-style reasoning over knowledge graphs. | large language model chain-of-thought | ✅ | |
| 2 | Mining Large Language Models for Low-Resource Language Data: Comparing Elicitation Strategies for Hausa and Fongbe | Uses prompt engineering to mine low-resource language data from large language models. | large language model | | |
| 3 | PolicyLLM: Towards Excellent Comprehension of Public Policy for Large Language Models | Proposes PolicyBench and PolicyMoE to improve large language models' ability to understand and apply public policy. | large language model | | |
| 4 | Generating Effective CoT Traces for Mitigating Causal Hallucination | Proposes a CoT trace generation method that mitigates causal hallucination in small models on event causality identification. | large language model chain-of-thought | | |
| 5 | Token-Level Policy Optimization: Linking Group-Level Rewards to Token-Level Aggregation via Sequence-Level Likelihood | Proposes TEPO, which optimizes the LLM's token-level policy via sequence likelihood and a KL-divergence constraint, improving mathematical reasoning. | large language model chain-of-thought | | |
| 6 | TimeMark: A Trustworthy Time Watermarking Framework for Exact Generation-Time Recovery from AIGC | Proposes TimeMark, a trustworthy time-watermarking framework for exactly recovering generation time from AIGC. | large language model TAMP | | |
| 7 | SpecBound: Adaptive Bounded Self-Speculation with Layer-wise Confidence Calibration | SpecBound: adaptive bounded self-speculative decoding with layer-wise confidence calibration, accelerating LLM autoregressive inference. | large language model | | |
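| 8 | One Token Away from Collapse: The Fragility of Instruction-Tuned Helpfulness | Reveals the fragility of instruction-tuned large language models under minor lexical constraints and analyzes the underlying causes. | large language model | | |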
| 9 | MetFuse: Figurative Fusion between Metonymy and Metaphor | Proposes the MetFuse dataset, studies the fusion of metaphor and metonymy, and improves metaphor and metonymy identification. | large language model | ✅ | |
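| 10 | The role of System 1 and System 2 semantic memory structure in human and LLM biases | Analyzes differences between human and LLM biases through the structure of semantic memory networks, revealing the cognitive mechanisms involved. | large language model | | |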
| 11 | NaviRAG: Towards Active Knowledge Navigation for Retrieval-Augmented Generation | NaviRAG: an active knowledge navigation framework for retrieval-augmented generation. | large language model | | |
| 12 | Adaptive Test-Time Scaling for Zero-Shot Respiratory Audio Classification | Proposes the TRIAGE framework, which adaptively adjusts the compute used for zero-shot respiratory audio classification. | large language model | | |
| 13 | Transforming External Knowledge into Triplets for Enhanced Retrieval in RAG of LLMs | Proposes Tri-RAG, which improves retrieval efficiency in RAG for LLMs through a structured triplet knowledge representation. | large language model | | |
| 14 | Latent-Condensed Transformer for Efficient Long Context Modeling | Proposes the Latent-Condensed Transformer to efficiently handle the KV-cache and computational-complexity problems of long-context modeling. | large language model | | |
| 15 | Agentic Insight Generation in VSM Simulations | Proposes a decoupled agent architecture that improves the accuracy and robustness of insight generation in VSM simulations. | large language model | | |
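| 16 | KoCo: Conditioning Language Model Pre-training on Knowledge Coordinates | Proposes KoCo, which guides language model pre-training with knowledge coordinates, improving downstream task performance and accelerating convergence. | large language model | | |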
| 17 | From Myopic Selection to Long-Horizon Awareness: Sequential LLM Routing for Multi-Turn Dialogue | DialRouter: a sequential LLM routing method for multi-turn dialogue that improves long-horizon interaction performance. | large language model | | |
| 18 | Masked by Consensus: Disentangling Privileged Knowledge in LLM Correctness | Masked by consensus: disentangling privileged knowledge in LLM correctness. | large language model | | |
| 19 | Compiling Activation Steering into Weights via Null-Space Constraints for Stealthy Backdoors | Compiles activation steering into weights via null-space constraints to achieve stealthy backdoor attacks. | large language model | | |
| 20 | CompliBench: Benchmarking LLM Judges for Compliance Violation Detection in Dialogue Systems | CompliBench: a new benchmark for evaluating LLMs' ability to detect violations in dialogue systems. | large language model | | |
| 21 | ContextLens: Modeling Imperfect Privacy and Safety Context for Legal Compliance | ContextLens: modeling imperfect privacy and safety context for legal compliance. | large language model | | |
| 22 | Towards Robust Real-World Spreadsheet Understanding with Multi-Agent Multi-Format Reasoning | Proposes SpreadsheetAgent, which achieves robust real-world spreadsheet understanding through multimodal, multi-agent reasoning. | large language model | ✅ | |
| 23 | CodeSpecBench: Benchmarking LLMs for Executable Behavioral Specification Generation | CodeSpecBench: a benchmark for evaluating LLMs' ability to generate executable behavioral specifications. | large language model | ✅ | |
| 24 | Thought-Retriever: Don't Just Retrieve Raw Data, Retrieve Thoughts for Memory-Augmented Agentic Systems | Thought-Retriever: enhances memory-augmented agentic systems by retrieving thoughts rather than raw data. | large language model | | |
| 25 | Beyond Majority Voting: Efficient Best-Of-N with Radial Consensus Score | Proposes an efficient Best-of-N method based on the Radial Consensus Score, improving the reliability of LLM answer selection. | large language model | | |
| 26 | AgenticAI-DialogGen: Topic-Guided Conversation Generation for Fine-Tuning and Evaluating Short- and Long-Term Memories of LLMs | AgenticAI-DialogGen: a topic-guided conversation generation framework for fine-tuning and evaluating LLM memory. | large language model | | |