| 1 |
MERRIN: A Benchmark for Multimodal Evidence Retrieval and Reasoning in Noisy Web Environments |
MERRIN: a benchmark for evaluating multimodal evidence retrieval and reasoning in noisy web environments. |
multimodal |
|
|
| 2 |
Adaptive Conformal Prediction for Improving Factuality of Generations by Large Language Models |
Proposes an adaptive conformal prediction method that improves the factuality of large language model generations. |
large language model |
|
|
| 3 |
From Anchors to Supervision: Memory-Graph Guided Corpus-Free Unlearning for Large Language Models |
Proposes MAGE, a framework for memory-graph-guided, corpus-free unlearning in large language models. |
large language model |
|
|
| 4 |
Correct Prediction, Wrong Steps? Consensus Reasoning Knowledge Graph for Robust Chain-of-Thought Synthesis |
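CRAFT: a chain-of-thought synthesis method based on a consensus reasoning knowledge graph that improves the robustness of LLM reasoning. |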
chain-of-thought |
|
|
| 5 |
Dual-Enhancement Product Bundling: Bridging Interactive Graph and Large Language Model |
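Proposes a dual-enhancement product bundling method that combines interaction-graph learning with large language models to improve e-commerce recommendation. |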
large language model |
|
|
| 6 |
Beyond Static Personas: Situational Personality Steering for Large Language Models |
IRIS: a training-free framework for situational personality steering of large language models. |
large language model |
|
|
| 7 |
Robust Reward Modeling for Large Language Models via Causal Decomposition |
Proposes a robust reward model based on causal decomposition that improves large language model alignment. |
large language model |
|
|
| 8 |
Foresight Optimization for Strategic Reasoning in Large Language Models |
Proposes FoPO, which strengthens the strategic reasoning of large language models in multi-agent environments. |
large language model |
|
|
| 9 |
Empirical Evidence of Complexity-Induced Limits in Large Language Models on Finite Discrete State-Space Problems with Explicit Validity Constraints |
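Proposes a controlled benchmarking framework that reveals a collapse in large language model reasoning as problem complexity increases. |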
large language model |
|
|
| 10 |
TLoRA+: A Low-Rank Parameter-Efficient Fine-Tuning Method for Large Language Models |
Proposes TLoRA+, an efficient low-rank fine-tuning method that improves large language model performance on specific tasks. |
large language model |
|
|
| 11 |
MedRCube: A Multidimensional Framework for Fine-Grained and In-Depth Evaluation of MLLMs in Medical Imaging |
MedRCube: a multidimensional framework for fine-grained, in-depth evaluation of multimodal large language models in medical imaging. |
large language model multimodal |
✅ |
|
| 12 |
Synthesizing Instruction-Tuning Datasets with Contrastive Decoding |
Proposes CoDIT, which synthesizes instruction-tuning datasets through contrastive decoding to improve instruction following. |
large language model instruction following |
|
|
| 13 |
From Relevance to Authority: Authority-aware Generative Retrieval in Web Search Engines |
Proposes the AuthGR framework, which incorporates authority into generative retrieval to improve the reliability of web search. |
large language model multimodal |
|
|
| 14 |
Rhetorical Questions in LLM Representations: A Linear Probing Study |
Uses linear probing to study how LLMs represent rhetorical questions, revealing that they are encoded in multiple ways. |
large language model |
|
|
| 15 |
How Can We Synthesize High-Quality Pretraining Data? A Systematic Study of Prompt Design, Generator Model, and Source Data |
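Systematically studies how prompt design, generator model, and source data affect the quality of synthetic pretraining data, and introduces the FinePhrase dataset. |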
large language model |
|
|
| 16 |
Calibrated Speculative Decoding: Frequency-Guided Candidate Selection for Efficient Inference |
Proposes Calibrated Speculative Decoding (CSD), which accelerates LLM inference through frequency-guided candidate selection. |
large language model |
|
|
| 17 |
IndicDB -- Benchmarking Multilingual Text-to-SQL Capabilities in Indian Languages |
Proposes IndicDB for evaluating multilingual Text-to-SQL capabilities in Indian languages. |
large language model |
|
|
| 18 |
BenGER: A Collaborative Web Platform for End-to-End Benchmarking of German Legal Tasks |
BenGER: a collaborative web platform for end-to-end evaluation of large language models on German legal tasks. |
large language model |
|
|
| 19 |
Interpretable Stylistic Variation in Human and LLM Writing Across Genres, Models, and Decoding Strategies |
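Uses lexico-grammatical features to analyze stylistic differences between human and LLM writing across genres and decoding strategies. |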
large language model |
|
|
| 20 |
From Where Words Come: Efficient Regularization of Code Tokenizers Through Source Attribution |
Proposes SA-BPE, which regularizes code tokenizers through source attribution to reduce under-trained tokens. |
large language model |
|
|
| 21 |
ToolOmni: Enabling Open-World Tool Use via Agentic learning with Proactive Retrieval and Grounded Execution |
ToolOmni: open-world tool use through agentic learning with proactive retrieval and grounded execution. |
large language model |
|
|
| 22 |
QuantileMark: A Message-Symmetric Multi-bit Watermark for LLMs |
QuantileMark: a message-symmetric multi-bit watermarking scheme for LLMs that improves watermark robustness. |
large language model |
✅ |
|
| 23 |
Co-FactChecker: A Framework for Human-AI Collaborative Claim Verification Using Large Reasoning Models |
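Proposes the Co-FactChecker framework, which uses human expert feedback to collaboratively strengthen the fact-checking ability of large reasoning models. |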
large language model |
|
|
| 24 |
Breaking the Generator Barrier: Disentangled Representation for Generalizable AI-Text Detection |
Proposes DRGD, a disentangled representation framework that improves the generalization of AI-text detection to unseen generators. |
large language model |
✅ |
|
| 25 |
YOCO++: Enhancing YOCO with KV Residual Connections for Efficient LLM Inference |
YOCO++: enhances YOCO with KV residual connections for efficient LLM inference. |
large language model |
|
|
| 26 |
ToolSpec: Accelerating Tool Calling via Schema-Aware and Retrieval-Augmented Speculative Decoding |
ToolSpec: accelerates tool calling through schema-aware and retrieval-augmented speculative decoding. |
large language model |
|
|