| 1 |
Learning Domain Knowledge in Multimodal Large Language Models through Reinforcement Fine-Tuning |
Proposes a reinforcement fine-tuning method for learning domain knowledge in multimodal large language models |
large language model multimodal |
|
|
| 2 |
AuroraEdge-V-2B: A Faster And Stronger Edge Visual Large Language Model |
Proposes AuroraEdge-V-2B, a fast and powerful edge visual large language model that accelerates deployment in industrial applications. |
large language model multimodal |
|
|
| 3 |
Trapped in the past? Disentangling fluid and crystallized intelligence of large language models using chess |
Uses chess to disentangle the fluid and crystallized intelligence of large language models |
large language model |
|
|
| 4 |
Standardizing Longitudinal Radiology Report Evaluation via Large Language Model Annotation |
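Proposes an LLM-based pipeline for automatically annotating longitudinal information in radiology reports, improving the standardization of report evaluation. |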
large language model |
|
|
| 5 |
Cite-While-You-Generate: Training-Free Evidence Attribution for Multimodal Clinical Summarization |
Proposes a training-free evidence attribution framework for multimodal clinical summarization. |
multimodal |
|
|
| 6 |
Strategies for Span Labeling with Large Language Models |
For span labeling with LLMs, proposes LogitMatch, a constrained decoding method that improves matching accuracy. |
large language model |
|
|
| 7 |
Large Language Models as Automatic Annotators and Annotation Adjudicators for Fine-Grained Opinion Analysis |
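Uses large language models as automatic annotators and adjudicators to address the data annotation problem in fine-grained opinion analysis. |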
large language model |
|
|
| 8 |
Exploring the Effects of Alignment on Numerical Bias in Large Language Models |
Studies how alignment affects numerical bias in large language models and proposes mitigation strategies. |
large language model |
|
|
| 9 |
Persuasion Tokens for Editing Factual Knowledge in LLMs |
Proposes persuasion tokens (P-Tokens) for efficient factual knowledge editing in LLMs without requiring specific examples. |
large language model |
|
|
| 10 |
Curate-Train-Refine: A Closed-Loop Agentic Framework for Zero Shot Classification |
Proposes the Curate-Train-Refine framework, which uses an LLM to dynamically generate supervision signals for zero-shot classification |
large language model |
|
|
| 11 |
White-Box Sensitivity Auditing with Steering Vectors |
Proposes a white-box sensitivity auditing framework based on steering vectors for assessing latent bias in LLMs. |
large language model |
✅ |
|
| 12 |
PLawBench: A Rubric-Based Benchmark for Evaluating LLMs in Real-World Legal Practice |
PLawBench: a rubric-based benchmark for evaluating LLMs in real-world legal practice |
large language model |
✅ |
|
| 13 |
Attention-MoA: Enhancing Mixture-of-Agents via Inter-Agent Semantic Attention and Deep Residual Synthesis |
Proposes Attention-MoA, which improves Mixture-of-Agents performance through an inter-agent semantic attention mechanism. |
large language model |
|
|
| 14 |
LOGICAL-COMMONSENSEQA: A Benchmark for Logical Commonsense Reasoning |
Proposes the LOGICAL-COMMONSENSEQA benchmark for evaluating logical composition in commonsense reasoning. |
chain-of-thought |
|
|
| 15 |
Jacobian Scopes: token-level causal attributions in LLMs |
Proposes Jacobian Scopes for quantifying token-level causal attributions in LLMs, revealing the key factors that influence model predictions. |
large language model |
✅ |
|
| 16 |
Cross-Lingual Activation Steering for Multilingual Language Models |
Proposes Cross-Lingual Activation Steering (CLAS), which improves the performance of multilingual models on low-resource languages. |
large language model |
|
|
| 17 |
Select or Project? Evaluating Lower-dimensional Vectors for LLM Training Data Explanations |
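Proposes an architecture-informed gradient selection method that improves the efficiency and accuracy of LLM training data explanations |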
large language model |
|
|
| 18 |
MultiLexNorm++: A Unified Benchmark and a Generative Model for Lexical Normalization for Asian Languages |
Proposes the MultiLexNorm++ benchmark and an LLM-based generative model for lexical normalization of Asian languages |
large language model |
|
|
| 19 |
How Does Personalized Memory Shape LLM Behavior? Benchmarking Rational Preference Utilization in Personalized Assistants |
Proposes the RPEval benchmark and the RP-Reasoner model to address irrational memory utilization in personalized LLM assistants |
large language model |
✅ |
|
| 20 |
PROST-LLM: Progressively Enhancing the Speech-to-Speech Translation Capability in LLMs |
PROST-LLM: progressively enhancing the speech-to-speech translation capability of large language models |
large language model |
|
|
| 21 |
Retrieve-Refine-Calibrate: A Framework for Complex Claim Fact-Checking |
Proposes the Retrieve-Refine-Calibrate framework, which improves the accuracy of fact-checking complex claims. |
large language model |
|
|
| 22 |
SearchLLM: Detecting LLM Paraphrased Text by Measuring the Similarity with Regeneration of the Candidate Source via Search Engine |
Proposes SearchLLM, which uses a search engine to help detect LLM-paraphrased text. |
large language model |
|
|
| 23 |
TL-GRPO: Turn-Level RL for Reasoning-Guided Iterative Optimization |
Proposes the Turn-Level GRPO algorithm for fine-grained, turn-level optimization in iterative optimization tasks. |
large language model |
|
|