| 1 |
Beyond Spurious Signals: Debiasing Multimodal Large Language Models via Counterfactual Inference and Adaptive Expert Routing |
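Proposes a debiasing framework for multimodal large language models based on causal inference and adaptive expert routing, improving robustness on complex reasoning tasks. |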
large language model multimodal |
|
|
| 2 |
Evaluating Multimodal Large Language Models on Spoken Sarcasm Understanding |
Evaluates multimodal large language models on spoken sarcasm understanding. |
large language model multimodal |
|
|
| 3 |
Red Teaming Multimodal Language Models: Evaluating Harm Across Prompt Modalities and Models |
Red-teaming evaluation of multimodal language models: assessing harmfulness across prompt modalities and comparing models. |
large language model multimodal |
|
|
| 4 |
Decoupled Proxy Alignment: Mitigating Language Prior Conflict for Multimodal Alignment in MLLM |
Proposes Decoupled Proxy Alignment (DPA) to mitigate language prior conflict in MLLMs and improve vision-language alignment performance. |
large language model multimodal |
✅ |
|
| 5 |
Quantifying Uncertainty in Natural Language Explanations of Large Language Models for Question Answering |
Proposes an uncertainty quantification framework for natural language explanations in large language model question answering. |
large language model |
|
|
| 6 |
SMARTER: A Data-efficient Framework to Improve Toxicity Detection with Explanation via Self-augmenting Large Language Models |
SMARTER: uses self-augmenting large language models to improve toxicity detection data-efficiently while providing explanations. |
large language model |
|
|
| 7 |
Semantic Representation Attack against Aligned Large Language Models |
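Proposes a semantic representation attack that improves the success rate and naturalness of adversarial attacks on large language models. |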
large language model |
|
|
| 8 |
CLEAR: A Comprehensive Linguistic Evaluation of Argument Rewriting by Large Language Models |
CLEAR: proposes a comprehensive linguistic evaluation pipeline for assessing large language models on argument rewriting. |
large language model |
|
|
| 9 |
Quantifying Self-Awareness of Knowledge in Large Language Models |
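Proposes the AQE method to quantify question-side effects in large language models' self-awareness of knowledge, and the SCAO method to strengthen model-side signals. |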
large language model |
|
|
| 10 |
LNE-Blocking: An Efficient Framework for Contamination Mitigation Evaluation on Large Language Models |
Proposes the LNE-Blocking framework to effectively evaluate and mitigate data contamination in large language models. |
large language model |
✅ |
|
| 11 |
Assessing Historical Structural Oppression Worldwide via Rule-Guided Prompting of Large Language Models |
Uses rule-guided prompting of large language models to assess historical structural oppression worldwide. |
large language model |
✅ |
|
| 12 |
What's the Best Way to Retrieve Slides? A Comparative Study of Multimodal, Caption-Based, and Hybrid Retrieval Techniques |
Compares multimodal, text-based, and hybrid retrieval techniques to find the best approach to slide retrieval. |
multimodal |
|
|
| 13 |
Fair-GPTQ: Bias-Aware Quantization for Large Language Models |
Fair-GPTQ: a bias-aware quantization method for large language models that improves fairness. |
large language model |
|
|
| 14 |
Large Language Model probabilities cannot distinguish between possible and impossible language |
Large language model probabilities cannot distinguish possible from impossible language. |
large language model |
|
|
| 15 |
LLM-OREF: An Open Relation Extraction Framework Based on Large Language Models |
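Proposes LLM-OREF, an open relation extraction framework based on large language models that predicts new relations without human intervention. |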
large language model |
✅ |
|
| 16 |
A Comparative Evaluation of Large Language Models for Persian Sentiment Analysis and Emotion Detection in Social Media Texts |
Comparatively evaluates large language models on sentiment analysis and emotion detection in Persian social media texts. |
large language model |
|
|
| 17 |
Evaluating Large Language Models for Cross-Lingual Retrieval |
Evaluates large language models for cross-lingual retrieval, revealing interaction effects between retrievers and rerankers. |
large language model |
|
|
| 18 |
MUSE: MCTS-Driven Red Teaming Framework for Enhanced Multi-Turn Dialogue Safety in Large Language Models |
MUSE: an MCTS-based red-teaming framework for multi-turn dialogue safety in large language models. |
large language model |
✅ |
|
| 19 |
ParlAI Vote: A Web Platform for Analyzing Gender and Political Bias in Large Language Models |
ParlAI Vote: a web platform for analyzing gender and political bias in large language models. |
large language model |
|
|
| 20 |
Position: Thematic Analysis of Unstructured Clinical Transcripts with Large Language Models |
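Uses large language models for thematic analysis of unstructured clinical transcripts and proposes a standardized evaluation framework. |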
large language model |
|
|
| 21 |
TriSPrompt: A Hierarchical Soft Prompt Model for Multimodal Rumor Detection with Incomplete Modalities |
Proposes TriSPrompt to address missing modalities in multimodal rumor detection. |
multimodal |
|
|
| 22 |
UnifiedVisual: A Framework for Constructing Unified Vision-Language Datasets |
Proposes the UnifiedVisual framework for constructing unified vision-language datasets, advancing multimodal understanding and generation. |
large language model multimodal |
✅ |
|
| 23 |
TableDART: Dynamic Adaptive Multi-Modal Routing for Table Understanding |
TableDART: proposes a dynamic adaptive multimodal routing framework for table understanding. |
large language model multimodal |
|
|
| 24 |
An Evaluation-Centric Paradigm for Scientific Visualization Agents |
Proposes an evaluation paradigm for scientific visualization agents to advance agent capabilities and domain innovation. |
large language model |
|
|
| 25 |
LLM-Assisted Topic Reduction for BERTopic on Social Media Data |
Proposes an LLM-assisted topic reduction method for BERTopic, improving topic modeling on social media data. |
large language model |
|
|
| 26 |
PILOT: Steering Synthetic Data Generation with Psychological & Linguistic Output Targeting |
PILOT: steers synthetic data generation with psycholinguistic output targets, improving control precision. |
large language model |
|
|
| 27 |
TextMine: Data, Evaluation Framework and Ontology-guided LLM Pipeline for Humanitarian Mine Action |
TextMine proposes an ontology-guided LLM pipeline for knowledge extraction in humanitarian mine action. |
large language model |
|
|
| 28 |
Benchmarking and Improving LLM Robustness for Personalized Generation |
Proposes the PERG framework and the Pref-Aligner method to improve the factuality and robustness of LLMs in personalized generation. |
large language model |
|
|
| 29 |
RoadMind: Towards a Geospatial AI Expert for Disaster Response |
RoadMind: a geospatial AI expert system to assist disaster response. |
large language model |
|
|
| 30 |
Real, Fake, or Manipulated? Detecting Machine-Influenced Text |
Proposes the HERO model for distinguishing human-written, machine-generated, machine-polished, and machine-translated text. |
large language model |
|
|
| 31 |
PolBiX: Detecting LLMs' Political Bias in Fact-Checking through X-phemisms |
PolBiX: detects the political bias of large language models in fact-checking through euphemisms. |
large language model |
|
|
| 32 |
ATTS: Asynchronous Test-Time Scaling via Conformal Prediction |
Proposes ATTS, an asynchronous test-time scaling framework based on conformal prediction that significantly accelerates LLM inference. |
large language model |
✅ |
|
| 33 |
SynParaSpeech: Automated Synthesis of Paralinguistic Datasets for Speech Generation and Understanding |
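Proposes the SynParaSpeech framework for automatically synthesizing large-scale paralinguistic datasets, improving speech generation and understanding. |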
TAMP |
✅ |
|
| 34 |
Explicit vs. Implicit Biographies: Evaluating and Adapting LLM Information Extraction on Wikidata-Derived Texts |
Uses LoRA fine-tuning to improve LLMs' handling of implicit information when extracting information from Wikidata-derived texts. |
large language model |
|
|
| 35 |
LLM Agents at the Roundtable: A Multi-Perspective and Dialectical Reasoning Framework for Essay Scoring |
Proposes the Roundtable Essay Scoring (RES) framework, which uses multi-agent dialectical reasoning to improve automated essay scoring. |
large language model |
|
|
| 36 |
ReCoVeR the Target Language: Language Steering without Sacrificing Task Performance |
Proposes ReCoVeR, which uses language steering vectors to reduce language confusion in LLMs while preserving task performance. |
large language model |
✅ |
|
| 37 |
Reasoning over Boundaries: Enhancing Specification Alignment via Test-time Deliberation |
Proposes Align3 to address specification alignment in large language models. |
large language model |
|
|
| 38 |
Reveal and Release: Iterative LLM Unlearning with Self-generated Data |
Proposes the iterative Reveal-and-Release framework, which uses self-generated data for efficient unlearning in large language models. |
large language model |
|
|
| 39 |
Controlling Language Difficulty in Dialogues with Linguistic Features |
Proposes a dialogue system controlled by linguistic features, improving the use of LLMs in language learning. |
large language model |
|
|
| 40 |
Catch Me If You Can? Not Yet: LLMs Still Struggle to Imitate the Implicit Writing Styles of Everyday Authors |
Evaluates how well large language models imitate personal writing styles: current models fall short on informal writing. |
large language model |
|
|
| 41 |
Introducing OmniGEC: A Silver Multilingual Dataset for Grammatical Error Correction |
OmniGEC: presents a silver-standard multilingual dataset for grammatical error correction, advancing cross-lingual GEC models. |
large language model |
|
|