| # | Title | Summary | Keywords | |
|---|-------|---------|----------|---|
| 1 | Dual Knowledge-Enhanced Two-Stage Reasoner for Multimodal Dialog Systems | Proposes the DK2R model, which uses dual knowledge to enhance textual response generation in multimodal dialog systems. | large language model; multimodal | |
| 2 | Bias after Prompting: Persistent Discrimination in Large Language Models | Reveals bias in large language models after prompting: discrimination persists. | large language model | |
| 3 | GENUINE: Graph Enhanced Multi-level Uncertainty Estimation for Large Language Models | GENUINE: graph-enhanced multi-level uncertainty estimation that improves the reliability of large language models. | large language model | ✅ |
| 4 | Automated Item Neutralization for Non-Cognitive Scales: A Large Language Model Approach to Reducing Social-Desirability Bias | Uses large language models to reduce social-desirability bias in personality assessment. | large language model | |
| 5 | Are Humans as Brittle as Large Language Models? | Compares humans with large language models, revealing the impact of prompt modifications on text classification tasks. | large language model | |
| 6 | M-BRe: Discovering Training Samples for Relation Extraction from Unlabeled Texts with Large Language Models | Proposes the M-BRe framework, which uses large language models to efficiently mine relation-extraction training samples from unlabeled texts. | large language model | |
| 7 | NOWJ@COLIEE 2025: A Multi-stage Framework Integrating Embedding Models and Large Language Models for Legal Retrieval and Entailment | The NOWJ team proposes a multi-stage framework that integrates embedding models and large language models for legal retrieval and entailment tasks. | large language model | |
| 8 | LongEmotion: Measuring Emotional Intelligence of Large Language Models in Long-Context Interaction | LongEmotion: a benchmark for evaluating the emotional intelligence of large language models in long-context interaction. | large language model | ✅ |
| 9 | Are LLMs Enough for Hyperpartisan, Fake, Polarized and Harmful Content Detection? Evaluating In-Context Learning vs. Fine-Tuning | Compares in-context learning with fine-tuning to evaluate large language models' ability to detect harmful content. | large language model; chain-of-thought | |
| 10 | VeriOS: Query-Driven Proactive Human-Agent-GUI Interaction for Trustworthy OS Agents | Proposes VeriOS, which improves the reliability of OS agents in untrustworthy environments through query-driven human-agent interaction. | large language model; multimodal | ✅ |
| 11 | Dynamic Prompt Fusion for Multi-Task and Cross-Domain Adaptation in LLMs | Proposes a dynamic prompt fusion framework that improves LLM generalization in multi-task and cross-domain scenarios. | large language model | |
| 12 | Evolution and compression in LLMs: On the emergence of human-aligned categorization | Shows that large language models can evolve human-aligned semantic categorization systems through iterated learning. | large language model | |
| 13 | No for Some, Yes for Others: Persona Prompts and Other Sources of False Refusal in Language Models | Shows that persona prompts can cause false refusals in LLMs, though the effect may be overestimated. | large language model | |
| 14 | SimpleQA Verified: A Reliable Factuality Benchmark to Measure Parametric Knowledge | Proposes SimpleQA Verified for reliably evaluating the factuality of LLM parametric knowledge and mitigating hallucination. | large language model | |
| 15 | Biased Tales: Cultural and Topic Bias in Generating Children's Stories | Biased Tales: reveals and analyzes cultural and topic bias in LLM-generated children's stories. | large language model | |
| 16 | From Detection to Mitigation: Addressing Gender Bias in Chinese Texts via Efficient Tuning and Voting-Based Rebalancing | Proposes a method for detecting and mitigating gender bias in Chinese texts based on LoRA fine-tuning and a voting mechanism. | large language model | |
| 17 | ALLabel: Three-stage Active Learning for LLM-based Entity Recognition using Demonstration Retrieval | Proposes ALLabel, a three-stage active learning framework based on demonstration retrieval, to improve LLM performance in entity recognition. | large language model | |
| 18 | Talking with Oompa Loompas: A novel framework for evaluating linguistic acquisition of LLM agents | Proposes the Tinkatongue framework to evaluate LLM agents' ability to learn a new language through interaction. | large language model | |
| 19 | PersonaFuse: A Personality Activation-Driven Framework for Enhancing Human-LLM Interactions | PersonaFuse: a personality-activation-driven framework for enhancing human-LLM interactions. | large language model | |
| 20 | Does This Look Familiar to You? Knowledge Analysis via Model Internal Representations | Proposes the KAMIR method, which selects training data efficiently by analyzing model internal representations, improving model generalization. | large language model | |