| 1 |
Multimodal Large Language Models Meet Multimodal Emotion Recognition and Reasoning: A Survey |
Surveys the applications and challenges of multimodal large language models in emotion recognition and reasoning |
large language model, multimodal |
✅ |
|
| 2 |
Metaphor identification using large language models: A comparison of RAG, prompt engineering, and fine-tuning |
Identifies metaphors with large language models, comparing RAG, prompt engineering, and fine-tuning |
large language model, chain-of-thought |
|
|
| 3 |
Dual Mechanisms of Value Expression: Intrinsic vs. Prompted Values in Large Language Models |
Reveals dual mechanisms behind intrinsic and prompted values in large language models |
large language model, instruction following |
|
|
| 4 |
Prompt and Parameter Co-Optimization for Large Language Models |
Proposes the MetaTuner framework, which jointly optimizes prompts and parameters to improve large language model performance |
large language model |
|
|
| 5 |
Can Large Language Models Express Uncertainty Like Human? |
Proposes a linguistic-confidence method to improve how large language models express uncertainty |
large language model |
|
|
| 6 |
Beyond Overall Accuracy: A Psychometric Deep Dive into the Topic-Specific Medical Capabilities of 80 Large Language Models |
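Proposes the MedIRT framework, which uses item response theory to assess the medical capabilities of LLMs and reveal model strengths and weaknesses. |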
large language model |
|
|
| 7 |
Towards Structured Knowledge: Advancing Triple Extraction from Regional Trade Agreements using Large Language Models |
Uses large language models to extract structured knowledge from regional trade agreements and build a trade knowledge graph. |
large language model |
|
|
| 8 |
Learning to Parallel: Accelerating Diffusion Large Language Models via Learnable Parallel Decoding |
Proposes Learn2PD to address the inference-speed bottleneck of diffusion large language models |
large language model |
|
|
| 9 |
Pretraining Large Language Models with NVFP4 |
Proposes an NVFP4 training method for stable, efficient pretraining of large language models at 4-bit precision. |
large language model |
|
|
| 10 |
GateMABSA: Aspect-Image Gated Fusion for Multimodal Aspect-based Sentiment Analysis |
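Proposes the GateMABSA model, which uses gated multimodal fusion to address noise filtering and cross-modal alignment in multimodal sentiment analysis. |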
multimodal |
|
|
| 11 |
Understanding the Dilemma of Unlearning for Large Language Models |
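Proposes the unPact framework, revealing unreliable knowledge forgetting and the catastrophic-forgetting dilemma in large language models |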
large language model |
|
|
| 12 |
Sanitize Your Responses: Mitigating Privacy Leakage in Large Language Models |
Proposes the Self-Sanitize framework to mitigate privacy leakage in large language models |
large language model |
✅ |
|
| 13 |
CDT: A Comprehensive Capability Framework for Large Language Models Across Cognition, Domain, and Task |
Proposes the CDT framework, which evaluates large language model capabilities along three dimensions: cognition, domain, and task. |
large language model |
✅ |
|
| 14 |
AlignX: Advancing Multilingual Large Language Models with Multilingual Representation Alignment |
AlignX: improves multilingual large language models through multilingual representation alignment |
large language model |
|
|
| 15 |
DiffuGuard: How Intrinsic Safety is Lost and Found in Diffusion Large Language Models |
DiffuGuard: reveals and repairs intrinsic safety vulnerabilities in diffusion large language models |
large language model |
✅ |
|
| 16 |
MobileLLM-R1: Exploring the Limits of Sub-Billion Language Model Reasoners with Open Training Recipes |
MobileLLM-R1: explores the limits of reasoning in sub-billion-parameter language models with open training recipes |
large language model, chain-of-thought |
|
|
| 17 |
InfLLM-V2: Dense-Sparse Switchable Attention for Seamless Short-to-Long Adaptation |
Proposes InfLLM-V2, a dense-sparse switchable attention mechanism that lets models adapt seamlessly from short to long sequences. |
large language model, chain-of-thought |
✅ |
|
| 18 |
AdaThink-Med: Medical Adaptive Thinking with Uncertainty-Guided Length Calibration |
AdaThink-Med: proposes a medical adaptive-thinking framework with uncertainty-guided length calibration |
large language model, chain-of-thought |
|
|
| 19 |
AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play |
AceSearcher: bootstraps reasoning and search in LLMs through reinforced self-play, improving performance on complex reasoning tasks. |
large language model |
✅ |
|
| 20 |
SPECTRA: Revealing the Full Spectrum of User Preferences via Distributional LLM Inference |
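SPECTRA: reveals the full spectrum of user preferences via distributional LLM inference, addressing long-tail preferences in recommender systems. |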
large language model |
|
|
| 21 |
Task Vectors, Learned Not Extracted: Performance Gains and Mechanistic Insight |
Proposes learned task vectors (LTV), improving ICL performance and offering mechanistic insight |
large language model |
|
|
| 22 |
Localizing Task Recognition and Task Learning in In-Context Learning via Attention Head Analysis |
Localizes task recognition and task learning in in-context learning through attention head analysis |
large language model |
|
|
| 23 |
Calibrating Verbalized Confidence with Self-Generated Distractors |
Proposes DINCO, which calibrates LLM confidence with self-generated distractors to improve reliability. |
large language model |
|
|
| 24 |
Not Wrong, But Untrue: LLM Overconfidence in Document-Based Queries |
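LLM overconfidence in document-based question answering: reveals hallucination problems and attribution challenges in news settings |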
large language model |
|
|
| 25 |
The Rise of AfricaNLP: Contributions, Contributors, and Community Impact (2005-2025) |
Traces the development of African NLP: contribution analysis, contributor identification, and community impact assessment |
large language model |
|
|
| 26 |
Fingerprinting LLMs via Prompt Injection |
LLMPrint: builds robust fingerprints for LLMs via prompt injection, enabling model provenance tracing |
large language model |
|
|
| 27 |
Generative Value Conflicts Reveal LLM Priorities |
ConflictScope: reveals LLM priorities under value conflicts and proposes a system-prompt alignment method. |
large language model |
|
|
| 28 |
From Internal Representations to Text Quality: A Geometric Approach to LLM Evaluation |
Uses geometric properties of LLM internal representations to assess text quality, enabling reference-free text quality evaluation |
large language model |
|
|
| 29 |
Investigating Language and Retrieval Bias in Multilingual Previously Fact-Checked Claim Detection |
Investigates language and retrieval bias in multilingual detection of previously fact-checked claims |
large language model |
|
|
| 30 |
Learning from Convenience Samples: A Case Study on Fine-Tuning LLMs for Survey Non-response in the German Longitudinal Election Study |
Fine-tunes LLMs on convenience samples to address survey non-response in the German Longitudinal Election Study |
large language model |
|
|
| 31 |
Hyperdimensional Probe: Decoding LLM Representations via Vector Symbolic Architectures |
Proposes a hyperdimensional probe that decodes large language model representations via vector symbolic architectures |
large language model |
|
|
| 32 |
How Well Do LLMs Imitate Human Writing Style? |
Proposes a training-free framework for style-imitation analysis to evaluate how well LLMs imitate human writing style |
large language model |
|
|
| 33 |
BOE-XSUM: Extreme Summarization in Clear Language of Spanish Legal Decrees and Notifications |
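BOE-XSUM: releases a dataset of clear-language extreme summaries of Spanish legal decrees and notifications, and validates the effectiveness of fine-tuned LLMs |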
large language model |
|
|
| 34 |
Expanding Computation Spaces of LLMs at Inference Time |
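Proposes a method for expanding the computation space of LLMs at inference time, improving performance on open-domain question answering and math tasks |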
chain-of-thought |
|
|
| 35 |
SemShareKV: Efficient KVCache Sharing for Semantically Similar Prompts via Token-Level LSH Matching |
SemShareKV: efficient KV cache sharing for semantically similar prompts via token-level LSH matching |
large language model |
|
|
| 36 |
Hallucination is Inevitable for LLMs with the Open World Assumption |
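Redefines LLM hallucination: under the open-world assumption, hallucination is an inevitable consequence of the generalization problem in large language models |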
large language model |
|
|
| 37 |
ProxyAttn: Guided Sparse Attention via Representative Heads |
ProxyAttn: a sparse attention mechanism guided by representative attention heads that accelerates long-text processing. |
large language model |
✅ |
|
| 38 |
Think Twice, Generate Once: Safeguarding by Progressive Self-Reflection |
Proposes Progressive Self-Reflection (PSR) to improve the safety of LLMs in generation tasks. |
large language model |
|
|
| 39 |
MemGen: Weaving Generative Latent Memory for Self-Evolving Agents |
MemGen: builds generative latent memory for self-evolving agents, improving their cognitive capabilities |
large language model |
|
|
| 40 |
Bias Mitigation or Cultural Commonsense? Evaluating LLMs with a Japanese Dataset |
Proposes SOBACO, a Japanese benchmark for evaluating how LLM bias mitigation affects cultural commonsense |
large language model |
|
|
| 41 |
HarmMetric Eval: Benchmarking Metrics and Judges for LLM Harmfulness Assessment |
Proposes HarmMetric Eval for comprehensively evaluating the quality of metrics and judges used in LLM harmfulness assessment. |
large language model |
✅ |
|
| 42 |
MAS$^2$: Self-Generative, Self-Configuring, Self-Rectifying Multi-Agent Systems |
Proposes MAS$^2$, a self-generative, self-configuring, self-rectifying multi-agent system that improves performance on complex tasks. |
large language model |
✅ |
|
| 43 |
How Training Data Shapes the Use of Parametric and In-Context Knowledge in Language Models |
Reveals how properties of training data affect language models' use of parametric and in-context knowledge |
large language model |
|
|
| 44 |
SimuHome: A Temporal- and Environment-Aware Benchmark for Smart Home LLM Agents |
SimuHome: a temporal- and environment-aware benchmark for smart home LLM agents |
large language model |
|
|