| 1 |
A Survey on Benchmarks of Multimodal Large Language Models |
A survey of evaluation benchmarks for multimodal large language models: comprehensive assessment and future directions |
large language model multimodal |
✅ |
|
| 2 |
Math-PUMA: Progressive Upward Multimodal Alignment to Enhance Mathematical Reasoning |
Math-PUMA: enhances mathematical reasoning through progressive upward multimodal alignment |
large language model multimodal |
✅ |
|
| 3 |
When Prompting Fails to Sway: Inertia in Moral and Value Judgments of Large Language Models |
Reveals inertia in the moral and value judgments of large language models, which persists even under prompt intervention. |
large language model |
|
|
| 4 |
PEDAL: Enhancing Greedy Decoding with Large Language Models using Diverse Exemplars |
PEDAL: uses diverse exemplars to enhance greedy decoding in large language models and improve text generation performance |
large language model |
|
|
| 5 |
PsychoLex: Unveiling the Psychological Mind of Large Language Models |
PsychoLex: builds and evaluates Persian and English large language models for psychology tasks |
large language model |
|
|
| 6 |
Large Language Models Might Not Care What You Are Saying: Prompt Format Beats Descriptions |
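Proposes an ensemble prompt framework and finds that large language models are insensitive to the content of prompt descriptions; prompt format matters more. |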
large language model |
|
|
| 7 |
Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling |
Proposes Token Recycling, which accelerates large language model inference without additional training. |
large language model |
|
|
| 8 |
MIA-Tuner: Adapting Large Language Models as Pre-training Text Detector |
MIA-Tuner: instruction-tunes large language models to serve as pre-training text detectors |
large language model |
|
|
| 9 |
Chain of Thought Still Thinks Fast: APriCoT Helps with Thinking Slow |
Proposes the APriCoT method, which mitigates language model bias on MMLU tasks and improves the robustness of reasoning. |
chain-of-thought |
|
|
| 10 |
Using large language models to estimate features of multi-word expressions: Concreteness, valence, arousal |
Uses large language models to estimate the concreteness, valence, and arousal of multi-word expressions |
large language model |
|
|
| 11 |
Collaborative Cross-modal Fusion with Large Language Model for Recommendation |
Proposes the CCF-LLM framework, which improves LLM performance in recommender systems through collaborative cross-modal fusion. |
large language model |
|
|
| 12 |
SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models |
SelectLLM: a query-aware algorithm for efficiently selecting LLMs, improving inference efficiency. |
large language model |
|
|
| 13 |
MuRAR: A Simple and Effective Multimodal Retrieval and Answer Refinement Framework for Multimodal Question Answering |
MuRAR: a simple and effective multimodal retrieval and answer refinement framework for multimodal question answering |
multimodal |
|
|
| 14 |
Med-PMC: Medical Personalized Multi-modal Consultation with a Proactive Ask-First-Observe-Next Paradigm |
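Proposes the Med-PMC evaluation framework for assessing the clinical capability of multimodal large language models in personalized multimodal medical consultation. |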
large language model multimodal |
✅ |
|
| 15 |
LLMs Are Biased Towards Output Formats! Systematically Evaluating and Mitigating Output Format Bias of LLMs |
Systematically evaluates and mitigates the output format bias of LLMs, improving the consistency of model performance across formats. |
large language model instruction following |
|
|
| 16 |
Ex3: Automatic Novel Writing by Extracting, Excelsior and Expanding |
Ex3: automatic novel writing through extraction, refinement, and expansion |
large language model instruction following |
|
|
| 17 |
FLEXTAF: Enhancing Table Reasoning with Flexible Tabular Formats |
FLEXTAF: enhances table reasoning through flexible tabular formats |
large language model |
|
|
| 18 |
EmoDynamiX: Emotional Support Dialogue Strategy Prediction by Modelling MiXed Emotions and Discourse Dynamics |
EmoDynamiX: predicts emotional support dialogue strategies by modeling mixed emotions and discourse dynamics |
large language model |
|
|
| 19 |
DAC: Decomposed Automation Correction for Text-to-SQL |
Proposes Decomposed Automation Correction (DAC), which improves the quality of SQL generated by LLMs in Text-to-SQL tasks. |
large language model |
|
|
| 20 |
Lower Layers Matter: Alleviating Hallucination via Multi-Layer Fusion Contrastive Decoding with Truthfulness Refocused |
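Proposes the LOL framework, which alleviates hallucination in large language models through multi-layer fusion contrastive decoding and truthfulness refocusing |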
large language model |
|
|
| 21 |
The Fellowship of the LLMs: Multi-Model Workflows for Synthetic Preference Optimization Dataset Generation |
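Proposes a method for generating synthetic preference optimization datasets based on multi-model workflows, improving the efficiency of dataset construction. |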
large language model |
|
|
| 22 |
CommunityKG-RAG: Leveraging Community Structures in Knowledge Graphs for Advanced Retrieval-Augmented Generation in Fact-Checking |
Proposes CommunityKG-RAG, which uses community structures in knowledge graphs to improve RAG performance in fact-checking |
large language model |
|
|