| 1 |
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation |
Surveys multimodal retrieval-augmented generation methods for addressing outdated knowledge in LLMs |
large language model multimodal |
✅ |
|
| 2 |
Mitigating Hallucinations in Multimodal Spatial Relations through Constraint-Aware Prompting |
Proposes a constraint-aware prompting framework to mitigate hallucinations in multimodal spatial relations |
multimodal |
|
|
| 3 |
Enhancing Auto-regressive Chain-of-Thought through Loop-Aligned Reasoning |
Proposes RELAY, which strengthens auto-regressive chain-of-thought reasoning through loop-aligned reasoning |
chain-of-thought |
✅ |
|
| 4 |
The Science of Evaluating Foundation Models |
Builds an evaluation framework to address the challenges large models face across diverse applications |
foundation model |
|
|
| 5 |
Exploring the Potential of Large Language Models to Simulate Personality |
Explores the potential of large language models to simulate personality traits and builds a related dataset. |
large language model |
|
|
| 6 |
Cancer Vaccine Adjuvant Name Recognition from Biomedical Literature using Large Language Models |
Uses large language models to recognize cancer vaccine adjuvant names in biomedical literature |
large language model |
|
|
| 7 |
GCoT: Chain-of-Thought Prompt Learning for Graphs |
Proposes GCoT, a chain-of-thought prompt learning framework for graph data that requires no text information. |
chain-of-thought |
|
|
| 8 |
A Systematic Review on the Evaluation of Large Language Models in Theory of Mind Tasks |
Systematically reviews how large language models are evaluated on theory of mind tasks |
large language model |
|
|
| 9 |
Systematic Knowledge Injection into Large Language Models via Diverse Augmentation for Domain-Specific RAG |
Proposes a knowledge injection method based on diverse augmentation to improve LLM performance in domain-specific RAG. |
large language model |
|
|
| 10 |
Data Augmentation to Improve Large Language Models in Food Hazard and Product Detection |
Uses ChatGPT-4o-mini data augmentation to improve LLM performance in food hazard and product detection |
large language model |
✅ |
|
| 11 |
Contextual Compression Encoding for Large Language Models: A Novel Framework for Multi-Layered Parameter Space Pruning |
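Proposes the Contextual Compression Encoding (CCE) framework for multi-layered parameter space pruning, improving the deployment efficiency of large language models. |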
large language model |
|
|
| 12 |
Word Synchronization Challenge: A Benchmark for Word Association Responses for Large Language Models |
Proposes the Word Synchronization Challenge, a benchmark for evaluating the word association abilities of large language models |
large language model |
|
|
| 13 |
Redefining Simplicity: Benchmarking Large Language Models from Lexical to Document Simplification |
Comprehensively benchmarks large language models on lexical, syntactic, sentence, and document simplification tasks |
large language model |
|
|
| 14 |
Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models |
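Proposes Selective Self-to-Supervised Fine-Tuning (S3FT) to improve the generalization of large language models and avoid overfitting. |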
large language model |
|
|
| 15 |
Contextual Subspace Manifold Projection for Structural Refinement of Large Language Model Representations |
Proposes contextual subspace manifold projection for structural refinement of large language model representations. |
large language model |
|
|
| 16 |
Stop Overvaluing Multi-Agent Debate -- We Must Rethink Evaluation and Embrace Model Heterogeneity |
Re-evaluates multi-agent debate: emphasizes heterogeneous models and calls for better evaluation practices |
large language model chain-of-thought |
|
|
| 17 |
No Need for Explanations: LLMs can implicitly learn from mistakes in-context |
LLMs can learn implicitly from mistakes: mathematical reasoning improves without explicit explanations |
large language model chain-of-thought |
|
|
| 18 |
Salamandra Technical Report |
Releases Salamandra, a multilingual open-source decoder large language model available in three sizes: 2B, 7B, and 40B. |
large language model multimodal |
|
|
| 19 |
Universal Model Routing for Efficient LLM Inference |
UniRoute: proposes a universal model routing method for efficient LLM inference that supports dynamically adding new LLMs |
large language model |
|
|
| 20 |
A New Query Expansion Approach via Agent-Mediated Dialogic Inquiry |
Proposes the AMD framework, which improves query expansion through multi-agent dialogic inquiry |
large language model |
|
|
| 21 |
From Haystack to Needle: Label Space Reduction for Zero-shot Classification |
Proposes a label space reduction method to improve zero-shot classification performance |
large language model |
|
|
| 22 |
Measuring Diversity in Synthetic Datasets |
Proposes DCScore, which evaluates the diversity of synthetic datasets from a classification perspective, improving model robustness. |
large language model |
✅ |
|
| 23 |
HuDEx: Integrating Hallucination Detection and Explainability for Enhancing the Reliability of LLM responses |
Proposes the HuDEx model, which combines hallucination detection and explainability to improve the reliability of LLM responses |
large language model |
|
|
| 24 |
Break the Checkbox: Challenging Closed-Style Evaluations of Cultural Alignment in LLMs |
Challenges closed-style evaluations of cultural value alignment in LLMs and proposes a more open and flexible evaluation framework |
large language model |
|
|
| 25 |
AI for Scaling Legal Reform: Mapping and Redacting Racial Covenants in Santa Clara County |
Uses AI to accelerate legal reform: identifying and redacting racial covenants in Santa Clara County |
large language model |
|
|
| 26 |
IHEval: Evaluating Language Models on Following the Instruction Hierarchy |
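IHEval: proposes an instruction hierarchy evaluation benchmark that measures how language models behave under conflicting instructions |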
instruction following |
|
|
| 27 |
Examining Multilingual Embedding Models Cross-Lingually Through LLM-Generated Adversarial Examples |
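Proposes the CLSD evaluation method, which uses LLM-generated adversarial examples to evaluate cross-lingual semantic search models more effectively. |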
large language model |
✅ |
|
| 28 |
Faithful, Unfaithful or Ambiguous? Multi-Agent Debate with Initial Stance for Summary Evaluation |
Proposes a multi-agent debate method for evaluating summary faithfulness and introduces an ambiguity dimension. |
large language model |
|
|
| 29 |
Explanation based In-Context Demonstrations Retrieval for Multilingual Grammatical Error Correction |
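Proposes retrieving in-context demonstrations based on grammatical error explanations to improve multilingual grammatical error correction |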
large language model |
|
|
| 30 |
IssueBench: Millions of Realistic Prompts for Measuring Issue Bias in LLM Writing Assistance |
IssueBench: builds a large-scale dataset of realistic prompts for evaluating issue bias in LLM writing assistance |
large language model |
|
|
| 31 |
MultiProSE: A Multi-label Arabic Dataset for Propaganda, Sentiment, and Emotion Detection |
Builds MultiProSE, a multi-label Arabic dataset for propaganda, sentiment, and emotion detection |
large language model |
|
|
| 32 |
ParetoRAG: Leveraging Sentence-Context Attention for Robust and Efficient Retrieval-Augmented Generation |
ParetoRAG: uses sentence-context attention to improve the robustness and efficiency of retrieval-augmented generation systems |
large language model |
|
|