| # | Title | Summary | Tags | ✅ |
|---|---|---|---|---|
| 1 | An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models | Studies parameter-efficient fine-tuning methods for multimodal LLMs; adapters perform best. | large language model, multimodal | ✅ |
| 2 | What do MLLMs hear? Examining reasoning with text and sound components in Multimodal Large Language Models | Examines the reasoning ability of audio multimodal LLMs, revealing the limits of their text-based reasoning in audio classification. | large language model, multimodal | |
| 3 | AICoderEval: Improving AI Domain Code Generation of Large Language Models | AICoderEval: a dataset and framework for improving LLM code generation in the AI domain. | large language model, multimodal | ✅ |
| 4 | FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models | Proposes FedLLM-Bench, a realistic benchmark platform for federated learning of LLMs. | large language model | ✅ |
| 5 | Transforming Dental Diagnostics with Artificial Intelligence: Advanced Integration of ChatGPT and Large Language Models for Patient Care | Uses ChatGPT and LLMs to transform dental diagnostics and improve patient care. | large language model | |
| 6 | SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for Understanding Spatial Reasoning Capability of Large Language Models | Proposes the SpaRC framework and SpaRP dataset to evaluate the spatial reasoning capability of LLMs. | large language model | |
| 7 | Revisiting Catastrophic Forgetting in Large Language Model Tuning | Studies catastrophic forgetting in LLM fine-tuning and proposes a mitigation based on flattening the loss landscape. | large language model | |
| 8 | LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model | LawGPT: a Chinese legal knowledge-enhanced LLM designed for Chinese legal applications. | large language model | ✅ |
| 9 | Are Large Language Models More Empathetic than Humans? | Evaluates LLM empathy: LLMs surpass humans in empathetic responding. | large language model | |
| 10 | TCMD: A Traditional Chinese Medicine QA Dataset for Evaluating Large Language Models | Builds TCMD, a Traditional Chinese Medicine QA dataset for evaluating LLMs. | large language model | |
| 11 | CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models | Proposes CRiskEval, a Chinese risk-evaluation benchmark dataset for assessing the latent risk tendencies of LLMs. | large language model | ✅ |
| 12 | Mixture-of-Agents Enhances Large Language Model Capabilities | Proposes Mixture-of-Agents (MoA), improving LLM performance across multiple tasks. | large language model | |
| 13 | Large Language Model-guided Document Selection | Proposes an LLM-guided document selection method to improve pre-training efficiency. | large language model | |
| 14 | Low-Resource Cross-Lingual Summarization through Few-Shot Learning with Large Language Models | Explores few-shot learning for low-resource cross-lingual summarization, improving LLM performance. | large language model | |
| 15 | LLMs Are Not Intelligent Thinkers: Introducing Mathematical Topic Tree Benchmark for Comprehensive Evaluation of LLMs | Proposes the Mathematical Topic Tree (MaTT) benchmark for comprehensive evaluation of LLM mathematical reasoning. | large language model, chain-of-thought | |
| 16 | Through the Thicket: A Study of Number-Oriented LLMs derived from Random Forest Models | Proposes an LLM training method based on knowledge transfer from random forests, improving handling of numerical data. | large language model, chain-of-thought | |
| 17 | GameBench: Evaluating Strategic Reasoning Abilities of LLM Agents | GameBench: a cross-domain benchmark for evaluating the strategic reasoning abilities of LLM agents. | large language model, chain-of-thought | |
| 18 | BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense | Team BAMO combines fine-tuning, CoT, and ReConcile to tackle the SemEval-2024 BRAINTEASER commonsense-reasoning task. | large language model, chain-of-thought | |
| 19 | Uncertainty Aware Learning for Language Model Alignment | Proposes uncertainty-aware learning (UAL) to improve LLM alignment across task scenarios. | large language model, foundation model | |
| 20 | DALD: Improving Logits-based Detector without Logits from Black-box LLMs | DALD: improves logits-based detection of black-box LLM text via distribution alignment, without source-LLM logits. | large language model | |
| 21 | Multi-Head RAG: Solving Multi-Aspect Problems with LLMs | Proposes Multi-Head RAG, using multi-head attention for retrieval-augmented generation on multi-aspect problems. | large language model | |
| 22 | Scenarios and Approaches for Situated Natural Language Explanations | Proposes the SBE dataset of situated natural language explanations to evaluate LLM explanation ability across user scenarios. | large language model | |
| 23 | Adversarial Tuning: Defending Against Jailbreak Attacks for LLMs | Proposes an adversarial tuning framework to strengthen LLM defenses against unknown jailbreak attacks. | large language model | |
| 24 | Quantifying Geospatial in the Common Crawl Corpus | Quantifies geospatial information in the Common Crawl corpus, laying groundwork for research on LLM spatial reasoning. | large language model | |
| 25 | SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals | SelfGoal: improves the ability of language agents to achieve high-level goals in complex tasks. | large language model | |
| 26 | CHIQ: Contextual History Enhancement for Improving Query Rewriting in Conversational Search | Proposes CHIQ, using open-source LLMs to enhance query rewriting in conversational search, especially for ambiguous queries. | large language model | ✅ |
| 27 | MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter | MEFT: memory-efficient LLM fine-tuning via sparse adapters. | large language model | ✅ |
| 28 | DiNeR: a Large Realistic Dataset for Evaluating Compositional Generalization | Proposes the DiNeR dataset for evaluating compositional generalization, addressing limitations of existing datasets. | large language model | ✅ |