| 1 |
Ensemble Learning for Large Language Models in Text and Code Generation: A Survey |
Survey: ensemble learning for improving large language model performance in text and code generation. |
large language model multimodal |
|
|
| 2 |
Compositional Subspace Representation Fine-tuning for Adaptive Large Language Models |
Proposes CS-ReFT, compositional subspace representation fine-tuning that adapts large language models while resolving skill conflicts in multi-task learning. |
large language model instruction following |
|
|
| 3 |
Cognitive-Mental-LLM: Evaluating Reasoning in Large Language Models for Mental Health Prediction via Online Text |
Uses chain-of-thought LLMs to improve reasoning for mental health prediction from online text. |
large language model chain-of-thought |
|
|
| 4 |
DynaCode: A Dynamic Complexity-Aware Code Benchmark for Evaluating Large Language Models in Code Generation |
DynaCode: a dynamic, complexity-aware code benchmark for evaluating LLMs on code generation. |
large language model |
✅ |
|
| 5 |
DarkBench: Benchmarking Dark Patterns in Large Language Models |
DarkBench: a comprehensive benchmark for dark patterns in large language models. |
large language model |
|
|
| 6 |
Word-level Annotation of GDPR Transparency Compliance in Privacy Policies using Large Language Models |
Proposes a modular LLM-based pipeline for word-level annotation of GDPR transparency compliance in privacy policies. |
large language model |
|
|
| 7 |
SCE: Scalable Consistency Ensembles Make Blackbox Large Language Model Generation More Reliable |
Proposes the Scalable Consistency Ensembles (SCE) framework to make blackbox LLM generation more reliable. |
large language model |
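The consistency-ensemble idea above can be sketched as a majority vote over samples drawn from several blackbox generators. A minimal illustration, assuming the paper's setup only loosely: the function name `consistency_ensemble` and the toy `model_*` callables are hypothetical stand-ins, not the paper's API.

```python
from collections import Counter

def consistency_ensemble(generate_fns, prompt, n_samples=1):
    """Sample answers from several blackbox generators and return the
    majority answer together with its agreement rate (a sketch, not the
    exact SCE algorithm)."""
    answers = [gen(prompt) for gen in generate_fns for _ in range(n_samples)]
    answer, count = Counter(answers).most_common(1)[0]
    return answer, count / len(answers)

# Toy stand-ins for blackbox model APIs (hypothetical, for illustration).
model_a = lambda p: "42"
model_b = lambda p: "42"
model_c = lambda p: "17"

ans, agreement = consistency_ensemble([model_a, model_b, model_c], "What is 6*7?")
```

The agreement rate doubles as a crude reliability signal: low agreement flags prompts where the ensemble's answer should be trusted less.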
|
|
| 8 |
MMLU-ProX: A Multilingual Benchmark for Advanced Large Language Model Evaluation |
Proposes MMLU-ProX, a multilingual benchmark for comprehensively evaluating the cross-lingual reasoning of large language models. |
large language model |
|
|
| 9 |
Source-primed Multi-turn Conversation Helps Large Language Models Translate Documents |
Proposes a source-primed multi-turn conversation approach that improves LLM document translation quality. |
large language model |
|
|
| 10 |
New Trends for Modern Machine Translation with Large Reasoning Models |
Uses large reasoning models to recast machine translation as a dynamic reasoning task, improving translation quality. |
multimodal chain-of-thought |
|
|
| 11 |
"Well, Keep Thinking": Enhancing LLM Reasoning with Adaptive Injection Decoding |
Proposes adaptive injection decoding, enhancing LLM reasoning without explicit prompting. |
large language model chain-of-thought |
|
|
| 12 |
Why Prompt Design Matters and Works: A Complexity Analysis of Prompt Search Space in LLMs |
Proposes a theoretical framework for prompt design grounded in a complexity analysis of the prompt search space, improving LLM reasoning. |
large language model chain-of-thought |
|
|
| 13 |
Information Density Principle for MLLM Benchmarks |
Proposes the information density principle for evaluating and improving multimodal LLM benchmarks. |
large language model multimodal |
✅ |
|
| 14 |
Scalable Evaluation of Online Facilitation Strategies via Synthetic Simulation of Discussions |
Proposes an LLM-based framework for synthetic simulation of online discussions, enabling scalable evaluation of facilitation strategies. |
large language model |
|
|
| 15 |
Gumiho: A Hybrid Architecture to Prioritize Early Tokens in Speculative Decoding |
Gumiho: a hybrid architecture that accelerates LLM inference by prioritizing early tokens in speculative decoding. |
large language model |
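The speculative-decoding loop that Gumiho builds on can be shown in a greedy toy form. This is an illustration of the generic draft-then-verify scheme only, not Gumiho's hybrid architecture; `draft_next`, `target_next`, and the integer-token models are assumptions for the sketch.

```python
def speculative_decode(draft_next, target_next, prefix, k=4, max_len=12):
    """Greedy sketch of speculative decoding: a cheap draft model proposes
    k tokens, the large target model verifies them, the longest verified
    prefix is kept, and the target contributes one token of its own."""
    out = list(prefix)
    while len(out) < max_len:
        # 1. Draft model proposes k tokens autoregressively.
        ctx, proposal = list(out), []
        for _ in range(k):
            tok = draft_next(ctx)
            proposal.append(tok)
            ctx.append(tok)
        # 2. Target model checks each proposed token; keep until mismatch.
        ctx = list(out)
        for tok in proposal:
            if len(ctx) >= max_len or target_next(ctx) != tok:
                break
            ctx.append(tok)
        # 3. Target emits one token itself (the correction on a mismatch,
        #    or a bonus token when every proposal was accepted).
        if len(ctx) < max_len:
            ctx.append(target_next(ctx))
        out = ctx
    return out

# Toy models over integer tokens: both predict "last token + 1", so every
# draft proposal is accepted and decoding needs few target-model calls.
next_int = lambda ctx: ctx[-1] + 1
tokens = speculative_decode(next_int, next_int, [0], k=3, max_len=8)
```

Because verification is parallelizable in a real transformer, each accepted draft token saves one sequential pass through the large model.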
|
|
| 16 |
OASST-ETC Dataset: Alignment Signals from Eye-tracking Analysis of LLM Responses |
OASST-ETC: a dataset of alignment signals from eye-tracking analysis of LLM responses. |
large language model |
|
|
| 17 |
NeurIPS 2023 LLM Efficiency Fine-tuning Competition |
The NeurIPS 2023 LLM fine-tuning competition revealed overfitting to benchmark datasets and underscored the importance of data cleaning. |
large language model |
|
|
| 18 |
G-Boost: Boosting Private SLMs with General LLMs |
G-Boost: a collaborative inference framework that boosts private SLMs with general LLMs. |
large language model |
|
|
| 19 |
Retrieval-Augmented Generation with Hierarchical Knowledge |
HiRAG: augments retrieval-augmented generation with hierarchical knowledge, improving performance on domain-specific tasks. |
large language model |
|
|
| 20 |
ZSMerge: Zero-Shot KV Cache Compression for Memory-Efficient Long-Context LLMs |
ZSMerge: zero-shot KV cache compression for memory-efficient long-context LLMs. |
large language model |
✅ |
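KV-cache compression of this kind can be sketched as keeping the most important cache slots and merging the rest. The sketch below is a hypothetical keep-and-merge scheme under an importance-score assumption, not ZSMerge's actual method; `compress_kv` and its score input are illustrative.

```python
import numpy as np

def compress_kv(keys, values, scores, budget):
    """Keep the budget-1 highest-scoring KV slots and merge every other
    slot into one averaged slot, capping the cache at `budget` entries
    (a hypothetical sketch, not ZSMerge's exact algorithm)."""
    if len(keys) <= budget:
        return keys, values
    order = np.argsort(scores)[::-1]              # slots by importance, descending
    keep, merge = order[:budget - 1], order[budget - 1:]
    merged_k = keys[merge].mean(axis=0, keepdims=True)
    merged_v = values[merge].mean(axis=0, keepdims=True)
    return (np.concatenate([keys[keep], merged_k]),
            np.concatenate([values[keep], merged_v]))

# Five cached slots with 2-dim heads, compressed to a budget of 3 slots.
k = np.arange(10, dtype=float).reshape(5, 2)
v = np.arange(10, dtype=float).reshape(5, 2)
scores = np.array([0.1, 0.9, 0.2, 0.8, 0.05])
ck, cv = compress_kv(k, v, scores, budget=3)
```

Merging rather than evicting keeps an averaged trace of the discarded context, which is the intuition behind merge-based cache compression; being training-free, it applies zero-shot to any model.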
|
| 21 |
Who Relies More on World Knowledge and Bias for Syntactic Ambiguity Resolution: Humans or LLMs? |
Finds that LLMs rely more heavily than humans on world knowledge and bias for syntactic ambiguity resolution, lacking human flexibility. |
large language model |
|
|
| 22 |
Thinking Machines: A Survey of LLM based Reasoning Strategies |
Survey of LLM-based reasoning strategies, bridging the gap between linguistic competence and reasoning ability. |
large language model |
|
|
| 23 |
Probing LLMs for Multilingual Discourse Generalization Through a Unified Label Set |
Proposes a unified label set to probe LLMs' multilingual discourse generalization, revealing the importance of intermediate layers. |
large language model |
|
|
| 24 |
Adaptive Inner Speech-Text Alignment for LLM-based Speech Translation |
Proposes an adaptive inner speech-text alignment method that improves LLM-based speech translation. |
large language model |
|
|