| 1 |
Instruction-Following Evaluation of Large Vision-Language Models |
Shows that large vision-language models lose instruction-following ability after fine-tuning, and proposes a remedy. |
large language model instruction following |
|
|
| 2 |
Scoring, Reasoning, and Selecting the Best! Ensembling Large Language Models via a Peer-Review Process |
Proposes LLM-PeerReview, which ensembles large language models through a peer-review process to improve generation quality. |
large language model |
|
|
| 3 |
An Empirical Analysis of Fine-Tuning Large Language Models on Bioinformatics Literature: PRSGPT and BioStarsGPT |
Proposes a fine-tuning pipeline for LLMs in the bioinformatics domain and builds PRSGPT and BioStarsGPT. |
large language model |
|
|
| 4 |
ClinDEF: A Dynamic Evaluation Framework for Large Language Models in Clinical Reasoning |
ClinDEF: a dynamic evaluation framework for assessing the clinical reasoning ability of large language models. |
large language model |
|
|
| 5 |
A Stepwise-Enhanced Reasoning Framework for Large Language Models Based on External Subgraph Generation |
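Proposes SGR, a stepwise-enhanced reasoning framework based on external subgraph generation, which improves large language model performance on complex reasoning tasks. |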
large language model |
|
|
| 6 |
Reservoir Computing inspired Matrix Multiplication-free Language Model |
Proposes a matrix-multiplication-free language model based on reservoir computing, reducing training and inference costs. |
large language model |
|
|
| 7 |
Entropy-Aware Speculative Decoding Toward Improved LLM Reasoning |
Proposes entropy-aware speculative decoding (EASD), which improves LLM reasoning and surpasses the target model's own performance. |
large language model |
|
|
| 8 |
AI Meets Brain: Memory Systems from Cognitive Neuroscience to Autonomous Agents |
Survey: memory system designs in AI agents that draw on cognitive neuroscience. |
multimodal |
|
|
| 9 |
Eliciting Behaviors in Multi-Turn Conversations |
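Proposes a framework for eliciting behaviors in multi-turn conversations, improving the efficiency of LLM test-case generation. |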
large language model |
|
|
| 10 |
Anka: A Domain-Specific Language for Reliable LLM Code Generation |
Proposes Anka, a domain-specific language that improves the reliability of LLM code generation on complex data transformation tasks. |
large language model |
|
|
| 11 |
Multilingual Hidden Prompt Injection Attacks on LLM-Based Academic Reviewing |
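Multilingual hidden prompt injection attacks affect LLM-based academic reviewing, with vulnerability differing markedly across languages. |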
large language model |
|
|
| 12 |
Discovering Multi-Scale Semantic Structure in Text Corpora Using Density-Based Trees and LLM Embeddings |
Proposes a method for discovering multi-scale semantic structure in text using density-based trees and LLM embeddings. |
large language model |
|
|
| 13 |
Close the Loop: Synthesizing Infinite Tool-Use Data via Multi-Agent Role-Playing |
InfTool: synthesizes unlimited tool-use data via multi-agent role-playing to improve LLM tool-calling ability. |
large language model |
|
|
| 14 |
Marriage Discourse on Chinese Social Media: An LLM-assisted Analysis |
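Uses LLMs to analyze marriage discussions on Chinese social media, revealing how emotional and moral factors influence willingness to marry. |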
large language model |
|
|
| 15 |
Single LLM Debate, MoLaCE: Mixture of Latent Concept Experts Against Confirmation Bias |
Proposes MoLaCE, which addresses confirmation bias in LLMs through a mixture of latent concept experts. |
large language model |
|
|
| 16 |
Entropy-Guided Token Dropout: Training Autoregressive Language Models with Limited Domain Data |
Proposes EntroDrop, which uses entropy-guided token dropout to address overfitting of autoregressive language models when domain data is limited. |
large language model |
|
|
| 17 |
AI4Reading: Chinese Audiobook Interpretation System Based on Multi-Agent Collaboration |
Proposes AI4Reading, a Chinese audiobook interpretation system based on multi-agent collaboration. |
large language model |
|
|