| 1 |
AQUALLM: Audio Question Answering Data Generation Using Large Language Models |
AQUALLM: uses large language models to generate audio question answering data, improving model generalization. |
large language model |
✅ |
|
| 2 |
LLM4Causal: Democratized Causal Tools for Everyone via Large Language Model |
LLM4Causal: provides accessible causal inference tools for everyone via large language models |
large language model |
|
|
| 3 |
AI Content Self-Detection for Transformer-based Large Language Models |
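Proposes an AI content self-detection method that evaluates how well Transformer-based large language models recognize their own generated content |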
large language model |
|
|
| 4 |
Spike No More: Stabilizing the Pre-training of Large Language Models |
Stabilizes large language model pre-training by controlling gradient norms to avoid loss spikes |
large language model |
|
|
| 5 |
Evaluating the Performance of Large Language Models for Spanish Language in Undergraduate Admissions Exams |
Evaluates the performance of large language models on Spanish-language undergraduate admissions exams |
large language model |
|
|
| 6 |
MR-GSM8K: A Meta-Reasoning Benchmark for Large Language Model Evaluation |
Proposes the MR-GSM8K benchmark for evaluating the meta-reasoning ability of large language models |
large language model |
|
|
| 7 |
MathPile: A Billion-Token-Scale Pretraining Corpus for Math |
Proposes MathPile, a billion-token-scale math pretraining corpus, to improve mathematical reasoning. |
foundation model |
|
|
| 8 |
Experiential Co-Learning of Software-Developing Agents |
Proposes the Experiential Co-Learning framework to improve the collaborative efficiency of LLM agents in software development. |
large language model |
✅ |
|
| 9 |
Virtual Scientific Companion for Synchrotron Beamlines: A Prototype |
Proposes a prototype virtual scientific companion for synchrotron beamlines that controls experiments through natural language. |
large language model |
|
|
| 10 |
BBScore: A Brownian Bridge Based Metric for Assessing Text Coherence |
Proposes BBScore to address the problem of assessing text coherence |
large language model |
|
|
| 11 |
How Far Are LLMs from Believable AI? A Benchmark for Evaluating the Believability of Human Behavior Simulation |
Proposes SimulateBench to evaluate the believability of LLMs when simulating human behavior |
large language model |
|
|
| 12 |
Structured Packing in LLM Training Improves Long Context Utilization |
Proposes SPLiCe, a structured data packing method that improves long-context utilization in LLMs |
large language model |
|
|
| 13 |
Length Extrapolation of Transformers: A Survey from the Perspective of Positional Encoding |
Surveys length extrapolation methods for Transformers, focusing on approaches from the positional encoding perspective. |
large language model |
|
|