| 1 |
Training Large Language Models to Reason in a Continuous Latent Space |
Proposes Coconut, which performs LLM reasoning in a continuous latent space to improve logical reasoning. |
large language model chain-of-thought |
|
|
| 2 |
OmniEvalKit: A Modular, Lightweight Toolbox for Evaluating Large Language Model and its Omni-Extensions |
OmniEvalKit: a modular, lightweight toolbox for evaluating large language models and their omni-extensions. |
large language model multimodal |
|
|
| 3 |
Anchoring Bias in Large Language Models: An Experimental Study |
Reveals anchoring bias in large language models and shows the limitations of existing mitigation strategies. |
large language model chain-of-thought |
|
|
| 4 |
PediaBench: A Comprehensive Chinese Pediatric Dataset for Benchmarking Large Language Models |
Builds PediaBench, a Chinese pediatric dataset for evaluating large language models on pediatric question answering. |
large language model instruction following |
✅ |
|
| 5 |
Optimizing Multi-Task Learning for Enhanced Performance in Large Language Models |
Proposes a GPT-4-based multi-task learning framework that improves text classification and summary generation. |
large language model multimodal |
|
|
| 6 |
Assessing the Impact of Conspiracy Theories Using Large Language Models |
Uses large language models to assess the impact of conspiracy theories, revealing model biases and evaluation strategies. |
large language model |
|
|
| 7 |
Towards Controllable Speech Synthesis in the Era of Large Language Models: A Systematic Survey |
A systematic survey of controllable speech synthesis based on large language models. |
large language model |
✅ |
|
| 8 |
Data Quality Enhancement on the Basis of Diversity with Large Language Models for Text Classification: Uncovered, Difficult, and Noisy |
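Proposes an LLM-based, diversity-driven data quality enhancement method that improves text classification performance and speeds up training. |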
large language model |
|
|
| 9 |
The Rosetta Paradox: Domain-Specific Performance Inversions in Large Language Models |
Reveals the "Rosetta Paradox" in large language models' domain knowledge. |
large language model |
|
|
| 10 |
LLM-BIP: Structured Pruning for Large Language Models with Block-Wise Forward Importance Propagation |
LLM-BIP: structured pruning of large language models based on block-wise forward importance propagation. |
large language model |
|
|
| 11 |
Political-LLM: Large Language Models in Political Science |
Political-LLM: builds a comprehensive framework for LLM applications in political science to advance the field. |
large language model |
|
|
| 12 |
A Comparative Study of Learning Paradigms in Large Language Models via Intrinsic Dimension |
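Uses intrinsic dimension to compare the representations produced by supervised fine-tuning and in-context learning in large language models. |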
large language model |
|
|
| 13 |
Leveraging Audio and Text Modalities in Mental Health: A Study of LLMs Performance |
Uses LLMs to combine audio and text modalities for mental health diagnosis, showing potential in depression and PTSD detection. |
large language model multimodal |
|
|
| 14 |
AutoReason: Automatic Few-Shot Reasoning Decomposition |
Proposes AutoReason, which automatically generates few-shot reasoning decompositions to improve LLM reasoning on question answering tasks. |
large language model chain-of-thought |
✅ |
|
| 15 |
Asynchronous LLM Function Calling |
Proposes AsyncLM to address the synchronous limitation of LLM function calling. |
large language model |
|
|
| 16 |
SuperMerge: An Approach For Gradient-Based Model Merging |
Proposes SuperMerge, a gradient-based model merging method for updating models in task-incremental settings. |
large language model |
|
|
| 17 |
JAPAGEN: Efficient Few/Zero-shot Learning via Japanese Training Dataset Generation with LLM |
JAPAGEN: uses an LLM to generate Japanese training data for efficient few-shot/zero-shot learning. |
large language model |
|
|
| 18 |
Efficient VoIP Communications through LLM-based Real-Time Speech Reconstruction and Call Prioritization for Emergency Services |
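Proposes LLM-based real-time speech reconstruction and call prioritization to improve VoIP communication efficiency for emergency services. |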
large language model |
|
|
| 19 |
Frontier AI systems have surpassed the self-replicating red line |
The Llama3-70B-Instruct and Qwen2-72B-Instruct models have crossed the self-replication red line. |
large language model |
|
|
| 20 |
Small Languages, Big Models: A Study of Continual Training on Languages of Norway |
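Proposes a three-stage continual training method that improves large-model performance and efficiency on low-resource languages such as Norwegian. |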
large language model |
|
|
| 21 |
Evaluating LLM-based Approaches to Legal Citation Prediction: Domain-specific Pre-training, Fine-tuning, or RAG? A Benchmark and an Australian Law Case Study |
Proposes the AusLaw Citation Benchmark to evaluate LLMs on legal citation prediction, exploring domain-specific pre-training, fine-tuning, and RAG. |
large language model |
|
|
| 22 |
Generative Adversarial Reviews: When LLMs Become the Critic |
Proposes GAR, which uses LLM-driven agents to simulate peer review, improving the efficiency and fairness of research feedback. |
large language model |
|
|