| 1 |
An Automated Reinforcement Learning Reward Design Framework with Large Language Model for Cooperative Platoon Coordination |
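Proposes an LLM-based framework for automatically designing reinforcement learning reward functions, applied to cooperative platoon coordination. |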
reinforcement learning reward design large language model |
|
|
| 2 |
Modular Machine Learning: An Indispensable Path towards New-Generation Large Language Models |
Proposes a modular machine learning framework to improve the interpretability and adaptability of large language models. |
representation learning large language model |
|
|
| 3 |
Contextures: The Mechanism of Representation Learning |
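Proposes the Contexture theory, which unifies representation learning under one framework and explains the mechanism of pretraining. |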
representation learning foundation model |
|
|
| 4 |
Interactive Double Deep Q-network: Integrating Human Interventions and Evaluative Predictions in Reinforcement Learning of Autonomous Driving |
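Proposes the Interactive Double Deep Q-network (iDDQN), which integrates human interventions to improve reinforcement learning performance in autonomous driving. |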
reinforcement learning DRL |
|
|
| 5 |
Representation Learning on a Random Lattice |
Proposes a representation learning model based on a random lattice to improve the interpretability of deep neural networks. |
representation learning |
|
|
| 6 |
Accurate and Diverse LLM Mathematical Reasoning via Automated PRM-Guided GFlowNets |
Proposes automated PRM-guided GFlowNets to improve the accuracy and diversity of LLM mathematical reasoning. |
reinforcement learning large language model |
|
|
| 7 |
Rulebook: bringing co-routines to reinforcement learning environments |
Proposes Rulebook, a coroutine-based domain-specific language that simplifies building reinforcement learning environments. |
reinforcement learning |
|
|
| 8 |
Soft-Label Caching and Sharpening for Communication-Efficient Federated Distillation |
SCARLET: a soft-label caching and sharpening framework for communication-efficient federated distillation. |
distillation |
✅ |
|
| 9 |
Quantifying Memory Utilization with Effective State-Size |
Proposes effective state-size (ESS) to quantify memory utilization in sequence models and uses it for model optimization. |
distillation large language model |
|
|