| 1 |
Symmetric Reinforcement Learning Loss for Robust Learning on Diverse Tasks and Model Scales |
Proposes a symmetric reinforcement learning loss that makes RL more robust across diverse tasks and model scales |
reinforcement learning PPO RLHF |
|
|
| 2 |
FedHPL: Efficient Heterogeneous Federated Learning with Prompt Tuning and Logit Distillation |
FedHPL: an efficient heterogeneous federated learning framework based on prompt tuning and logit distillation |
distillation foundation model |
|
|
| 3 |
Linear Function Approximation as a Computationally Efficient Method to Solve Classical Reinforcement Learning Challenges |
Proposes an NPG method with linear function approximation to speed up solving low-dimensional reinforcement learning problems |
reinforcement learning PPO |
|
|
| 4 |
Partial Models for Building Adaptive Model-Based Reinforcement Learning Agents |
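Proposes an adaptive model-based reinforcement learning method built on partial models, improving adaptation to local changes in the environment |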
reinforcement learning dreamer |
|
|
| 5 |
A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning |
Proposes SADA, a general data augmentation method for visual reinforcement learning that improves training stability and generalization |
reinforcement learning |
✅ |
|
| 6 |
Finding Shared Decodable Concepts and their Negations in the Brain |
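Proposes a brain-activity decoding method based on contrastive learning and clustering that finds shared decodable concepts and their negations in the brain |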
contrastive learning multimodal |
|
|
| 7 |
Spectral regularization for adversarially-robust representation learning |
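Proposes a spectral regularization method that improves the adversarial robustness of representation learning, particularly for self-supervised learning |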
representation learning |
|
|
| 8 |
SMR: State Memory Replay for Long Sequence Modeling |
Proposes State Memory Replay (SMR), a mechanism that addresses the non-stable state problem in SSM long-sequence modeling |
Mamba SSM state space model |
|
|
| 9 |
How Do the Architecture and Optimizer Affect Representation Learning? On the Training Dynamics of Representations in Deep Neural Networks |
Studies how architecture and optimizer affect the training dynamics of representation learning in deep neural networks |
representation learning |
|
|
| 10 |
Opinion-Guided Reinforcement Learning |
Proposes an opinion-guided reinforcement learning method that uses uncertain opinions to improve the agent's learning efficiency |
reinforcement learning |
|
|
| 11 |
Surprise-Adaptive Intrinsic Motivation for Unsupervised Reinforcement Learning |
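Proposes an unsupervised reinforcement learning method with adaptive intrinsic motivation, improving the agent's ability to learn in environments with different entropy levels |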
reinforcement learning |
|
|
| 12 |
Oracle-Efficient Reinforcement Learning for Max Value Ensembles |
Proposes an efficient reinforcement learning algorithm that improves on existing policies through a max-value ensemble policy |
reinforcement learning |
|
|