| 1 |
Reinforcement Learning Gradients as Vitamin for Online Finetuning Decision Transformers |
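Uses reinforcement learning gradients to improve online finetuning of Decision Transformers, boosting the performance of models pretrained on low-reward data. |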
reinforcement learning TD3 offline reinforcement learning |
|
|
| 2 |
SambaMixer: State of Health Prediction of Li-ion Batteries using Mamba State Space Models |
Proposes SambaMixer to predict the state of health of Li-ion batteries. |
Mamba SSM state space model |
|
|
| 3 |
An Information Criterion for Controlled Disentanglement of Multimodal Data |
Proposes DisentangledSSL for controlled disentangled representation learning on multimodal data. |
representation learning multimodal |
✅ |
|
| 4 |
Rethinking Inverse Reinforcement Learning: from Data Alignment to Task Alignment |
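Proposes a task-alignment-based inverse reinforcement learning framework that improves performance in complex environments and in transfer learning. |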
reinforcement learning imitation learning inverse reinforcement learning |
|
|
| 5 |
RA-PbRL: Provably Efficient Risk-Aware Preference-Based Reinforcement Learning |
Proposes RA-PbRL, a risk-aware preference-based reinforcement learning algorithm, to address AI safety problems. |
reinforcement learning RLHF large language model |
✅ |
|
| 6 |
EARL-BO: Reinforcement Learning for Multi-Step Lookahead, High-Dimensional Bayesian Optimization |
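Proposes EARL-BO, which uses reinforcement learning to solve the multi-step lookahead problem in high-dimensional Bayesian optimization. |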
reinforcement learning policy learning |
|
|
| 7 |
Compositional Automata Embeddings for Goal-Conditioned Reinforcement Learning |
Proposes a goal-conditioned reinforcement learning method based on compositional automata embeddings. |
reinforcement learning |
|
|
| 8 |
Progressive Safeguards for Safe and Model-Agnostic Reinforcement Learning |
Proposes a model-agnostic meta-learning framework for safe reinforcement learning that improves safety through progressive safeguards. |
reinforcement learning |
|
|
| 9 |
Local Linearity: the Key for No-regret Reinforcement Learning in Continuous MDPs |
Proposes a local linearization approach to no-regret reinforcement learning in continuous MDPs. |
reinforcement learning |
|
|
| 10 |
Noise as a Double-Edged Sword: Reinforcement Learning Exploits Randomized Defenses in Neural Networks |
Studies how noise affects reinforcement learning attacks and proposes more fine-grained defense strategies. |
reinforcement learning |
|
|
| 11 |
Enhancing Chess Reinforcement Learning with Graph Representation |
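Proposes a graph-representation-based reinforcement learning method that improves the generalization and training efficiency of chess AI. |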
reinforcement learning |
✅ |
|
| 12 |
A Non-Monolithic Policy Approach of Offline-to-Online Reinforcement Learning |
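Proposes a non-monolithic policy method for offline-to-online reinforcement learning that improves the efficiency of online policy learning. |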
reinforcement learning |
|
|
| 13 |
VecCity: A Taxonomy-guided Library for Map Entity Representation Learning |
VecCity: a taxonomy-guided library for map entity representation learning that aims to unify evaluation and facilitate model reuse. |
representation learning |
✅ |
|
| 14 |
Adaptive Alignment: Dynamic Preference Adjustments via Multi-Objective Reinforcement Learning for Pluralistic AI |
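Proposes an adaptive alignment framework based on multi-objective reinforcement learning that dynamically adjusts AI to diverse user preferences. |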
reinforcement learning |
|
|
| 15 |
How Do Flow Matching Models Memorize and Generalize in Sample Data Subspaces? |
Examines how flow matching models memorize and generalize in sample data subspaces. |
flow matching |
|
|
| 16 |
Disentangling Disentangled Representations: Towards Improved Latent Units via Diffusion Models |
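Proposes a diffusion-model-based disentangled representation learning method that improves the interpretability and independence of latent units. |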
DRL representation learning |
|
|
| 17 |
Dynamical similarity analysis can identify compositional dynamics developing in RNNs |
Shows that dynamical similarity analysis (DSA) can identify the learning of compositional dynamics in RNNs, outperforming existing methods. |
Mamba state space model |
|
|
| 18 |
Maximum Entropy Hindsight Experience Replay |
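Proposes the Maximum Entropy Hindsight Experience Replay (MaxEnt-HER) algorithm, improving the performance of PPO in goal-oriented reinforcement learning. |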
reinforcement learning PPO |
|
|
| 19 |
Deterministic Exploration via Stationary Bellman Error Maximization |
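Proposes a deterministic exploration method based on stationary Bellman error maximization that improves exploration efficiency in reinforcement learning. |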
reinforcement learning policy learning |
|
|
| 20 |
CALE: Continuous Arcade Learning Environment |
Proposes CALE, an extension of the ALE that supports continuous action control in the arcade learning environment. |
PPO SAC |
✅ |
|