| 1 |
Diffusion Policies creating a Trust Region for Offline Reinforcement Learning |
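Proposes DTQL, which accelerates offline reinforcement learning through a diffusion trust region, balancing performance and efficiency |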
reinforcement learning offline RL offline reinforcement learning |
✅ |
|
| 2 |
Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning |
Proposes an aquatic navigation benchmark environment to evaluate and improve deep reinforcement learning algorithms |
reinforcement learning deep reinforcement learning DRL |
|
|
| 3 |
Learning from Random Demonstrations: Offline Reinforcement Learning with Importance-Sampled Diffusion Models |
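Proposes an offline reinforcement learning method based on importance-sampled diffusion models, improving policy learning from random data |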
reinforcement learning offline reinforcement learning world model |
|
|
| 4 |
Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning |
Proposes Adaptive Advantage-Guided Policy Regularization (A2PR) to address over-conservatism in offline reinforcement learning |
reinforcement learning offline reinforcement learning |
✅ |
|
| 5 |
Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation |
FoldFlow-2: sequence-augmented SE(3) flow matching for conditional protein backbone generation |
flow matching large language model |
|
|
| 6 |
From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems |
Analyzes LLM-driven autonomous systems from a theoretical perspective and proposes improvement strategies |
reinforcement learning imitation learning world model |
|
|
| 7 |
Algorithmic Fairness in Performative Policy Learning: Escaping the Impossibility of Group Fairness |
Exploits performativity in policy learning to resolve the group fairness dilemma |
policy learning predictive model |
|
|
| 8 |
Preference Alignment with Flow Matching |
Proposes Preference Flow Matching for efficient preference alignment of pretrained models |
reinforcement learning flow matching |
✅ |
|
| 9 |
Transformers and Slot Encoding for Sample Efficient Physical World Modelling |
Proposes a world modelling method based on Transformers and slot encoding that improves sample efficiency |
world model |
✅ |
|
| 10 |
SleeperNets: Universal Backdoor Poisoning Attacks Against Reinforcement Learning Agents |
Proposes SleeperNets, a universal backdoor poisoning attack against reinforcement learning agents |
reinforcement learning |
|
|
| 11 |
FCOM: A Federated Collaborative Online Monitoring Framework via Representation Learning |
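Proposes a federated collaborative online monitoring framework based on representation learning, addressing resource allocation under heterogeneous data |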
representation learning |
|
|
| 12 |
Hybrid Reinforcement Learning Framework for Mixed-Variable Problems |
Proposes a hybrid reinforcement learning framework for mixed-variable optimization problems |
reinforcement learning |
|
|
| 13 |
Length independent generalization bounds for deep SSM architectures via Rademacher contraction and stability constraints |
Proposes length-independent PAC bounds for optimizing deep state space model architectures |
SSM |
|
|
| 14 |
Randomized Exploration for Reinforcement Learning with Multinomial Logistic Function Approximation |
Proposes randomized exploration algorithms for reinforcement learning with multinomial logistic function approximation |
reinforcement learning |
|
|
| 15 |
Joint Selective State Space Model and Detrending for Robust Time Series Anomaly Detection |
Proposes a multi-stage time series anomaly detection method combining a selective state space model with detrending |
state space model |
|
|
| 16 |
MetaCURL: Non-stationary Concave Utility Reinforcement Learning |
Proposes the MetaCURL algorithm for concave utility reinforcement learning in non-stationary MDPs |
reinforcement learning |
|
|
| 17 |
Q-learning as a monotone scheme |
Interprets Q-learning as a monotone scheme and analyzes the effect of function approximation on stability |
reinforcement learning deep reinforcement learning |
|
|