| 1 |
In-Context Decision Transformer: Reinforcement Learning via Hierarchical Chain-of-Thought |
Proposes the In-context Decision Transformer, which accelerates offline reinforcement learning through a hierarchical chain-of-thought. |
reinforcement learning offline reinforcement learning decision transformer |
|
|
| 2 |
Decision Mamba: Reinforcement Learning via Hybrid Selective Sequence Modeling |
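Proposes Decision Mamba-Hybrid, which combines the strengths of Transformers and Mamba to make long-horizon decision-making in reinforcement learning more efficient. |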
reinforcement learning decision transformer Mamba |
|
|
| 3 |
Generative AI for Deep Reinforcement Learning: Framework, Analysis, and Use Cases |
Proposes a GAI-enhanced DRL framework that improves the sample efficiency and generalization of DRL in complex environments. |
reinforcement learning deep reinforcement learning DRL |
✅ |
|
| 4 |
Mamba State-Space Models Are Lyapunov-Stable Learners |
Mamba state-space models: robust learning backed by Lyapunov stability. |
Mamba SSM large language model |
|
|
| 5 |
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF |
Proposes the XPO algorithm, which achieves efficient exploratory preference optimization in RLHF through implicit Q*-approximation. |
reinforcement learning RLHF DPO |
|
|
| 6 |
Diffusion Actor-Critic: Formulating Constrained Policy Iteration as Diffusion Noise Regression for Offline Reinforcement Learning |
Proposes Diffusion Actor-Critic, which handles the policy constraint problem in offline reinforcement learning through diffusion noise regression. |
reinforcement learning offline reinforcement learning |
✅ |
|
| 7 |
Amortizing intractable inference in diffusion models for vision, language, and control |
Proposes relative trajectory balance to solve posterior inference in diffusion models. |
reinforcement learning deep reinforcement learning offline reinforcement learning |
|
|
| 8 |
LInK: Learning Joint Representations of Design and Performance Spaces through Contrastive Learning for Mechanism Synthesis |
LInK: learns joint representations of the design and performance spaces through contrastive learning, for mechanism synthesis. |
contrastive learning multimodal |
✅ |
|
| 9 |
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality |
Unifies Transformers and SSMs through structured state space duality and proposes efficient algorithms. |
Mamba SSM |
|
|
| 10 |
Bayesian Design Principles for Offline-to-Online Reinforcement Learning |
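Proposes an offline-to-online reinforcement learning method built on Bayesian design principles, resolving the pessimism/optimism dilemma in policy optimization. |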
reinforcement learning offline reinforcement learning |
|
|
| 11 |
Flow matching achieves almost minimax optimal convergence |
Shows that flow matching achieves almost minimax optimal convergence. |
flow matching |
|
|
| 12 |
Reinforcement Learning for Sociohydrology |
Proposes a reinforcement-learning-based sociohydrology framework to address runoff control in land-use management. |
reinforcement learning |
|
|
| 13 |
Improving Paratope and Epitope Prediction by Multi-Modal Contrastive Learning and Interaction Informativeness Estimation |
Proposes MIPE to address antibody-antigen binding site prediction. |
contrastive learning |
|
|
| 14 |
Heterophilous Distribution Propagation for Graph Neural Networks |
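Proposes Heterophilous Distribution Propagation (HDP) for graph neural networks, addressing node representation learning on heterophilous graphs. |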
representation learning contrastive learning |
|
|