| 13 |
MAGE: Multi-scale Autoregressive Generation for Offline Reinforcement Learning |
MAGE: a multi-scale autoregressive generation method for offline reinforcement learning, addressing long-horizon sparse-reward tasks |
reinforcement learning offline RL offline reinforcement learning |
|
|
| 14 |
Multi-Objective Reinforcement Learning for Large-Scale Tote Allocation in Human-Robot Collaborative Fulfillment Centers |
Proposes a multi-objective reinforcement learning method for large-scale tote allocation, optimizing human-robot collaborative fulfillment centers.
reinforcement learning policy learning |
|
|
| 15 |
Foundation World Models for Agents that Learn, Verify, and Adapt Reliably Beyond Static Environments |
Proposes foundation world models for trustworthy, adaptive agents in open-world environments
reinforcement learning world model |
|
|
| 16 |
Beyond State-Wise Mirror Descent: Offline Policy Optimization with Parametric Policies |
Proposes an offline policy optimization method with parametric policies that scales to large action spaces
reinforcement learning offline RL offline reinforcement learning |
|
|
| 17 |
Bridging Dynamics Gaps via Diffusion Schrödinger Bridge for Cross-Domain Reinforcement Learning |
Proposes BDGxRL, based on a diffusion Schrödinger bridge, to address dynamics gaps in cross-domain reinforcement learning
reinforcement learning policy learning |
|
|
| 18 |
Disentangled Mode-Specific Representations for Tensor Time Series via Contrastive Learning |
Proposes MoST, which disentangles mode-specific representations of tensor time series via contrastive learning, improving classification and forecasting accuracy.
representation learning contrastive learning |
✅ |
|
| 19 |
CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation |
Proposes CUDA Agent, which generates high-performance CUDA kernels via large-scale agentic reinforcement learning.
reinforcement learning large language model |
|
|
| 20 |
Adaptive Correlation-Weighted Intrinsic Rewards for Reinforcement Learning |
Proposes the Adaptive Correlation-Weighted Intrinsic reward (ACWI) framework to improve exploration efficiency in sparse-reward reinforcement learning.
reinforcement learning |
|
|
| 21 |
General Bayesian Policy Learning |
Proposes a general Bayesian policy learning framework for policy optimization in decision-making problems
policy learning |
|
|
| 22 |
Flowette: Flow Matching with Graphette Priors for Graph Generation |
Flowette: a flow matching model for graph generation with Graphette priors |
flow matching |
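The Flowette-specific Graphette prior is not described in this entry; as background, the standard conditional flow matching objective that such graph generators build on can be sketched as follows. The helper `cfm_targets` is a hypothetical name; it computes the regression targets for a velocity-field model under linear interpolation paths.

```python
import numpy as np

def cfm_targets(x0, x1, t):
    """Conditional flow matching with linear interpolation paths:
    given noise samples x0, data samples x1, and times t in [0, 1],
    return the point on the path and the velocity the model should predict."""
    t = np.asarray(t, dtype=float).reshape(-1, 1)
    x_t = (1.0 - t) * x0 + t * x1   # point on the straight-line path
    v_target = x1 - x0              # constant velocity along that path
    return x_t, v_target
```

Training then minimizes the squared error between a learned vector field evaluated at `(x_t, t)` and `v_target`; how Flowette conditions this on graph structure is not specified here.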
|
|
| 23 |
InfoNCE Induces Gaussian Distribution |
Proves that the InfoNCE loss induces Gaussian-distributed representations in contrastive learning
representation learning contrastive learning |
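For reference, the InfoNCE loss this result concerns is the standard in-batch contrastive objective; a minimal NumPy sketch (not the paper's code, and the function name is illustrative) is:

```python
import numpy as np

def info_nce(z_anchor, z_pos, temperature=0.1):
    """InfoNCE loss over a batch: the positive for each anchor is the
    same-index row of z_pos; all other rows act as in-batch negatives."""
    # Normalize embeddings to the unit sphere (standard practice).
    z_anchor = z_anchor / np.linalg.norm(z_anchor, axis=1, keepdims=True)
    z_pos = z_pos / np.linalg.norm(z_pos, axis=1, keepdims=True)
    logits = z_anchor @ z_pos.T / temperature  # pairwise cosine similarities
    # Cross-entropy with the diagonal (matching pair) as the target class.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

The paper's claim concerns the distribution of representations trained under this loss, not the loss computation itself.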
|
|