| 1 |
GITSR: Graph Interaction Transformer-based Scene Representation for Multi Vehicle Collaborative Decision-making |
Proposes GITSR, a graph interaction Transformer-based scene representation framework for multi-vehicle collaborative decision-making. |
reinforcement learning interaction transformer |
|
|
| 2 |
Sample-Efficient Alignment for LLMs |
Proposes the SEA algorithm, which aligns LLMs sample-efficiently through a contextual dueling bandits framework. |
preference learning RLHF DPO |
|
|
| 3 |
Learning World Models for Unconstrained Goal Navigation |
Proposes the MUN algorithm to address world-model generalization in unconstrained goal navigation. |
reinforcement learning world model |
|
|
| 4 |
Learning Hidden Subgoals under Temporal Ordering Constraints in Reinforcement Learning |
Proposes the LSTOC algorithm for learning hidden subgoals under temporal ordering constraints in reinforcement learning. |
reinforcement learning contrastive learning |
|
|
| 5 |
Exploring the Edges of Latent State Clusters for Goal-Conditioned Reinforcement Learning |
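Proposes CE², a cluster-edge exploration algorithm that improves the exploration efficiency of goal-conditioned reinforcement learning in unknown environments. |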
reinforcement learning |
|
|
| 6 |
Decoupling Dark Knowledge via Block-wise Logit Distillation for Feature-level Alignment |
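Proposes a block-wise logit distillation framework that improves knowledge distillation performance through implicit feature alignment. |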
distillation |
|
|