| 1 |
Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning |
Proposes two pessimistic offline reinforcement learning algorithms for risk-sensitive policy optimization in linear MDPs |
reinforcement learning offline RL offline reinforcement learning |
|
|
| 2 |
Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization |
Proposes Dr. DPO, which uses distributionally robust optimization to improve language model alignment on noisy data |
DPO direct preference optimization large language model |
✅ |
|
| 3 |
Real-time system optimal traffic routing under uncertainties -- Can physics models boost reinforcement learning? |
TransRL: combines physics models with reinforcement learning for real-time system-optimal traffic routing under uncertainty |
reinforcement learning PPO SAC |
|
|
| 4 |
Advancements in Recommender Systems: A Comprehensive Analysis Based on Data, Algorithms, and Evaluation |
A survey analyzing the challenges and future directions of recommender systems in terms of data, algorithms, and evaluation |
reinforcement learning deep reinforcement learning multimodal |
|
|
| 5 |
Disentangled Representation Learning with the Gromov-Monge Gap |
Proposes a disentangled representation learning method based on the Gromov-Monge Gap that better preserves geometric features |
representation learning |
|
|
| 6 |
Reinforcement Learning of Adaptive Acquisition Policies for Inverse Problems |
Proposes reinforcement-learning-based adaptive acquisition policies for solving inverse problems |
reinforcement learning |
|
|
| 7 |
Deep-Graph-Sprints: Accelerated Representation Learning in Continuous-Time Dynamic Graphs |
Deep-Graph-Sprints: accelerates representation learning in continuous-time dynamic graphs |
representation learning |
|
|
| 8 |
Resource Allocation for Twin Maintenance and Computing Task Processing in Digital Twin Vehicular Edge Computing Network |
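Proposes a multi-agent deep reinforcement learning algorithm for coordinated resource scheduling, addressing resource allocation in digital twin vehicular edge computing networks |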
reinforcement learning deep reinforcement learning |
|
|