| 1 |
MA-CDMR: An Intelligent Cross-domain Multicast Routing Method based on Multiagent Deep Reinforcement Learning in Multi-domain SDWN |
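Proposes MA-CDMR, a cross-domain multicast routing method based on multi-agent deep reinforcement learning, to address an NP-hard problem in multi-domain SDWN. |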
reinforcement learning, deep reinforcement learning |
|
|
| 2 |
Enhancing Analogical Reasoning in the Abstraction and Reasoning Corpus via Model-Based RL |
Uses model-based reinforcement learning to improve analogical reasoning on the Abstraction and Reasoning Corpus. |
reinforcement learning, dreamer, model-based RL |
|
|
| 3 |
Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models |
Proposes Bi-Factorial Preference Optimization (BFPO), which improves the helpfulness of language models while preserving safety. |
reinforcement learning, RLHF, large language model |
✅ |
|
| 4 |
Earth Observation Satellite Scheduling with Graph Neural Networks and Monte Carlo Tree Search |
Proposes an Earth observation satellite scheduling method based on graph neural networks and Monte Carlo tree search. |
reinforcement learning, deep reinforcement learning, DRL |
|
|
| 5 |
Learning Robust Reward Machines from Noisy Labels |
PROB-IRM: learns robust reward machines from noisy labels to improve the performance of reinforcement learning agents. |
reinforcement learning, policy learning, reward shaping |
|
|
| 6 |
CL4KGE: A Curriculum Learning Method for Knowledge Graph Embedding |
CL4KGE: a curriculum-learning-based method for knowledge graph embedding that improves model training. |
curriculum learning |
|
|
| 7 |
On Stateful Value Factorization in Multi-Agent Reinforcement Learning |
Proposes DuelMIX, which improves multi-agent reinforcement learning performance through stateful value factorization. |
reinforcement learning |
|
|