| 1 |
R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcement Learning |
Proposes R1-Omni, which uses reinforcement learning to improve the performance and explainability of omni-multimodal emotion recognition.
reinforcement learning large language model multimodal |
|
|
| 2 |
On a Connection Between Imitation Learning and RLHF |
Proposes the DIL framework, offering a unified understanding and optimization of RLHF alignment from an imitation-learning perspective.
reinforcement learning imitation learning RLHF |
|
|
| 3 |
Impoola: The Power of Average Pooling for Image-Based Deep Reinforcement Learning |
Impoola-CNN: leverages average pooling to improve image-based deep reinforcement learning performance.
reinforcement learning deep reinforcement learning |
✅ |
|
| 4 |
Performance Comparisons of Reinforcement Learning Algorithms for Sequential Experimental Design |
Studies the performance of reinforcement learning algorithms for sequential experimental design and explores their generalization ability.
reinforcement learning |
|
|
| 5 |
Spatial Distillation based Distribution Alignment (SDDA) for Cross-Headset EEG Classification |
Proposes SDDA, a spatial-distillation-based distribution alignment method, to tackle EEG signal classification across different EEG headsets.
distillation |
|
|
| 6 |
Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning |
Proposes the APPO algorithm, addressing the conservatism challenge in offline preference-based reinforcement learning and enabling efficient policy optimization.
reinforcement learning |
|
|
| 7 |
Mastering Continual Reinforcement Learning through Fine-Grained Sparse Network Allocation and Dormant Neuron Exploration |
Proposes SSDE, which mitigates catastrophic forgetting in continual reinforcement learning via fine-grained sparse network allocation and dormant neuron exploration.
reinforcement learning |
✅ |
|
| 8 |
Multi-Task Reinforcement Learning Enables Parameter Scaling |
Shows that multi-task reinforcement learning achieves performance gains through parameter scaling, challenging the necessity of complex architectures.
reinforcement learning |
|
|
| 9 |
Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts |
Proposes Linear-MoE, combining linear sequence modeling with mixture-of-experts for efficient training of large-scale models.
state space model linear attention |
✅ |
|
| 10 |
Guaranteeing Out-Of-Distribution Detection in Deep RL via Transition Estimation |
Proposes a transition-estimation-based out-of-distribution detection method for deep RL, guaranteeing deployment reliability.
reinforcement learning deep reinforcement learning |
|
|