| 1 |
QoQ-Med: Building Multimodal Clinical Foundation Models with Domain-Aware GRPO Training |
Proposes QoQ-Med to address reasoning over multimodal clinical data |
reinforcement learning foundation model multimodal |
✅ |
|
| 2 |
A Brain Graph Foundation Model: Pre-Training and Prompt-Tuning for Any Atlas and Disorder |
Proposes BrainGFM, a graph-based brain foundation model for diagnosing multiple brain disorders and parcellating brain regions. |
masked autoencoder contrastive learning large language model |
✅ |
|
| 3 |
MMedAgent-RL: Optimizing Multi-Agent Collaboration for Multimodal Medical Reasoning |
MMedAgent-RL: optimizes multi-agent collaboration with reinforcement learning for multimodal medical reasoning |
reinforcement learning curriculum learning multimodal |
|
|
| 4 |
From Rules to Rewards: Reinforcement Learning for Interest Rate Adjustment in DeFi Lending |
Applies offline reinforcement learning to optimize interest rate adjustment in DeFi lending |
reinforcement learning TD3 offline reinforcement learning |
|
|
| 5 |
Adaptive Plane Reformatting for 4D Flow MRI using Deep Reinforcement Learning |
Proposes AdaPR, an adaptive plane reformatting method based on deep reinforcement learning, to address generalizability in 4D flow MRI. |
reinforcement learning deep reinforcement learning DRL |
|
|
| 6 |
Prompt-Tuned LLM-Augmented DRL for Dynamic O-RAN Network Slicing |
Proposes a prompt-tuned, LLM-augmented DRL method for dynamic resource allocation in O-RAN network slicing. |
reinforcement learning deep reinforcement learning DRL |
|
|
| 7 |
A New Spatiotemporal Correlation Anomaly Detection Method that Integrates Contrastive Learning and Few-Shot Learning in Wireless Sensor Networks |
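Proposes the MTAD-RD model to address feature extraction and sample imbalance in spatiotemporal anomaly detection for wireless sensor networks |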
contrastive learning spatiotemporal |
|
|
| 8 |
Optimizing Sensory Neurons: Nonlinear Attention Mechanisms for Accelerated Convergence in Permutation-Invariant Neural Networks for Reinforcement Learning |
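Proposes nonlinear attention mechanisms to accelerate the convergence of permutation-invariant neural networks in reinforcement learning |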
reinforcement learning linear attention |
|
|
| 9 |
RLAE: Reinforcement Learning-Assisted Ensemble for LLMs |
Proposes RLAE, a reinforcement learning-assisted ensemble framework for large language models that improves model performance. |
reinforcement learning PPO large language model |
|
|
| 10 |
Dynamic Domain Adaptation-Driven Physics-Informed Graph Representation Learning for AC-OPF |
Proposes DDA-PIGCN to address complex constraint modeling and spatiotemporal information fusion in AC-OPF. |
representation learning MAE spatiotemporal |
|
|
| 11 |
Optimized Local Updates in Federated Learning via Reinforcement Learning |
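Proposes a reinforcement learning-based method for optimizing local updates in federated learning, improving model performance on non-IID data. |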
reinforcement learning deep reinforcement learning DRL |
✅ |
|
| 12 |
ORAN-GUIDE: RAG-Driven Prompt Learning for LLM-Augmented Reinforcement Learning in O-RAN Network Slicing |
ORAN-GUIDE: RAG-based prompt learning for LLM-augmented reinforcement learning in O-RAN network slicing |
reinforcement learning deep reinforcement learning DRL |
|
|
| 13 |
Understanding Behavioral Metric Learning: A Large-Scale Study on Distracting Reinforcement Learning Environments |
A large-scale study reveals how behavioral metric learning works in distracting reinforcement learning environments |
reinforcement learning deep reinforcement learning |
|
|
| 14 |
Reinforcement Learning for Hanabi |
Explores cooperative agent strategies learned by reinforcement learning algorithms in the game Hanabi |
reinforcement learning deep reinforcement learning |
|
|
| 15 |
CLARIFY: Contrastive Preference Reinforcement Learning for Untangling Ambiguous Queries |
Proposes CLARIFY, which uses contrastive preference learning to resolve ambiguous queries in reinforcement learning |
reinforcement learning contrastive learning |
|
|
| 16 |
Mitigating Plasticity Loss in Continual Reinforcement Learning by Reducing Churn |
Proposes C-CHAIN, which mitigates plasticity loss in continual reinforcement learning by reducing churn |
reinforcement learning |
|
|
| 17 |
AutoMixAlign: Adaptive Data Mixing for Multi-Task Preference Optimization in LLMs |
Proposes AutoMixAlign to address multi-task preference optimization |
DPO large language model |
|
|
| 18 |
Comparing Traditional and Reinforcement-Learning Methods for Energy Storage Control |
Compares traditional methods and reinforcement learning for energy storage control, showing where each is applicable. |
reinforcement learning |
|
|