cs.LG(2025-09-15)
📊 4 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
🔬 Pillar 2: RL Algorithms & Architecture (4 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
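| 1 | SafeDiver: Cooperative AUV-USV Assisted Diver Communication via Multi-agent Reinforcement Learning Approach | SafeDiver: an underwater communication assistance system based on multi-agent reinforcement learning. | reinforcement learning, multimodal | | |
| 2 | UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning | Proposes UI-S1, a semi-online reinforcement learning approach that improves the multi-step interaction ability of GUI automation agents. | reinforcement learning, offline RL | ✅ | |
| 3 | DARD: Dice Adversarial Robustness Distillation against Adversarial Attacks | Proposes Dice Adversarial Robustness Distillation (DARD) to improve both robustness under adversarial attacks and standard accuracy. | distillation | | |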
| 4 | Deceptive Risk Minimization: Out-of-Distribution Generalization by Deceiving Distribution Shift Detectors | Proposes Deceptive Risk Minimization (DRM), which achieves OOD generalization by deceiving distribution shift detectors. | imitation learning, representation learning | | |