cs.LG(2024-07-25)
📊 20 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
Pillar 2: RL & Architecture (10)
Pillar 9: Embodied Foundation Models (9 🔗1)
Pillar 5: Interaction & Reaction (1)
🔬 Pillar 2: RL & Architecture (10 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
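| 1 | Advanced deep-reinforcement-learning methods for flow control: group-invariant and positional-encoding networks improve learning speed and quality | Proposes a deep reinforcement learning method that combines group-invariant networks with positional encoding to speed up learning and improve flow-control performance. | reinforcement learning, deep reinforcement learning, DRL | ||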
| 2 | Recursive Introspection: Teaching Language Model Agents How to Self-Improve | Proposes RISE, which uses recursive introspection to improve language models' ability to self-improve on complex reasoning tasks. | reinforcement learning, imitation learning, large language model | ||
| 3 | Multi-Agent Deep Reinforcement Learning for Resilience Optimization in 5G RAN | Proposes a multi-agent deep reinforcement learning approach to resilience optimization in 5G RAN. | reinforcement learning, deep reinforcement learning | ||
| 4 | Adversarially Robust Decision Transformer | Proposes ARDT, which learns worst-case returns to make the Decision Transformer more robust in adversarial environments. | reinforcement learning, decision transformer | ||
| 5 | Your Graph Recommender is Provably a Single-view Graph Contrastive Learning | Shows that graph recommenders are equivalent to single-view graph contrastive learning models. | representation learning, contrastive learning | ||
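| 6 | Principal-Agent Reinforcement Learning: Orchestrating AI Agents with Contracts | Proposes a contract-based principal-agent reinforcement learning framework that aligns the interests of individual AI agents with social welfare. | reinforcement learning | ||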
| 7 | How to Train the Teacher Model for Effective Knowledge Distillation | Proposes training the teacher model with MSE to improve knowledge distillation, with gains of up to 2.6%. | distillation | ||
| 8 | Peak-Controlled Logits Poisoning Attack in Federated Distillation | Proposes PCFDLA, a peak-controlled logits poisoning attack targeting federated distillation. | distillation | ||
| 9 | Maximum Entropy On-Policy Actor-Critic via Entropy Advantage Estimation | Proposes a maximum-entropy on-policy actor-critic algorithm based on entropy advantage estimation to improve reinforcement learning performance. | reinforcement learning, PPO | ||
| 10 | Optimal Hessian/Jacobian-Free Nonconvex-PL Bilevel Optimization | Proposes HJFBiO, an optimal Hessian/Jacobian-free algorithm that efficiently solves nonconvex-PL bilevel optimization problems. | reinforcement learning, representation learning | ||
🔬 Pillar 9: Embodied Foundation Models (9 papers)
🔬 Pillar 5: Interaction & Reaction (1 paper)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 20 | Privacy-Preserving Hierarchical Model-Distributed Inference | Proposes privateMDI to accelerate privacy-preserving hierarchical model-distributed inference. | OMOMO | ||