cs.AI (2025-05-12)
📊 22 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (14)
Pillar 2: RL Algorithms & Architecture (7 🔗1)
Pillar 1: Robot Control (1)
🔬 Pillar 9: Embodied Foundation Models (14 papers)
🔬 Pillar 2: RL Algorithms & Architecture (7 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
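| 15 | S-GRPO: Early Exit via Reinforcement Learning in Reasoning Models | Proposes S-GRPO, which uses reinforcement learning to enable early exit in reasoning models, improving both efficiency and accuracy. | reinforcement learning, large language model, chain-of-thought | | |
| 16 | Explainable Reinforcement Learning Agents Using World Models | Proposes a world-model-based explainable reinforcement learning method that improves users' understanding of agent policies. | reinforcement learning, world model, model-based RL | | |
| 17 | A Survey on Collaborative Mechanisms Between Large and Small Language Models | Surveys collaboration mechanisms between LLMs and SLMs, exploring efficient, customizable edge AI applications. | distillation, embodied AI, large language model | | |
| 18 | Online Learning-based Adaptive Beam Switching for 6G Networks: Enhancing Efficiency and Resilience | Proposes an online-learning-based adaptive beam switching method for 6G networks that improves efficiency and stability. | reinforcement learning, deep reinforcement learning, DRL | | |
| 19 | Multi-source Plume Tracing via Multi-Agent Reinforcement Learning | Proposes a multi-agent reinforcement learning plume-tracing algorithm for quickly locating multiple pollution sources. | reinforcement learning | | |
| 20 | Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving | Proposes ZeroTIR, which trains LLMs via reinforcement learning to execute code autonomously when solving math problems, and reveals its scaling law. | reinforcement learning, large language model | ✅ | |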
| 21 | Measuring General Intelligence with Generated Games | Proposes gg-bench, which evaluates the general intelligence of language models through generated games. | reinforcement learning, large language model | | |
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 22 | Comet: Accelerating Private Inference for Large Language Model by Predicting Activation Sparsity | Comet accelerates private inference for large language models by predicting activation sparsity. | MPC, spatiotemporal, large language model | | |