cs.LG (2025-10-04)

📊 7 papers in total

🎯 Interest Area Navigation

Pillar 2: RL Algorithms and Architecture (5) · Pillar 9: Embodied Foundation Models (2)

🔬 Pillar 2: RL Algorithms and Architecture (5 papers)

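| # | Title | One-Sentence Summary | Tags |
|---|-------|----------------------|------|
| 1 | Token Hidden Reward: Steering Exploration-Exploitation in Group Relative Deep Reinforcement Learning | Proposes Token Hidden Reward to steer the exploration-exploitation trade-off in group relative deep reinforcement learning. | reinforcement learning, deep reinforcement learning, large language model |
| 2 | Deep Reinforcement Learning for Multi-Agent Coordination | Proposes S-MADRL, a framework based on virtual pheromones, for efficient multi-agent coordination in crowded environments. | reinforcement learning, deep reinforcement learning, curriculum learning |
| 3 | Unlocking Reasoning Capabilities in LLMs via Reinforcement Learning Exploration | Proposes the RAPO algorithm, which uses reinforcement learning exploration to improve LLM performance on complex reasoning tasks. | reinforcement learning, large language model |
| 4 | HOFLON: Hybrid Offline Learning and Online Optimization for Process Start-Up and Grade-Transition Control | Proposes HOFLON, which combines offline learning with online optimization to improve process start-up and grade-transition control performance. | reinforcement learning, offline RL, offline reinforcement learning |
| 5 | Distributed Area Coverage with High Altitude Balloons Using Multi-Agent Reinforcement Learning | Proposes a multi-agent reinforcement learning method for distributed area coverage with high-altitude balloons. | reinforcement learning |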

🔬 Pillar 9: Embodied Foundation Models (2 papers)

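| # | Title | One-Sentence Summary | Tags |
|---|-------|----------------------|------|
| 6 | Optimizing Fine-Tuning through Advanced Initialization Strategies for Low-Rank Adaptation | Proposes IniLoRA, which improves fine-tuning performance by optimizing the initialization strategy for low-rank adaptation. | large language model |
| 7 | Does higher interpretability imply better utility? A Pairwise Analysis on Sparse Autoencoders | Reveals a gap between the interpretability of sparse autoencoders and their steering utility, and proposes Delta Token Confidence, a feature-selection method. | large language model |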
