cs.LG (2025-10-05)

📊 7 papers in total

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (3) · Pillar 2: RL & Architecture (2) · Pillar 1: Robot Control (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (3 papers)

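| # | Title | Key point | Tags |
|---|-------|-----------|------|
| 1 | DoRAN: Stabilizing Weight-Decomposed Low-Rank Adaptation via Noise Injection and Auxiliary Networks | DoRAN: stabilizes weight-decomposed low-rank adaptation via noise injection and auxiliary networks | foundation model |
| 2 | Beyond Next-Token Prediction: A Performance Characterization of Diffusion versus Autoregressive Language Models | Compares diffusion and autoregressive language models: performance characterization and optimization strategies | large language model |
| 3 | What Scales in Cross-Entropy Scaling Law? | Reveals why the cross-entropy scaling law breaks down: only error-entropy scales robustly | large language model |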

🔬 Pillar 2: RL & Architecture (2 papers)

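| # | Title | Key point | Tags |
|---|-------|-----------|------|
| 4 | Slow-Fast Policy Optimization: Reposition-Before-Update for LLM Reasoning | SFPO: improves the efficiency and stability of reinforcement learning training for LLM reasoning via a reposition-before-update mechanism | reinforcement learning, large language model |
| 5 | Simple Policy Gradients for Reasoning with Diffusion Language Models | Proposes the AGRPO algorithm for policy-gradient optimization of reasoning in diffusion language models | reinforcement learning, large language model |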

🔬 Pillar 1: Robot Control (1 paper)

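| # | Title | Key point | Tags |
|---|-------|-----------|------|
| 6 | A KL-regularization framework for learning to plan with adaptive priors | Proposes the PO-MPC framework, which uses KL regularization to learn planning policies with adaptive priors, improving MBRL performance | MPC, model predictive control, reinforcement learning |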

🔬 Pillar 8: Physics-based Animation (1 paper)

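| # | Title | Key point | Tags |
|---|-------|-----------|------|
| 7 | Wave-PDE Nets: Trainable Wave-Equation Layers as an Alternative to Attention | Proposes Wave-PDE Nets, which replace attention with trainable wave-equation layers to improve computational efficiency | differentiable simulation |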
