cs.LG (2025-10-05)
📊 7 papers in total
🎯 Interest area navigation
Pillar 9: Embodied Foundation Models (3)
Pillar 2: RL & Architecture (2)
Pillar 1: Robot Control (1)
Pillar 8: Physics-based Animation (1)
🔬 Pillar 9: Embodied Foundation Models (3 papers)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | DoRAN: Stabilizing Weight-Decomposed Low-Rank Adaptation via Noise Injection and Auxiliary Networks | DoRAN: stabilizes weight-decomposed low-rank adaptation via noise injection and auxiliary networks | foundation model | | |
| 2 | Beyond Next-Token Prediction: A Performance Characterization of Diffusion versus Autoregressive Language Models | Compares diffusion and autoregressive language models: performance characterization and optimization strategies | large language model | | |
| 3 | What Scales in Cross-Entropy Scaling Law? | Reveals why the cross-entropy scaling law breaks down: only the error entropy scales robustly | large language model | | |
🔬 Pillar 2: RL & Architecture (2 papers)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 4 | Slow-Fast Policy Optimization: Reposition-Before-Update for LLM Reasoning | SFPO: improves the efficiency and stability of reinforcement learning training for LLM reasoning through a reposition-before-update mechanism | reinforcement learning, large language model | | |
| 5 | Simple Policy Gradients for Reasoning with Diffusion Language Models | Proposes the AGRPO algorithm for policy-gradient optimization of reasoning in diffusion language models | reinforcement learning, large language model | | |
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 6 | A KL-regularization framework for learning to plan with adaptive priors | Proposes the PO-MPC framework, which uses KL regularization to learn planning policies with adaptive priors and improves MBRL performance | MPC, model predictive control, reinforcement learning | | |
🔬 Pillar 8: Physics-based Animation (1 paper)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 7 | Wave-PDE Nets: Trainable Wave-Equation Layers as an Alternative to Attention | Proposes Wave-PDE Nets, which replace the attention mechanism with trainable wave-equation layers to improve computational efficiency | differentiable simulation | | |