cs.LG (2025-09-01)
📊 5 papers in total
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (3)
Pillar 1: Robot Control (1)
Pillar 2: RL Algorithms & Architecture (1)
🔬 Pillar 9: Embodied Foundation Models (3 papers)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | REFINESTAT: Efficient Exploration for Probabilistic Program Synthesis | Proposes RefineStat to address semantic-constraint problems in probabilistic program synthesis | large language model | | |
| 2 | MatPROV: A Provenance Graph Dataset of Material Synthesis Extracted from Scientific Literature | Proposes MatPROV to address the structural complexity of material synthesis processes | large language model | | |
| 3 | CARE: Decoding Time Safety Alignment via Rollback and Introspection Intervention | Proposes the CARE framework, which improves LLM decoding-time safety via rollback and introspection intervention while preserving quality and efficiency | large language model | | |
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 4 | Learning to Coordinate: Distributed Meta-Trajectory Optimization Via Differentiable ADMM-DDP | Proposes the L2C framework, which performs distributed meta-trajectory optimization via differentiable ADMM-DDP to address multi-agent coordination | manipulation, trajectory optimization | | |
🔬 Pillar 2: RL Algorithms & Architecture (1 paper)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 5 | Towards High Data Efficiency in Reinforcement Learning with Verifiable Reward | Proposes DEPO, a data-efficient policy optimization method for reinforcement learning with verifiable rewards | reinforcement learning | | |