cs.LG(2024-07-15)
📊 6 papers in total
🎯 Interest area navigation
Pillar 2: RL & Architecture (2)
Pillar 9: Embodied Foundation Models (2)
Pillar 1: Robot Control (2)
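🔬 Pillar 2: RL & Architecture (2 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation | SuperPADL: scales language-directed physics-based control through progressive supervised distillation | reinforcement learning, distillation, text-to-motion | | |
| 2 | Learning Dynamics of LLM Finetuning | Studies the learning dynamics of LLM finetuning, sheds light on hallucination and preference-optimization phenomena, and proposes an improved alignment method | DPO, direct preference optimization, large language model | | |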
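🔬 Pillar 9: Embodied Foundation Models (2 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 3 | Mechanistic interpretability of large language models with applications to the financial services industry | Applies mechanistic interpretability to analyze large language models used in financial services, with a focus on fair lending compliance | large language model | | |
| 4 | LLM Circuit Analyses Are Consistent Across Training and Scale | LLM circuit analyses stay consistent across training and model scale, indicating that analyses of small models carry over to larger ones | large language model | | |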
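🔬 Pillar 1: Robot Control (2 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 5 | Balancing the Scales: Reinforcement Learning for Fair Classification | Proposes a reinforcement-learning-based method for fair classification that adjusts the reward function to mitigate bias on imbalanced datasets | manipulation, reinforcement learning | | |
| 6 | Fast Matrix Multiplications for Lookup Table-Quantized LLMs | FLUTE: accelerates matrix multiplication for lookup table-quantized LLMs to improve inference speed | manipulation, large language model | | |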