cs.LG (2024-06-12)
📊 10 papers in total | 🔗 4 with code
🎯 Interest Area Navigation
Pillar 2: RL & Architecture (5 🔗2)
Pillar 9: Embodied Foundation Models (4 🔗2)
Pillar 1: Robot Control (1)
🔬 Pillar 2: RL & Architecture (5 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
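| 1 | MaIL: Improving Imitation Learning with Mamba | MaIL: uses Mamba to improve imitation learning performance, with especially strong results on small datasets. | imitation learning, Mamba, representation learning | ✅ | |
| 2 | Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning | Proposes an adaptive offline-to-online reinforcement learning method based on residual learning and context encoding, addressing adaptation to dynamic environments. | reinforcement learning, offline reinforcement learning | | |
| 3 | An Empirical Study of Mamba-based Language Models | An empirical study of large-scale Mamba language models: performance comparison and exploration of hybrid architectures. | Mamba, SSM | | |
| 4 | A Critical Look At Tokenwise Reward-Guided Text Generation | Proposes a token-level reward-guided text generation method based on a Bradley-Terry reward model, with no large-scale LLM fine-tuning required. | reinforcement learning, RLHF, large language model | ✅ | |
| 5 | Structured Difference-of-Q via Orthogonal Learning | Proposes a structured method for estimating differences of Q-functions via orthogonal learning, for policy optimization in offline reinforcement learning. | reinforcement learning, offline reinforcement learning | | |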
🔬 Pillar 9: Embodied Foundation Models (4 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
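| 6 | A Concept-Based Explainability Framework for Large Multimodal Models | Proposes a concept-learning-based explainability framework for large multimodal models, improving understanding of their internal representations. | large language model, multimodal | ✅ | |
| 7 | Large Language Models Must Be Taught to Know What They Don't Know | Fine-tunes large language models so that they recognize what they do not know, improving reliability in high-stakes applications. | large language model | | |
| 8 | Time-MMD: Multi-Domain Multimodal Dataset for Time Series Analysis | Introduces Time-MMD, a multi-domain multimodal time series dataset, improving time series analysis performance. | multimodal | ✅ | |
| 9 | QuantMoE-Bench: Examining Post-Training Quantization for Mixture-of-Experts | QuantMoE-Bench: studies fine-grained precision settings for post-training quantization of Mixture-of-Experts models. | large language model | | |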
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 10 | RILe: Reinforced Imitation Learning | Proposes the RILe framework for efficiently learning complex behaviors. | locomotion, reinforcement learning, imitation learning | | |