cs.LG (2024-06-12)

📊 10 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (5 🔗2) · Pillar 9: Embodied Foundation Models (4 🔗2) · Pillar 1: Robot Control (1)

🔬 Pillar 2: RL Algorithms & Architecture (5 papers)

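| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | MaIL: Improving Imitation Learning with Mamba | MaIL uses Mamba to improve imitation learning performance, with particularly strong results on small datasets. | imitation learning, Mamba, representation learning | |
| 2 | Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning | Proposes an adaptive offline-to-online reinforcement learning method based on residual learning and context encoding to address adaptation to dynamic environments. | reinforcement learning, offline reinforcement learning | |
| 3 | An Empirical Study of Mamba-based Language Models | An empirical study of large-scale Mamba language models: performance comparison and exploration of hybrid architectures. | Mamba, SSM | |
| 4 | A Critical Look At Tokenwise Reward-Guided Text Generation | Proposes a token-level reward-guided text generation method based on a Bradley-Terry reward model, without requiring large-scale LLM fine-tuning. | reinforcement learning, RLHF, large language model | |
| 5 | Structured Difference-of-Q via Orthogonal Learning | Proposes a structured Q-function difference estimation method based on orthogonal learning for policy optimization in offline reinforcement learning. | reinforcement learning, offline reinforcement learning | |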

🔬 Pillar 9: Embodied Foundation Models (4 papers)

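| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 6 | A Concept-Based Explainability Framework for Large Multimodal Models | Proposes a concept-learning-based explainability framework for large multimodal models to improve understanding of their internal representations. | large language model, multimodal | |
| 7 | Large Language Models Must Be Taught to Know What They Don't Know | Fine-tunes large language models so they can recognize what they do not know, improving reliability in high-stakes applications. | large language model | |
| 8 | Time-MMD: Multi-Domain Multimodal Dataset for Time Series Analysis | Proposes Time-MMD, a multi-domain multimodal time series dataset, to improve time series analysis performance. | multimodal | |
| 9 | QuantMoE-Bench: Examining Post-Training Quantization for Mixture-of-Experts | QuantMoE-Bench studies fine-grained precision settings for post-training quantization of Mixture-of-Experts models. | large language model | |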

🔬 Pillar 1: Robot Control (1 paper)

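| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 10 | RILe: Reinforced Imitation Learning | Proposes the RILe framework for efficiently learning complex behaviors. | locomotion, reinforcement learning, imitation learning | |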
