cs.LG(2024-10-23)
📊 8 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
Pillar 2: RL & Architecture (4)
Pillar 9: Embodied Foundation Models (2 🔗1)
Pillar 1: Robot Control (1)
Pillar 3: Perception & Semantics (1)
🔬 Pillar 2: RL & Architecture (4 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
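| 1 | Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models | Proposes asynchronous RLHF to speed up off-policy RL training for language models and make it more efficient. | RLHF, DPO, large language model | | |
| 2 | Adaptive Segment-level Reward: Bridging the Gap Between Action and Reward Space in Alignment | Proposes an adaptive segment-level reward to bridge the gap between the action space and the reward space in alignment. | reinforcement learning, large language model | | |
| 3 | Differentially Private Learning Needs Better Model Initialization and Self-Distillation | DPRefine improves the utility of differentially private language models through better initialization and self-distillation. | distillation | | |
| 4 | Identifiable Representation and Model Learning for Latent Dynamic Systems | Targeting intelligent spacecraft, proposes an identifiable learning method for latent dynamic systems based on the controllable canonical form. | latent dynamics | | |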
🔬 Pillar 9: Embodied Foundation Models (2 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 5 | CoreInfer: Accelerating Large Language Model Inference with Semantics-Inspired Adaptive Sparse Activation | CoreInfer accelerates large language model inference with semantics-based adaptive sparse activation. | large language model | | |
| 6 | MobileSafetyBench: Evaluating Safety of Autonomous Agents in Mobile Device Control | MobileSafetyBench evaluates the safety of autonomous agents in mobile device control. | large language model | ✅ | |
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 7 | Multimodal Information Bottleneck for Deep Reinforcement Learning with Multiple Sensors | Proposes a multimodal information bottleneck model to improve the sample efficiency of reinforcement learning. | locomotion, reinforcement learning, deep reinforcement learning | | |
🔬 Pillar 3: Perception & Semantics (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 8 | Incremental Learning of Affordances using Markov Logic Networks | Proposes the MLN-CLA algorithm for incremental learning and zero-shot inference of object affordances in robotic environments. | affordance | | |