cs.LG (2024-09-25)
📊 10 papers in total
🎯 Interest-Area Navigation
Pillar 2: RL Algorithms & Architecture (5)
Pillar 9: Embodied Foundation Models (3)
Pillar 1: Robot Control (2)
🔬 Pillar 2: RL Algorithms & Architecture (5 papers)
| # | Title | One-line takeaway | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference | Proposes a zeroth-order policy gradient method that solves RLHF without reward inference | reinforcement learning, PPO, RLHF | | |
| 2 | Revisiting Space Mission Planning: A Reinforcement Learning-Guided Approach for Multi-Debris Rendezvous | Proposes an RL-guided planning method for multi-debris space missions that optimizes the rendezvous sequence | reinforcement learning, deep reinforcement learning, PPO | | |
| 3 | Spiders Based on Anxiety: How Reinforcement Learning Can Deliver Desired User Experience in Virtual Reality Personalized Arachnophobia Treatment | Proposes RL-driven generation of virtual-reality spiders for personalized arachnophobia treatment | reinforcement learning | | |
| 4 | Topological Foundations of Reinforcement Learning | Grounds RL algorithm design in topology, providing mathematical foundations and efficiency improvements | reinforcement learning | | |
| 5 | Learning Utilities from Demonstrations in Markov Decision Processes | Proposes Utility Learning, which infers an agent's risk attitude from behavioral demonstrations in an MDP | reinforcement learning, inverse reinforcement learning | | |
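Paper 1's core primitive, a zeroth-order (gradient-free) policy gradient, can be illustrated generically. The snippet below is not the paper's algorithm, just the standard two-point random-direction estimator such methods build on; `f` is a stand-in for a black-box objective (e.g. an episodic return), and all parameter names are hypothetical.

```python
import random

def zo_gradient(f, theta, mu=1e-3, n_samples=512, seed=0):
    """Two-point zeroth-order gradient estimate of f at theta.

    Averages (f(theta + mu*u) - f(theta - mu*u)) / (2*mu) * u over
    random Gaussian directions u, approximating grad f using only
    function evaluations -- no analytic gradients required.
    """
    rng = random.Random(seed)
    d = len(theta)
    grad = [0.0] * d
    for _ in range(n_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        plus = [t + mu * ui for t, ui in zip(theta, u)]
        minus = [t - mu * ui for t, ui in zip(theta, u)]
        scale = (f(plus) - f(minus)) / (2.0 * mu)
        grad = [g + scale * ui for g, ui in zip(grad, u)]
    return [g / n_samples for g in grad]

# Quadratic toy objective: the true gradient at theta is 2*theta.
f = lambda x: sum(v * v for v in x)
print(zo_gradient(f, [1.0, -2.0]))  # roughly [2.0, -4.0], up to sampling noise
```

The estimate is unbiased in expectation for smooth `f` as `mu -> 0`; its variance shrinks as `n_samples` grows.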
🔬 Pillar 9: Embodied Foundation Models (3 papers)
| # | Title | One-line takeaway | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 6 | Counterfactual Token Generation in Large Language Models | Proposes Gumbel-Max SCM-based causal token generation to strengthen counterfactual reasoning in LLMs | large language model | | |
| 7 | No Request Left Behind: Tackling Heterogeneity in Long-Context LLM Inference with Medha | Medha: tackles heterogeneity in long-context LLM inference via fine-grained preemptive scheduling | large language model | | |
| 8 | INT-FlashAttention: Enabling Flash Attention for INT8 Quantization | Proposes INT-FlashAttention, using INT8 quantization to accelerate FlashAttention inference | large language model | | |
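Paper 6's counterfactual generation rests on the Gumbel-Max trick viewed as a structural causal model: record the noise used for the factual sample, then replay the same noise under intervened logits. The sketch below is a generic illustration of that trick, not the paper's implementation; the logits and the intervention are made-up values.

```python
import math
import random

def gumbel_max_sample(logits, gumbels):
    """Pick argmax(logit + gumbel): an exact sample from softmax(logits)."""
    return max(range(len(logits)), key=lambda i: logits[i] + gumbels[i])

def draw_gumbels(n, rng):
    # Standard Gumbel noise via inverse transform: -log(-log(U)), U ~ Uniform(0,1).
    return [-math.log(-math.log(rng.random())) for _ in range(n)]

rng = random.Random(0)
logits = [2.0, 0.5, -1.0]          # hypothetical next-token logits
g = draw_gumbels(len(logits), rng)  # noise is recorded, not discarded

factual = gumbel_max_sample(logits, g)
# Counterfactual query: reuse the SAME noise g under intervened logits,
# answering "which token WOULD have been sampled had the logits differed?"
intervened = [0.5, 2.0, -1.0]
counterfactual = gumbel_max_sample(intervened, g)
```

Because the noise is held fixed, the factual and counterfactual samples are coupled: only the intervention, not fresh randomness, can change the outcome.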
🔬 Pillar 1: Robot Control (2 papers)
| # | Title | One-line takeaway | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 9 | Accumulator-Aware Post-Training Quantization for Large Language Models | AXE: the first accumulator-aware post-training quantization framework for LLMs, with overflow-avoidance guarantees | manipulation, large language model | | |
| 10 | The poison of dimensionality | Analyzes how model dimensionality affects vulnerability to poisoning attacks | manipulation | | |
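Papers 8 and 9 both build on INT8 quantization. As background, here is a minimal symmetric per-tensor INT8 quantizer; this is a generic sketch of the basic primitive, not either paper's method (which additionally handle attention kernels and accumulator overflow, respectively).

```python
def quantize_int8(xs):
    """Symmetric per-tensor INT8 quantization: x is approximated by scale * q,
    with q an integer clipped to [-127, 127]."""
    amax = max(abs(x) for x in xs) or 1.0  # guard against an all-zero tensor
    scale = amax / 127.0
    q = [max(-127, min(127, round(x / scale))) for x in xs]
    return q, scale

def dequantize(q, scale):
    return [scale * v for v in q]

xs = [0.5, -1.0, 0.25, 0.0]
q, s = quantize_int8(xs)
xr = dequantize(q, s)  # close to xs; per-element error is bounded by the scale
```

Integer matmuls over `q` run on fast INT8 hardware paths; the float `scale` is reapplied once at the end, which is the basic economy both papers exploit.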