cs.LG(2024-09-25)

📊 10 papers in total

🎯 Interest area navigation

Pillar 2: RL Algorithms & Architecture (5) · Pillar 9: Embodied Foundation Models (3) · Pillar 1: Robot Control (2)

🔬 Pillar 2: RL Algorithms & Architecture (5 papers)

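# | Title | Key point | Tags
1 | Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference | Proposes zeroth-order policy gradient methods for RLHF that do not require reward inference. | reinforcement learning, PPO, RLHF
2 | Revisiting Space Mission Planning: A Reinforcement Learning-Guided Approach for Multi-Debris Rendezvous | Proposes a reinforcement-learning-based planning method for multi-debris space missions that optimizes the rendezvous sequence. | reinforcement learning, deep reinforcement learning, PPO
3 | Spiders Based on Anxiety: How Reinforcement Learning Can Deliver Desired User Experience in Virtual Reality Personalized Arachnophobia Treatment | Proposes a reinforcement-learning-based method for generating spiders in virtual reality for personalized phobia treatment. | reinforcement learning
4 | Topological Foundations of Reinforcement Learning | Draws on topology to provide mathematical foundations and efficiency improvements for reinforcement learning algorithm design. | reinforcement learning
5 | Learning Utilities from Demonstrations in Markov Decision Processes | Proposes Utility Learning, which learns an agent's risk attitude from behavioral demonstrations in MDPs. | reinforcement learning, inverse reinforcement learning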

🔬 Pillar 9: Embodied Foundation Models (3 papers)

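# | Title | Key point | Tags
6 | Counterfactual Token Generation in Large Language Models | Proposes a causal token generation method based on a Gumbel-Max SCM to strengthen counterfactual reasoning in LLMs. | large language model
7 | No Request Left Behind: Tackling Heterogeneity in Long-Context LLM Inference with Medha | Medha: addresses heterogeneity in long-context LLM inference through fine-grained preemptive scheduling. | large language model
8 | INT-FlashAttention: Enabling Flash Attention for INT8 Quantization | Proposes INT-FlashAttention, which accelerates FlashAttention inference with INT8 quantization. | large language model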

🔬 Pillar 1: Robot Control (2 papers)

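# | Title | Key point | Tags
9 | Accumulator-Aware Post-Training Quantization for Large Language Models | AXE: the first accumulator-aware post-training quantization framework for large language models, with overflow-avoidance guarantees. | manipulation, large language model
10 | The poison of dimensionality | Analyzes how model dimensionality affects vulnerability to poisoning attacks. | manipulation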
