cs.LG(2023-12-13)

📊 4 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms and Architecture (3 🔗1) · Pillar 9: Embodied Foundation Models (1)

🔬 Pillar 2: RL Algorithms and Architecture (3 papers)

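| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 1 | An Invitation to Deep Reinforcement Learning | An introductory tutorial on deep reinforcement learning: a general optimization framework for non-differentiable objectives and temporal problems. | reinforcement learning, deep reinforcement learning, PPO | |
| 2 | Distributional Preference Learning: Understanding and Accounting for Hidden Context in RLHF | Addresses the hidden-context problem in RLHF by proposing distributional preference learning (DPL), which improves model robustness. | reinforcement learning, preference learning, RLHF | |
| 3 | World Models via Policy-Guided Trajectory Diffusion | Proposes PolyGRAD, a non-autoregressive world model based on policy-guided trajectory diffusion. | reinforcement learning, world model | |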

🔬 Pillar 9: Embodied Foundation Models (1 paper)

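| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 4 | CBQ: Cross-Block Quantization for Large Language Models | Proposes CBQ, a cross-block quantization method for efficiently compressing large language models. | large language model | |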
