cs.LG(2024-09-29)
📊 9 papers in total
🎯 Interest Area Navigation
Pillar 2: RL Algorithms & Architecture (5)
Pillar 9: Embodied Foundation Models (3)
Pillar 1: Robot Control (1)
🔬 Pillar 2: RL Algorithms & Architecture (5 papers)
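| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | The Crucial Role of Samplers in Online Direct Preference Optimization | Proposes an online sampler that achieves quadratic convergence in online direct preference optimization. | RLHF, DPO, direct preference optimization | | |
| 2 | Calibrating Language Models with Adaptive Temperature Scaling | Proposes Adaptive Temperature Scaling (ATS) to improve the calibration of large language models after RLHF fine-tuning. | reinforcement learning, RLHF, large language model | | |
| 3 | Adaptive Event-triggered Reinforcement Learning Control for Complex Nonlinear Systems | Proposes adaptive event-triggered reinforcement learning control for complex nonlinear systems. | reinforcement learning | | |
| 4 | Tailored Federated Learning: Leveraging Direction Regulation & Knowledge Distillation | Proposes a tailored federated learning algorithm that combines direction regulation with knowledge distillation to address client heterogeneity. | distillation | | |
| 5 | Constrained Reinforcement Learning for Safe Heat Pump Control | Introduces the I4B building simulator and applies the CSAC-LB algorithm to achieve safe, energy-efficient heat pump control. | reinforcement learning | | |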
🔬 Pillar 9: Embodied Foundation Models (3 papers)
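| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 6 | Federated Learning from Vision-Language Foundation Models: Theoretical Analysis and Method | Proposes a federated learning method based on combining prompts to improve the generalization and personalization of vision-language models. | foundation model | | |
| 7 | Hyper-Connections | Proposes Hyper-Connections as a replacement for residual connections, improving performance on large language model and vision tasks. | large language model | | |
| 8 | Membership Inference Attacks Cannot Prove that a Model Was Trained On Your Data | Argues that membership inference attacks cannot reliably prove that a model was trained on specific data. | foundation model | | |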
🔬 Pillar 1: Robot Control (1 paper)
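| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 9 | Unifying back-propagation and forward-forward algorithms through model predictive control | Proposes a unified framework based on model predictive control that integrates back-propagation and the forward-forward algorithm. | MPC, model predictive control | | |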