cs.LG (2024-09-29)

📊 9 papers in total

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (5) · Pillar 9: Embodied Foundation Models (3) · Pillar 1: Robot Control (1)

🔬 Pillar 2: RL Algorithms & Architecture (5 papers)

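| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 1 | The Crucial Role of Samplers in Online Direct Preference Optimization | Examines the crucial role of samplers in online DPO and proposes an online sampler that achieves quadratic convergence. | RLHF, DPO, direct preference optimization |
| 2 | Calibrating Language Models with Adaptive Temperature Scaling | Proposes Adaptive Temperature Scaling (ATS) to improve the calibration of large language models after RLHF fine-tuning. | reinforcement learning, RLHF, large language model |
| 3 | Adaptive Event-triggered Reinforcement Learning Control for Complex Nonlinear Systems | Proposes adaptive event-triggered reinforcement learning control for complex nonlinear systems. | reinforcement learning |
| 4 | Tailored Federated Learning: Leveraging Direction Regulation & Knowledge Distillation | Proposes a tailored federated learning algorithm that combines direction regulation with knowledge distillation to address client heterogeneity. | distillation |
| 5 | Constrained Reinforcement Learning for Safe Heat Pump Control | Proposes the I4B building simulator and applies the CSAC-LB algorithm to achieve safe, energy-efficient heat pump control. | reinforcement learning |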

🔬 Pillar 9: Embodied Foundation Models (3 papers)

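| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 6 | Federated Learning from Vision-Language Foundation Models: Theoretical Analysis and Method | Proposes a federated learning method based on prompt combination that improves the generalization and personalization of vision-language models. | foundation model |
| 7 | Hyper-Connections | Proposes Hyper-Connections as a replacement for residual connections, improving performance on large language model and vision tasks. | large language model |
| 8 | Membership Inference Attacks Cannot Prove that a Model Was Trained On Your Data | Argues that membership inference attacks cannot effectively prove that a model was trained on specific data. | foundation model |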

🔬 Pillar 1: Robot Control (1 paper)

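| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 9 | Unifying back-propagation and forward-forward algorithms through model predictive control | Proposes a unified framework based on model predictive control that integrates the back-propagation and forward-forward algorithms. | MPC, model predictive control |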
