cs.LG (2024-09-29)

📊 9 papers in total

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (5) · Pillar 9: Embodied Foundation Models (3) · Pillar 1: Robot Control (1)

🔬 Pillar 2: RL Algorithms & Architecture (5 papers)

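#	Title	One-sentence summary	Tags
1	The Crucial Role of Samplers in Online Direct Preference Optimization	Examines the crucial role of samplers in online direct preference optimization and proposes an online sampler that achieves quadratic convergence.	RLHF DPO direct preference optimization
2	Calibrating Language Models with Adaptive Temperature Scaling	Proposes adaptive temperature scaling (ATS) to improve the calibration of large language models after RLHF fine-tuning.	reinforcement learning RLHF large language model
3	Adaptive Event-triggered Reinforcement Learning Control for Complex Nonlinear Systems	Proposes adaptive event-triggered reinforcement learning control for complex nonlinear systems.	reinforcement learning
4	Tailored Federated Learning: Leveraging Direction Regulation & Knowledge Distillation	Proposes a tailored federated learning algorithm that combines direction regulation with knowledge distillation to address client heterogeneity.	distillation
5	Constrained Reinforcement Learning for Safe Heat Pump Control	Proposes the I4B building simulator and applies the CSAC-LB algorithm to achieve safe, energy-efficient heat pump control.	reinforcement learning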

🔬 Pillar 9: Embodied Foundation Models (3 papers)

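#	Title	One-sentence summary	Tags
6	Federated Learning from Vision-Language Foundation Models: Theoretical Analysis and Method	Proposes a federated learning method based on prompt combination that improves the generalization and personalization of vision-language models.	foundation model
7	Hyper-Connections	Proposes hyper-connections as a replacement for residual connections, improving performance on large language model and vision tasks.	large language model
8	Membership Inference Attacks Cannot Prove that a Model Was Trained On Your Data	Argues that membership inference attacks cannot reliably prove that a model was trained on specific data.	foundation model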

🔬 Pillar 1: Robot Control (1 paper)

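#	Title	One-sentence summary	Tags
9	Unifying back-propagation and forward-forward algorithms through model predictive control	Proposes a unified framework based on model predictive control that integrates the back-propagation and forward-forward algorithms.	MPC model predictive control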
