cs.LG (2024-06-05)

📊 12 papers total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (6 🔗2) · Pillar 2: RL & Architecture (5) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (6 papers)

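#	Title	One-line summary	Tags	🔗	⭐
1	Pre-trained Large Language Models Use Fourier Features to Compute Addition	Reveals the mechanism by which pre-trained large language models use Fourier features to perform addition.	large language model
2	Filtered not Mixed: Stochastic Filtering-Based Online Gating for Mixture of Large Language Models	Proposes MoE-F, a stochastic-filtering-based online gating mechanism for mixtures of large language models that improves time-series prediction accuracy.	large language model
3	Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for Large Language Models	Pruner-Zero: evolves symbolic pruning metrics for large language models from scratch, without human intervention.	large language model	✅
4	Does your data spark joy? Performance gains from domain upsampling at the end of training	Proposes upsampling domain-specific data at the end of training to improve large language model performance on specific tasks.	large language model
5	Distributional Associations vs In-Context Reasoning: A Study of Feed-forward and Attention Layers	Studies how the feed-forward and attention layers of a Transformer differ in their roles in knowledge storage and reasoning.	large language model
6	PrE-Text: Training Language Models on Private Federated Data in the Age of LLMs	PrE-Text: a method for training LLMs in federated learning using differentially private synthetic data.	large language model	✅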

🔬 Pillar 2: RL & Architecture (5 papers)

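#	Title	One-line summary	Tags	🔗	⭐
7	Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms	Reveals the reward model overoptimization problem in direct alignment algorithms and how it scales.	reinforcement learning RLHF direct preference optimization
8	Inductive Generalization in Reinforcement Learning from Specifications	Proposes a novel inductive generalization framework for reinforcement learning from logical specifications.	reinforcement learning
9	Unified PAC-Bayesian Study of Pessimism for Offline Policy Learning with Regularized Importance Sampling	Proposes a unified PAC-Bayesian framework for analyzing pessimistic algorithms in offline policy learning with regularized importance sampling.	policy learning
10	Revisiting Scalable Hessian Diagonal Approximations for Applications in Reinforcement Learning	HesScale: an efficient, scalable Hessian diagonal approximation method that improves reinforcement learning performance.	reinforcement learning
11	Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning	Proposes a quantization-based method for fine-grained causal dynamics learning that improves robustness in reinforcement learning.	reinforcement learning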

🔬 Pillar 1: Robot Control (1 paper)

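#	Title	One-line summary	Tags	🔗	⭐
12	Quantifying Task Priority for Multi-Task Optimization	Proposes a connection-strength-based optimization method for multi-task learning that addresses negative transfer between tasks.	manipulation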
