cs.LG (2024-06-05)

📊 12 papers in total | 🔗 2 with code

🎯 Navigation by interest area

Pillar 9: Embodied Foundation Models (6, 🔗 2) · Pillar 2: RL Algorithms & Architecture (5) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (6 papers)

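| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 1 | Pre-trained Large Language Models Use Fourier Features to Compute Addition | Reveals the mechanism by which pre-trained large language models use Fourier features to perform addition | large language model |
| 2 | Filtered not Mixed: Stochastic Filtering-Based Online Gating for Mixture of Large Language Models | Proposes MoE-F, a stochastic-filtering-based online gating method for mixing large language models that improves time-series prediction accuracy | large language model |
| 3 | Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for Large Language Models | Pruner-Zero evolves symbolic pruning metrics for large language models from scratch, with no human intervention | large language model |
| 4 | Does your data spark joy? Performance gains from domain upsampling at the end of training | Proposes upsampling domain-specific data at the end of training to improve large language model performance on specific tasks | large language model |
| 5 | Distributional Associations vs In-Context Reasoning: A Study of Feed-forward and Attention Layers | Studies how feed-forward and attention layers in Transformers differ in their roles in knowledge storage and reasoning | large language model |
| 6 | PrE-Text: Training Language Models on Private Federated Data in the Age of LLMs | PrE-Text trains LLMs in federated learning settings using differentially private synthetic data | large language model |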

🔬 Pillar 2: RL Algorithms & Architecture (5 papers)

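| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 7 | Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms | Reveals the reward model overoptimization problem in direct alignment algorithms and how it scales | reinforcement learning, RLHF, direct preference optimization |
| 8 | Inductive Generalization in Reinforcement Learning from Specifications | Proposes a novel inductive generalization framework for reinforcement learning from logical specifications | reinforcement learning |
| 9 | Unified PAC-Bayesian Study of Pessimism for Offline Policy Learning with Regularized Importance Sampling | Proposes a unified PAC-Bayes framework for analyzing pessimistic algorithms in offline policy learning with regularized importance sampling | policy learning |
| 10 | Revisiting Scalable Hessian Diagonal Approximations for Applications in Reinforcement Learning | HesScale is an efficient, scalable Hessian diagonal approximation method that improves reinforcement learning performance | reinforcement learning |
| 11 | Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning | Proposes a quantization-based method for fine-grained causal dynamics learning that improves the robustness of reinforcement learning | reinforcement learning |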

🔬 Pillar 1: Robot Control (1 paper)

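| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 12 | Quantifying Task Priority for Multi-Task Optimization | Proposes a multi-task learning optimization method based on connection strength to address negative transfer between tasks | manipulation |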
