cs.LG(2024-06-05)
📊 12 papers in total | 🔗 2 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (6 🔗2)
Pillar 2: RL & Architecture (5)
Pillar 1: Robot Control (1)
🔬 Pillar 9: Embodied Foundation Models (6 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
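| 1 | Pre-trained Large Language Models Use Fourier Features to Compute Addition | Reveals the mechanism by which pre-trained large language models use Fourier features to perform addition | large language model | | |
| 2 | Filtered not Mixed: Stochastic Filtering-Based Online Gating for Mixture of Large Language Models | Proposes MoE-F, a stochastic-filtering-based online gating mechanism for mixing large language models that improves time-series prediction accuracy | large language model | | |
| 3 | Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for Large Language Models | Pruner-Zero: evolves symbolic pruning metrics for large language models from scratch, without human intervention | large language model | ✅ | |
| 4 | Does your data spark joy? Performance gains from domain upsampling at the end of training | Proposes upsampling domain-specific data at the end of training to improve large language model performance on specific tasks | large language model | | |
| 5 | Distributional Associations vs In-Context Reasoning: A Study of Feed-forward and Attention Layers | Studies how feed-forward and attention layers in Transformers differ in their roles in knowledge storage and reasoning | large language model | | |
| 6 | PrE-Text: Training Language Models on Private Federated Data in the Age of LLMs | PrE-Text: a method for training LLMs in federated learning settings using differentially private synthetic data | large language model | ✅ | |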
🔬 Pillar 2: RL & Architecture (5 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
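| 7 | Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms | Reveals the reward model overoptimization problem in direct alignment algorithms and how it scales | reinforcement learning, RLHF, direct preference optimization | | |
| 8 | Inductive Generalization in Reinforcement Learning from Specifications | Proposes a novel inductive generalization framework for reinforcement learning from logical specifications | reinforcement learning | | |
| 9 | Unified PAC-Bayesian Study of Pessimism for Offline Policy Learning with Regularized Importance Sampling | Proposes a unified PAC-Bayesian framework for analyzing pessimistic algorithms in offline policy learning with regularized importance sampling | policy learning | | |
| 10 | Revisiting Scalable Hessian Diagonal Approximations for Applications in Reinforcement Learning | HesScale: an efficient, scalable Hessian diagonal approximation method that improves reinforcement learning performance | reinforcement learning | | |
| 11 | Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning | Proposes a quantization-based method for fine-grained causal dynamics learning that improves robustness in reinforcement learning | reinforcement learning | | |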
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
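| 12 | Quantifying Task Priority for Multi-Task Optimization | Proposes a connection-strength-based optimization method for multi-task learning that addresses negative transfer between tasks | manipulation | | |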