cs.LG(2024-09-18)
📊 13 papers in total | 🔗 2 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (7 🔗1)
Pillar 2: RL Algorithms & Architecture (5 🔗1)
Pillar 8: Physics-based Animation (1)
🔬 Pillar 9: Embodied Foundation Models (7 papers)
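| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Fine-Tuning a Time Series Foundation Model with Wasserstein Loss | Proposes fine-tuning a time series foundation model with a Wasserstein loss, significantly improving point-estimate accuracy. | large language model, foundation model | | |
| 2 | User-friendly Foundation Model Adapters for Multivariate Time Series Classification | Proposes lightweight adapters for multivariate time series classification, making foundation models easier to use. | foundation model | | |
| 3 | Extracting Memorized Training Data via Decomposition | Proposes a decomposition-based querying method for extracting memorized training data from large language models, revealing potential security risks. | large language model | | |
| 4 | Combining LLM Code Generation with Formal Specifications and Reactive Program Synthesis | Combines LLM code generation with formal specifications and reactive program synthesis to improve code generation quality for complex systems. | large language model | | |
| 5 | All-in-one foundational models learning across quantum chemical levels | Proposes the AIO-ANI model, which unifies machine-learning potential modeling across quantum chemical levels. | multimodal | ✅ | |
| 6 | Less Memory Means smaller GPUs: Backpropagation with Compressed Activations | Proposes backpropagation with compressed activations to reduce GPU memory usage, enabling deep learning training on smaller GPUs. | large language model | | |
| 7 | Mixture of Diverse Size Experts | Proposes MoDSE, an MoE architecture that mixes experts of different sizes, to improve LLM performance. | large language model | | |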
🔬 Pillar 2: RL Algorithms & Architecture (5 papers)
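| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 8 | Reward-Robust RLHF in LLMs | Proposes the Reward-Robust RLHF framework, improving the stability and performance of LLM alignment under imperfect reward models. | reinforcement learning, RLHF, large language model | | |
| 9 | Handling Long-Term Safety and Uncertainty in Safe Reinforcement Learning | Proposes an extension of ATACOM based on learnable constraints to address long-term safety and uncertainty in safe reinforcement learning. | reinforcement learning | | |
| 10 | Putting Data at the Centre of Offline Multi-Agent Reinforcement Learning | Presents an approach to offline multi-agent reinforcement learning that focuses on data quality, improving algorithm generalization. | reinforcement learning | | |
| 11 | Reinforcement Learning as an Improvement Heuristic for Real-World Production Scheduling | Proposes a reinforcement-learning-based improvement heuristic for multi-objective optimization in real-world production scheduling. | reinforcement learning | | |
| 12 | HARP: Human-Assisted Regrouping with Permutation Invariant Critic for Multi-Agent Reinforcement Learning | Proposes the HARP framework, which uses human-assisted regrouping to address grouping tasks in multi-agent reinforcement learning. | reinforcement learning | ✅ | |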
🔬 Pillar 8: Physics-based Animation (1 paper)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 13 | SplitVAEs: Decentralized scenario generation from siloed data for stochastic optimization problems | Proposes SplitVAEs to address decentralized scenario generation for stochastic optimization problems when data is siloed. | spatiotemporal | | |