cs.LG(2024-09-18)

📊 13 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (7 🔗1) · Pillar 2: RL Algorithms & Architecture (5 🔗1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (7 papers)

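| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | Fine-Tuning a Time Series Foundation Model with Wasserstein Loss | Proposes fine-tuning a time series foundation model with a Wasserstein loss, significantly improving point-estimation accuracy | large language model, foundation model | |
| 2 | User-friendly Foundation Model Adapters for Multivariate Time Series Classification | Proposes lightweight adapters for multivariate time series classification, making foundation models easier to use | foundation model | |
| 3 | Extracting Memorized Training Data via Decomposition | Proposes a decomposition-based querying method that extracts memorized training data from large language models, revealing potential security risks | large language model | |
| 4 | Combining LLM Code Generation with Formal Specifications and Reactive Program Synthesis | Combines LLM code generation with formal specifications and reactive program synthesis to improve code generation quality for complex systems | large language model | |
| 5 | All-in-one foundational models learning across quantum chemical levels | Proposes the AIO-ANI model, a unified machine learning potential across quantum chemical levels | multimodal | |
| 6 | Less Memory Means smaller GPUs: Backpropagation with Compressed Activations | Proposes backpropagation with compressed activations, reducing GPU memory usage and enabling deep learning training on smaller GPUs | large language model | |
| 7 | Mixture of Diverse Size Experts | Proposes MoDSE, an MoE architecture that mixes experts of different sizes, to improve LLM performance | large language model | |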

🔬 Pillar 2: RL Algorithms & Architecture (5 papers)

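| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 8 | Reward-Robust RLHF in LLMs | Proposes the Reward-Robust RLHF framework, improving LLM alignment stability and performance under imperfect reward models | reinforcement learning, RLHF, large language model | |
| 9 | Handling Long-Term Safety and Uncertainty in Safe Reinforcement Learning | Proposes an extension of ATACOM based on learnable constraints to address long-term safety and uncertainty in safe reinforcement learning | reinforcement learning | |
| 10 | Putting Data at the Centre of Offline Multi-Agent Reinforcement Learning | A data-quality-focused approach to offline multi-agent reinforcement learning that improves algorithm generalization | reinforcement learning | |
| 11 | Reinforcement Learning as an Improvement Heuristic for Real-World Production Scheduling | Proposes a reinforcement-learning-based improvement heuristic for multi-objective optimization in real-world production scheduling | reinforcement learning | |
| 12 | HARP: Human-Assisted Regrouping with Permutation Invariant Critic for Multi-Agent Reinforcement Learning | Proposes the HARP framework, which addresses grouping tasks in multi-agent reinforcement learning through human-assisted regrouping | reinforcement learning | |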

🔬 Pillar 8: Physics-based Animation (1 paper)

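| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 13 | SplitVAEs: Decentralized scenario generation from siloed data for stochastic optimization problems | Proposes SplitVAEs for decentralized scenario generation in stochastic optimization problems with siloed data | spatiotemporal | |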
