cs.LG (2026-02-08)

📊 21 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (10 🔗2) · Pillar 9: Embodied Foundation Models (9 🔗2) · Pillar 1: Robot Control (1) · Pillar 8: Physics-based Animation (1 🔗1)

🔬 Pillar 2: RL & Architecture (10 papers)

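| # | Title | Key point | Tags |
|---|-------|-----------|------|
| 1 | Efficient Anti-exploration via VQVAE and Fuzzy Clustering in Offline Reinforcement Learning | Proposes an anti-exploration method for offline reinforcement learning based on VQVAE and fuzzy clustering, improving efficiency and performance. | reinforcement learning, policy learning, offline RL |
| 2 | Epigraph-Guided Flow Matching for Safe and Performant Offline Reinforcement Learning | EpiFlow: epigraph-guided flow matching for safe and performant offline reinforcement learning. | reinforcement learning, offline RL, offline reinforcement learning |
| 3 | MARTI-MARS$^2$: Scaling Multi-Agent Self-Search via Reinforcement Learning for Code Generation | Proposes MARTI-MARS$^2$, which scales multi-agent self-search via reinforcement learning to improve code generation. | reinforcement learning, policy learning, large language model |
| 4 | Horizon Imagination: Efficient On-Policy Rollout in Diffusion World Models | Proposes Horizon Imagination to accelerate on-policy rollouts of diffusion world models in reinforcement learning. | reinforcement learning, world model |
| 5 | Safety Alignment as Continual Learning: Mitigating the Alignment Tax via Orthogonal Gradient Projection | Proposes OGPSA, which mitigates capability forgetting during safety alignment via orthogonal gradient projection. | DPO, direct preference optimization, large language model |
| 6 | Preference Conditioned Multi-Objective Reinforcement Learning: Decomposed, Diversity-Driven Policy Optimization | D³PO: a decomposed, diversity-driven method for preference-conditioned multi-objective reinforcement learning. | reinforcement learning, PPO |
| 7 | When Is Compositional Reasoning Learnable from Verifiable Rewards? | Learning compositional reasoning from verifiable rewards: the task advantage ratio is the key quantity. | reinforcement learning, large language model |
| 8 | A Kinetic-Energy Perspective of Flow Matching | Proposes a kinetic-energy perspective on flow matching that improves generative model quality and reduces memorization. | flow matching |
| 9 | Interpretable Analytic Calabi-Yau Metrics via Symbolic Distillation | Obtains interpretable analytic Calabi-Yau metrics via symbolic distillation. | distillation |
| 10 | rePIRL: Learn PRM with Inverse RL for LLM Reasoning | rePIRL: learns a process reward model (PRM) for LLM reasoning via inverse reinforcement learning. | reinforcement learning, deep reinforcement learning |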

🔬 Pillar 9: Embodied Foundation Models (9 papers)

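| # | Title | Key point | Tags |
|---|-------|-----------|------|
| 11 | TerraBind: Fast and Accurate Binding Affinity Prediction through Coarse Structural Representations | TerraBind: fast and accurate protein-ligand binding affinity prediction via coarse-grained structural representations. | foundation model, multimodal |
| 12 | Multimodal normative modeling in Alzheimers Disease with introspective variational autoencoders | Proposes mmSIVAE, which improves reference-distribution fitting and modality fusion in multimodal normative modeling of Alzheimer's disease. | multimodal |
| 13 | Adaptive Acquisition Selection for Bayesian Optimization with Large Language Models | Proposes LMABO, which uses large language models to adaptively select the acquisition function in Bayesian optimization. | large language model |
| 14 | CausalTAD: Injecting Causal Knowledge into Large Language Models for Tabular Anomaly Detection | CausalTAD: injects causal knowledge into large language models for tabular anomaly detection. | large language model |
| 15 | The Rise of Sparse Mixture-of-Experts: A Survey from Algorithmic Foundations to Decentralized Architectures and Vertical Domain Applications | A survey of sparse mixture-of-experts models, from algorithmic foundations to decentralized architectures and vertical-domain applications. | large language model, multimodal |
| 16 | Efficient and Adaptable Detection of Malicious LLM Prompts via Bootstrap Aggregation | Proposes BAGEL, which efficiently detects malicious LLM prompts via bootstrap aggregation. | large language model |
| 17 | Implicit Strategic Optimization: Rethinking Long-Horizon Decision-Making in Adversarial Poker Environments | Proposes Implicit Strategic Optimization (ISO) to address strategic externalities in long-horizon decision-making in adversarial poker environments. | large language model |
| 18 | From $O(mn)$ to $O(r^2)$: Two-Sided Low-Rank Communication for Adam in Distributed Training with Memory Efficiency | Proposes TSR-Adam, which significantly reduces communication overhead in distributed training via two-sided low-rank communication. | foundation model |
| 19 | CausalArmor: Efficient Indirect Prompt Injection Guardrails via Causal Attribution | CausalArmor: efficient defense against indirect prompt injection via causal attribution. | chain-of-thought |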

🔬 Pillar 1: Robot Control (1 paper)

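| # | Title | Key point | Tags |
|---|-------|-----------|------|
| 20 | FIRE: Frobenius-Isometry Reinitialization for Balancing the Stability-Plasticity Tradeoff | Proposes FIRE, which balances the stability-plasticity tradeoff via Frobenius-Isometry reinitialization. | humanoid, reinforcement learning, SAC |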

🔬 Pillar 8: Physics-based Animation (1 paper)

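| # | Title | Key point | Tags |
|---|-------|-----------|------|
| 21 | TAAM: Inductive Graph-Class Incremental Learning with Task-Aware Adaptive Modulation | Proposes TAAM, which addresses catastrophic forgetting in graph class-incremental learning via task-aware adaptive modulation. | AMP |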
