cs.LG (2026-02-22)

📊 13 papers in total

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (7) · Pillar 9: Embodied Foundation Models (5) · Pillar 1: Robot Control (1)

🔬 Pillar 2: RL Algorithms & Architecture (7 papers)

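| # | Title | One-sentence summary | Tags |
|---|---|---|---|
| 1 | Stable Deep Reinforcement Learning via Isotropic Gaussian Representations | Proposes a stable deep reinforcement learning method based on isotropic Gaussian representations, improving performance in non-stationary environments. | reinforcement learning, deep reinforcement learning |
| 2 | AdsorbFlow: energy-conditioned flow matching enables fast and realistic adsorbate placement | AdsorbFlow: energy-conditioned flow matching for fast and realistic adsorbate placement. | flow matching, classifier-free guidance |
| 3 | LLMs Can Learn to Reason Via Off-Policy RL | Proposes the OAPL algorithm to address the mismatch between training and inference policies in off-policy RL for LLMs. | reinforcement learning, PPO, large language model |
| 4 | Soft Sequence Policy Optimization: Bridging GMPO and SAPO | Proposes Soft Sequence Policy Optimization to address instability in policy training. | reinforcement learning, PPO, large language model |
| 5 | How to Allocate, How to Learn? Dynamic Rollout Allocation and Advantage Modulation for Policy Optimization | DynaMO: a policy optimization framework for LLM reasoning that optimizes rollout allocation and advantage modulation. | reinforcement learning, large language model |
| 6 | Pushing the Limits of Inverse Lithography with Generative Reinforcement Learning | Proposes an inverse lithography method based on generative reinforcement learning that overcomes the local-optimum limitations of conventional ILT. | reinforcement learning |
| 7 | Learning to Detect Language Model Training Data via Active Reconstruction | Proposes an active data reconstruction attack for detecting LLM training data. | reinforcement learning, distillation |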

🔬 Pillar 9: Embodied Foundation Models (5 papers)

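| # | Title | One-sentence summary | Tags |
|---|---|---|---|
| 8 | TimeRadar: A Domain-Rotatable Foundation Model for Time Series Anomaly Detection | TimeRadar: a domain-rotatable foundation model for time series anomaly detection. | foundation model |
| 9 | Back to Blackwell: Closing the Loop on Intransitivity in Multi-Objective Preference Fine-Tuning | Proposes the PROSPER algorithm to address intransitivity in multi-objective preference fine-tuning. | large language model, instruction following |
| 10 | Smooth Gate Functions for Soft Advantage Policy Optimization | Proposes smooth gate functions to improve Soft Advantage Policy Optimization, boosting LLM mathematical reasoning. | large language model |
| 11 | Attention Deficits in Language Models: Causal Explanations for Procedural Hallucinations | Reveals procedural hallucinations in language models: attention deficits cause results to be forgotten after reasoning. | large language model |
| 12 | Understanding Empirical Unlearning with Combinatorial Interpretability | Uses combinatorial interpretability to understand residual knowledge in empirical model unlearning. | foundation model |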

🔬 Pillar 1: Robot Control (1 paper)

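| # | Title | One-sentence summary | Tags |
|---|---|---|---|
| 13 | An Interpretable Data-Driven Model of the Flight Dynamics of Hawks | Proposes an interpretable data-driven model of hawk flight dynamics based on dynamic mode decomposition. | locomotion |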
