cs.LG (2026-02-20)

📊 20 papers in total | 🔗 2 with code

🎯 Interest Areas

Pillar 2: RL Algorithms and Architecture (10, 🔗 1) · Pillar 9: Embodied Foundation Models (7, 🔗 1) · Pillar 1: Robot Control (2) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL Algorithms and Architecture (10 papers)

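| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Flow Matching with Injected Noise for Offline-to-Online Reinforcement Learning | Proposes FINO, which uses flow matching with injected noise to improve sample efficiency in offline-to-online reinforcement learning. | reinforcement learning, offline RL, flow matching | |
| 2 | Deep Reinforcement Learning for Optimizing Energy Consumption in Smart Grid Systems | Uses physics-informed neural networks to accelerate deep reinforcement learning for energy optimization in smart grids. | reinforcement learning, deep reinforcement learning, policy learning | |
| 3 | Flow Actor-Critic for Offline Reinforcement Learning | Proposes Flow Actor-Critic, which uses flow models to handle complex data distributions in offline reinforcement learning. | reinforcement learning, offline RL, offline reinforcement learning | |
| 4 | Memory-Based Advantage Shaping for LLM-Guided Reinforcement Learning | Proposes memory-based advantage shaping to improve the sample efficiency of LLM-guided reinforcement learning. | reinforcement learning, large language model | |
| 5 | MIRA: Memory-Integrated Reinforcement Learning Agent with Limited LLM Guidance | MIRA: a memory-integrated reinforcement learning agent that uses limited LLM guidance to address sparse rewards. | reinforcement learning, large language model | |
| 6 | Gradient Regularization Prevents Reward Hacking in Reinforcement Learning from Human Feedback and Verifiable Rewards | Proposes gradient regularization to address reward hacking in RLHF and RLVR. | reinforcement learning, RLHF | |
| 7 | Learning Invariant Visual Representations for Planning with Joint-Embedding Predictive World Models | Proposes a bisimulation-based joint-embedding predictive world model that makes planning more robust to visual distractors. | world model | |
| 8 | On the Semantic and Syntactic Information Encoded in Proto-Tokens for One-Step Text Reconstruction | Studies the semantic and syntactic information encoded in proto-tokens, exploring a non-autoregressive path to one-step text reconstruction. | distillation, large language model | |
| 9 | Balancing Symmetry and Efficiency in Graph Flow Matching | Proposes a controllable symmetry modulation scheme that balances symmetry and efficiency in graph generative models. | flow matching | |
| 10 | Learning Optimal and Sample-Efficient Decision Policies with Guarantees | Proposes a sample-efficient reinforcement learning method for policy learning with guarantees, aimed at high-stakes decision-making. | reinforcement learning, imitation learning | |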

🔬 Pillar 9: Embodied Foundation Models (7 papers)

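| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 11 | Analyzing and Improving Chain-of-Thought Monitorability Through Information Theory | Analyzes and improves chain-of-thought monitorability through information theory to improve LLM safety. | chain-of-thought | |
| 12 | Non-Interfering Weight Fields: Treating Model Parameters as a Continuously Extensible Function | Proposes Non-Interfering Weight Fields (NIWF) to address catastrophic forgetting in large models. | large language model, instruction following | |
| 13 | MapTab: Can MLLMs Master Constrained Route Planning? | MapTab: evaluates the route-planning ability of multimodal large language models under constraints. | large language model, multimodal | |
| 14 | Continual-NExT: A Unified Comprehension And Generation Continual Learning Framework | Proposes the Continual-NExT framework to address continual learning for multimodal large language models. | large language model, multimodal | |
| 15 | Large Causal Models for Temporal Causal Discovery | Proposes large causal models for temporal causal discovery, improving generalization and scalability. | foundation model | |
| 16 | Quantum Maximum Likelihood Prediction via Hilbert Space Embeddings | Proposes a quantum maximum likelihood prediction framework based on Hilbert space embeddings for the unified treatment of classical and quantum LLMs. | large language model | |
| 17 | [Re] Benchmarking LLM Capabilities in Negotiation through Scoreable Games | Reproduces and extends an LLM negotiation benchmark, revealing challenges to the objectivity of model evaluation. | large language model | |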

🔬 Pillar 1: Robot Control (2 papers)

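| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 18 | Whole-Brain Connectomic Graph Model Enables Whole-Body Locomotion Control in Fruit Fly | Proposes FlyGM, an embodied locomotion control method based on a whole-brain connectome graph model of the fruit fly. | locomotion, reinforcement learning | |
| 19 | Online decoding of rat self-paced locomotion speed from EEG using recurrent neural networks | Uses recurrent neural networks to decode rat self-paced locomotion speed online from EEG signals. | locomotion | |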

🔬 Pillar 8: Physics-based Animation (1 paper)

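| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 20 | Stable Long-Horizon Spatiotemporal Prediction on Meshes Using Latent Multiscale Recurrent Graph Neural Networks | Proposes a stable long-horizon prediction method based on latent multiscale recurrent graph neural networks. | spatiotemporal | |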
