cs.LG(2026-02-20)
📊 20 papers total | 🔗 2 with code
🎯 Interest Area Navigation
Pillar 2: RL & Architecture (10 🔗1)
Pillar 9: Embodied Foundation Models (7 🔗1)
Pillar 1: Robot Control (2)
Pillar 8: Physics-based Animation (1)
🔬 Pillar 2: RL & Architecture (10 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Flow Matching with Injected Noise for Offline-to-Online Reinforcement Learning | Proposes FINO, which uses noise-injected flow matching to improve sample efficiency in offline-to-online RL | reinforcement learning, offline RL, flow matching | | |
| 2 | Deep Reinforcement Learning for Optimizing Energy Consumption in Smart Grid Systems | Uses physics-informed neural networks to accelerate deep RL for energy optimization in smart grids | reinforcement learning, deep reinforcement learning, policy learning | | |
| 3 | Flow Actor-Critic for Offline Reinforcement Learning | Proposes Flow Actor-Critic, using flow models to handle the complex data distributions of offline RL | reinforcement learning, offline RL, offline reinforcement learning | | |
| 4 | Memory-Based Advantage Shaping for LLM-Guided Reinforcement Learning | Proposes memory-based advantage shaping to improve the sample efficiency of LLM-guided RL | reinforcement learning, large language model | | |
| 5 | MIRA: Memory-Integrated Reinforcement Learning Agent with Limited LLM Guidance | MIRA: a memory-integrated RL agent with limited LLM guidance, addressing sparse rewards | reinforcement learning, large language model | ✅ | |
| 6 | Gradient Regularization Prevents Reward Hacking in Reinforcement Learning from Human Feedback and Verifiable Rewards | Proposes gradient regularization to prevent reward hacking in RLHF and RLVR | reinforcement learning, RLHF | | |
| 7 | Learning Invariant Visual Representations for Planning with Joint-Embedding Predictive World Models | Proposes a bisimulation-based joint-embedding predictive world model, making planning robust to visual distractors | world model | | |
| 8 | On the Semantic and Syntactic Information Encoded in Proto-Tokens for One-Step Text Reconstruction | Studies the semantic and syntactic information encoded in proto-tokens, exploring a non-autoregressive route to one-step text reconstruction | distillation, large language model | | |
| 9 | Balancing Symmetry and Efficiency in Graph Flow Matching | Proposes a controllable symmetry-modulation scheme that balances symmetry and efficiency in graph generative models | flow matching | | |
| 10 | Learning Optimal and Sample-Efficient Decision Policies with Guarantees | Proposes a sample-efficient RL policy-learning method with guarantees for high-stakes decision-making | reinforcement learning, imitation learning | | |
🔬 Pillar 9: Embodied Foundation Models (7 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 11 | Analyzing and Improving Chain-of-Thought Monitorability Through Information Theory | Analyzes and improves chain-of-thought monitorability through information theory, enhancing LLM safety | chain-of-thought | | |
| 12 | Non-Interfering Weight Fields: Treating Model Parameters as a Continuously Extensible Function | Proposes Non-Interfering Weight Fields (NIWF) to address catastrophic forgetting in large models | large language model, instruction following | | |
| 13 | MapTab: Can MLLMs Master Constrained Route Planning? | MapTab: evaluates multimodal LLMs on route planning under constraints | large language model, multimodal | | |
| 14 | Continual-NExT: A Unified Comprehension And Generation Continual Learning Framework | Proposes the Continual-NExT framework to tackle continual learning for multimodal LLMs | large language model, multimodal | | |
| 15 | Large Causal Models for Temporal Causal Discovery | Proposes large causal models for temporal causal discovery, improving generalization and scalability | foundation model | ✅ | |
| 16 | Quantum Maximum Likelihood Prediction via Hilbert Space Embeddings | Proposes a quantum maximum-likelihood prediction framework based on Hilbert-space embeddings, treating classical and quantum LLMs uniformly | large language model | | |
| 17 | [Re] Benchmarking LLM Capabilities in Negotiation through Scoreable Games | Reproduces and extends an LLM negotiation benchmark, revealing objectivity challenges in model evaluation | large language model | | |
🔬 Pillar 1: Robot Control (2 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 18 | Whole-Brain Connectomic Graph Model Enables Whole-Body Locomotion Control in Fruit Fly | Proposes FlyGM: embodied locomotion control based on a whole-brain connectomic graph model of the fruit fly | locomotion, reinforcement learning | | |
| 19 | Online decoding of rat self-paced locomotion speed from EEG using recurrent neural networks | Uses recurrent neural networks to decode rat self-paced locomotion speed online from EEG signals | locomotion | | |
🔬 Pillar 8: Physics-based Animation (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 20 | Stable Long-Horizon Spatiotemporal Prediction on Meshes Using Latent Multiscale Recurrent Graph Neural Networks | Proposes a stable long-horizon prediction method based on latent multiscale recurrent graph neural networks | spatiotemporal | | |