cs.LG (2026-05-01)

📊 24 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (10 🔗1) · Pillar 9: Embodied Foundation Models (10 🔗1) · Pillar 1: Robot Control (2) · Pillar 3: Perception & Semantics (1) · Pillar 4: Generative Motion (1)

🔬 Pillar 2: RL & Architecture (10 papers)

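# | Title | One-sentence summary | Tags | 🔗
1 | AlphaInventory: Evolving White-Box Inventory Policies via Large Language Models with Deployment Guarantees | AlphaInventory: uses large language models to evolve white-box inventory policies with deployment guarantees | reinforcement learning, large language model
2 | A Policy-Driven DRL Framework for System-Level Tradeoff Control in NR-U/Wi-Fi Coexistence | Proposes a policy-driven DRL framework for system-level tradeoff control in NR-U/Wi-Fi coexistence | reinforcement learning, deep reinforcement learning, DRL
3 | ResRL: Boosting LLM Reasoning via Negative Sample Projection Residual Reinforcement Learning | ResRL: improves LLM reasoning via negative sample projection residual reinforcement learning | reinforcement learning, large language model
4 | Augmented Lagrangian Multiplier Network for State-wise Safety in Reinforcement Learning | Proposes the Augmented Lagrangian Multiplier Network (ALaM) to address training instability under state-wise safety constraints in reinforcement learning | reinforcement learning, SAC
5 | Odysseus: Scaling VLMs to 100+ Turn Decision-Making in Games via Reinforcement Learning | Odysseus: scales vision-language models to 100+ turn decision-making in games via reinforcement learning | reinforcement learning, PPO
6 | Reinforcement Learning with Markov Risk Measures and Multipattern Risk Approximation | Proposes mini-batch risk measures for solving risk-averse Markov decision problems | reinforcement learning
7 | Model-Based Reinforcement Learning with Double Oracle Efficiency in Policy Optimization and Offline Estimation | Proposes a doubly oracle-efficient reinforcement learning algorithm to address computational bottlenecks in large-scale environments | reinforcement learning
8 | Binomial flows: Denoising and flow matching for discrete ordinal data | Proposes binomial flow models for denoising and flow matching on discrete ordinal data | flow matching
9 | Free Energy Surface Sampling via Reduced Flow Matching | Proposes FES-FM, which samples free energy surfaces efficiently via reduced flow matching | flow matching
10 | SAVGO: Learning State-Action Value Geometry with Cosine Similarity for Continuous Control | SAVGO: learns state-action value geometry with cosine similarity for continuous control | reinforcement learning, representation learning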

🔬 Pillar 9: Embodied Foundation Models (10 papers)

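# | Title | One-sentence summary | Tags | 🔗
11 | Learning Multimodal Energy-Based Model with Multimodal Variational Auto-Encoder via MCMC Revision | Proposes a multimodal energy-based model learned with a multimodal variational auto-encoder via MCMC revision, improving multimodal data generation quality | multimodal
12 | Hypergraph and Latent ODE Learning for Multimodal Root Cause Localization in Microservices | HyperODE RCA: a root cause localization method for microservices that combines hypergraphs, latent ODEs, and multimodal fusion | multimodal
13 | Tempus: A Temporally Scalable Resource-Invariant GEMM Streaming Framework for Versal AI Edge | Tempus: a temporally scalable, resource-invariant GEMM streaming framework for Versal AI Edge | large language model
14 | Generating Statistical Charts with Validation-Driven LLM Workflows | Proposes validation-driven LLM workflows for generating high-quality statistical charts and building a chart question-answering dataset | multimodal
15 | RunAgent: Interpreting Natural-Language Plans with Constraint-Guided Execution | RunAgent: a framework that interprets natural-language plans with constraint-guided execution, improving workflow execution reliability | large language model
16 | Evaluating the Architectural Reasoning Capabilities of LLM Provers via the Obfuscated Natural Number Game | Proposes a method for evaluating architectural reasoning capabilities to address uncertainty about the reasoning abilities of large language models | large language model
17 | Stable-GFlowNet: Toward Diverse and Robust LLM Red-Teaming via Contrastive Trajectory Balance | Proposes Stable-GFlowNet, which achieves more stable and diverse LLM red-teaming via contrastive trajectory balance | large language model
18 | BWLA: Breaking the Barrier of W1AX Post-Training Quantization for LLMs | BWLA: breaks the W1AX post-training quantization barrier for LLMs, enabling 1-bit weights and low-bit activations | large language model
19 | Rethinking LLM Ensembling from the Perspective of Mixture Models | Proposes ME, a mixture-model-based LLM ensembling method that significantly improves inference efficiency | large language model
20 | Group Cognition Learning: Making Everything Better Through Governed Two-Stage Agents Collaboration | Proposes Group Cognition Learning (GCL), which improves multimodal fusion performance through governed two-stage agent collaboration | multimodal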

🔬 Pillar 1: Robot Control (2 papers)

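# | Title | One-sentence summary | Tags | 🔗
21 | Borrowed Geometry: Computational Reuse of Frozen Text-Pretrained Transformer Weights Across Modalities | Proposes a method for reusing frozen text-pretrained transformer weights across modalities | manipulation, decision transformer
22 | Meritocratic Fairness in Budgeted Combinatorial Multi-armed Bandits via Shapley Values | Proposes BCMAB-FBF, a fairness algorithm based on K-Shapley values, to address meritocratic fairness in budget-constrained combinatorial multi-armed bandits | dual-arm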

🔬 Pillar 3: Perception & Semantics (1 paper)

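# | Title | One-sentence summary | Tags | 🔗
23 | PILIR: Physics-Informed Local Implicit Representation | Proposes PILIR, which mitigates spectral bias in PINNs through local implicit representations and improves learning of high-frequency details | implicit representation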

🔬 Pillar 4: Generative Motion (1 paper)

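# | Title | One-sentence summary | Tags | 🔗
24 | VQ-SAD: Vector Quantized Structure Aware Diffusion For Molecule Generation | Proposes VQ-SAD, a molecule generation method based on vector-quantized structure-aware diffusion | VQ-VAE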
