cs.LG (2026-05-15)

📊 33 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (17 🔗2) · Pillar 9: Embodied Foundation Models (12 🔗2) · Pillar 1: Robot Control (2) · Pillar 7: Motion Retargeting (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL & Architecture (17 papers)

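| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Learn Where Outcomes Diverge: Efficient VLA RL via Probabilistic Chunk Masking | Proposes Probabilistic Chunk Masking (PCM) to accelerate VLA reinforcement learning and improve the efficiency of gradient computation. | reinforcement learning, world model, world models | |
| 2 | DeltaPrompts: Escaping the Zero-Delta Trap in Multimodal Distillation | Proposes DeltaPrompts, which actively mines high-discrepancy prompts to improve multimodal distillation. | teacher-student, distillation, multimodal | |
| 3 | Mind Dreamer: Untethering Imagination via Active Latent Intervention on Latent Manifolds | Proposes Mind Dreamer to address the history-tethering problem in model-based reinforcement learning. | reinforcement learning, world model, world models | |
| 4 | Offline Reinforcement Learning with Universal Horizon Models | Proposes universal horizon models to address long-horizon prediction in offline reinforcement learning. | reinforcement learning, offline RL, offline reinforcement learning | |
| 5 | Constrained latent state modeling: A unifying perspective on representation learning under competing constraints | Proposes constrained latent state modeling as a unifying perspective on representation learning under competing constraints. | representation learning, multimodal | |
| 6 | AstraFlow: Dataflow-Oriented Reinforcement Learning for Agentic LLMs | AstraFlow: a dataflow-oriented reinforcement learning system for agentic LLMs. | reinforcement learning, large language model | |
| 7 | Multi-Fidelity Flow Matching: Cascaded Refinement of PDE Solutions | Proposes Multi-Fidelity Flow Matching (MFFM) for cascaded refinement of solutions to parametric partial differential equations. | flow matching, spatiotemporal | |
| 8 | parallelcbf: A composable safety-filter and auditability framework for tensor-parallel reinforcement learning | ParallelCBF: a composable safety-filter and auditability framework for tensor-parallel reinforcement learning. | reinforcement learning, behavior cloning | |
| 9 | BAPR: Bayesian amnesic piecewise-robust reinforcement learning for non-stationary continuous control | Proposes BAPR, which combines Bayesian online change detection with robust ensemble reinforcement learning for non-stationary continuous control. | reinforcement learning, SAC | |
| 10 | Looped SSMs: Depth-Recurrence and Input Reshaping for Time Series Classification | Proposes looped state space models to improve time series classification performance. | SSM, state space model | |
| 11 | MIND: Decoupling Model-Induced Label Noise via Latent Manifold Disentanglement | MIND: removes model-induced label noise by disentangling latent manifolds. | distillation, foundation model | |
| 12 | Dynamics-Level Watermarking of Flow Matching Models with Random Codes | Proposes a dynamics-level watermarking method for flow matching models based on random codes, aimed at protecting the copyright of generative models. | flow matching | |
| 13 | A Multi-Layer Cloud-IDS Pipeline with LLM and Adaptive Q-Learning Calibration | Proposes a multi-layer cloud IDS pipeline with LLM and adaptive Q-learning calibration to improve security in cloud environments. | reinforcement learning, large language model | |
| 14 | Tighter Regret Bounds for Contextual Action-Set Reinforcement Learning | Derives tighter regret bounds for contextual action-set reinforcement learning, improving algorithm performance. | reinforcement learning | |
| 15 | Pessimistic Risk-Aware Policy Learning in Contextual Bandits | Proposes a unified distributional framework for optimizing risk-aware offline policy learning. | policy learning | |
| 16 | Ada-Diffuser: Latent-Aware Adaptive Diffusion for Decision-Making | Ada-Diffuser: a latent-aware adaptive diffusion model for decision-making that explicitly models latent dynamics. | policy learning, latent dynamics | |
| 17 | VSPO: Vector-Steered Policy Optimization for Behavioral Control | Proposes VSPO, which achieves behavioral control of language models through vector-steered policy optimization. | distillation, reward shaping | |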

🔬 Pillar 9: Embodied Foundation Models (12 papers)

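| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 18 | GOMA: Toward Structure-Driven Multimodal Alignment from a Graph Signal Smoothing Perspective | GOMA: a graph-optimized multimodal alignment framework that uses graph structure to improve the retrieval performance of frozen multimodal embeddings. | multimodal | |
| 19 | Differentiable Mixture-of-Agents Incentivizes Swarm Intelligence of Large Language Models | Proposes Differentiable Mixture-of-Agents (DMoA) to incentivize the emergence of swarm intelligence in large language models. | large language model | |
| 20 | ITGPT: Generative Pretraining on Irregular Timeseries | ITGPT: a generative pretraining model for irregular time series that requires no resampling or imputation. | large language model, multimodal | |
| 21 | LoCO: Low-rank Compositional Rotation Fine-tuning | Proposes LoCO, which uses low-rank compositional orthogonal fine-tuning to better preserve geometric structure in parameter-efficient fine-tuning. | foundation model | |
| 22 | Entropic Auto-Encoding via Implicit Free-Energy Minimization | Proposes Entropic Autoencoders, which mitigate posterior collapse in VAEs through implicit free-energy minimization. | multimodal | |
| 23 | CHoE: Cross-Domain Heterogeneous Graph Prompt Learning via Structure-Conditioned Experts | Proposes CHoE, which addresses cross-domain heterogeneous graph prompt learning with structure-conditioned expert networks. | foundation model | |
| 24 | AOT-POT: Adaptive Operator Transformation for Large-Scale PDE Pre-training | Proposes AOT-POT, which enables large-scale PDE pre-training through adaptive operator transformation and significantly improves model generalization. | foundation model | |
| 25 | SEED: Targeted Data Selection by Weighted Independent Set | SEED: targeted data selection via weighted independent sets, improving training efficiency and model performance. | multimodal | |
| 26 | IO-SVD: Input-Output Whitened SVD for Adaptive-Rank LLM Compression | Proposes IO-SVD, which achieves adaptive-rank LLM compression through KL-aware two-sided whitened SVD. | large language model | |
| 27 | STS: Efficient Sparse Attention with Speculative Token Sparsity | Proposes STS, an efficient sparse attention mechanism based on speculative token sparsity that accelerates LLM inference. | large language model | |
| 28 | Ghosted Layers: Unconstrained Activation Alignment for Recovering Layer-Pruned LLMs | Proposes Ghosted Layers to address activation alignment after layer pruning. | large language model | |
| 29 | SurvivalPFN: Amortizing Survival Prediction via In-Context Bayesian Inference | SurvivalPFN: generalizable survival prediction through in-context Bayesian inference. | foundation model | |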

🔬 Pillar 1: Robot Control (2 papers)

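| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 30 | A Unified Perturbation Framework for Analyzing Leaderboard Stability and Manipulation | Proposes a unified perturbation framework for analyzing and manipulating the stability and robustness of large language model leaderboards. | manipulation, large language model | |
| 31 | ADAPT: A Self-Calibrating Proactive Autoscaler for Container Orchestration | ADAPT: a self-calibrating proactive autoscaler for container orchestration. | MPC, model predictive control | |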

🔬 Pillar 7: Motion Retargeting (1 paper)

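| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 32 | On the Fragility of Data Attribution When Learning Is Distributed | Proposes a data attribution robustness method to counter attacks in distributed learning. | latent optimization | |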

🔬 Pillar 8: Physics-based Animation (1 paper)

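| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 33 | Perforated Neural Networks for Keyword Spotting | Proposes a keyword spotting method based on perforated neural networks, achieving high accuracy with small models on edge devices. | PULSE | |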
