cs.LG(2026-05-29)

📊 47 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (27 🔗2) · Pillar 9: Embodied Foundation Models (15 🔗1) · Pillar 1: Robot Control (4 🔗1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL & Architecture (27 papers)

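| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 1 | Giving Sensors a Voice: Multimodal JEPA for Semantic Time-Series Embeddings | Proposes CHARM, which uses a multimodal JEPA to learn semantic time-series embeddings and improve modeling of heterogeneous time-series data. | JEPA, Joint-Embedding Predictive Architecture, joint-embedding predictive architecture | |
| 2 | Subspace-Decomposed JEPAs: Disentangling Progression and Content in Latent World Models | Proposes SD-JEPA to separate the encoding of task progression from the encoding of content. | world model, world models, JEPA | |
| 3 | A Lecture Note on Offline RL and IRL, Part II: Foundations of Inverse Reinforcement Learning and Dynamic Discrete Choice Models | Survey of offline RL and inverse RL: unifies dynamic discrete choice models with entropy-regularized inverse reinforcement learning. | reinforcement learning, offline RL, inverse reinforcement learning | |
| 4 | Inverse Reinforcement Learning without an Optimal Demonstrator: A Feasible Reward Set Approach | Proposes a feasible reward set approach to handle imperfect demonstrators in inverse reinforcement learning. | reinforcement learning, inverse reinforcement learning, large language model | |
| 5 | GlucoFM: A Dual-Stream Foundation Model for Continuous Glucose Monitoring | GlucoFM: a dual-stream foundation model for continuous glucose monitoring that improves metabolic prediction performance. | representation learning, foundation model | |
| 6 | Student Capacity Moderates Knowledge Distillation Effectiveness: A Systematic Study Across ResNet Teacher-Student Pairs on CIFAR-10 | Studies how student network capacity affects knowledge distillation for ResNet image classification, showing the importance of capacity matching. | teacher-student, distillation | |
| 7 | Effective Biological Representation Learning by Masking Gene Expression | TxFM: effective biological representation learning by masking gene expression. | representation learning, foundation model | |
| 8 | The Terminal Representation in Reinforcement Learning | Proposes the Terminal Representation (TR), a low-dimensional RL state representation that requires no eigendecomposition. | reinforcement learning, representation learning, reward shaping | |
| 9 | EchoRL: Reinforcement Learning via Rollout Echoing | EchoRL: strengthens reinforcement learning through rollout echoing, addressing reward degradation. | reinforcement learning, large language model | |
| 10 | UniRTL: Unifying Code and Graph for Robust RTL Representation Learning | UniRTL: a robust RTL representation learning framework that unifies code and graph structure. | representation learning, multimodal | |
| 11 | Automating Formal Verification with Reinforcement Learning and Recursive Inference | Automates the generation and proving of formally verified programs using reinforcement learning and recursive inference. | reinforcement learning, large language model | |
| 12 | Why Linear Recurrent Memory Works in Partially Observable Reinforcement Learning | Proposes a linear filter to explain why linear recurrent memory networks are theoretically effective in partially observable reinforcement learning. | reinforcement learning, policy learning | |
| 13 | Multivariate Distributional Reinforcement Learning Using Sliced Divergences | Proposes a multivariate distributional RL method based on sliced divergences, addressing the difficulty of modeling high-dimensional return distributions. | reinforcement learning, DRL | |
| 14 | Convergence of Two-Timescale Markovian Stochastic Approximations with Applications in Reinforcement Learning | Addresses convergence in reinforcement learning via two-timescale Markovian stochastic approximation. | reinforcement learning, policy learning | |
| 15 | The Challenges of Using Reinforcement Learning for Controlling Industrial Energy Systems | Analyzes the challenges of deploying reinforcement learning in practice for industrial energy system control. | reinforcement learning, reward design | |
| 16 | Federated Variational Preference Alignment with Gumbel-Softmax Prior for Personalized User Preferences | Proposes a federated variational preference alignment framework to resolve conflicts among user preferences. | preference learning, RLHF, large language model | |
| 17 | Skill Reuse as Compression in Agentic RL | Proposes ReuseRL, which improves the generalization of agentic RL through skill-reuse compression. | reinforcement learning, large language model | |
| 18 | DRIFT: Decoupled Rollouts and Importance-Weighted Fine-Tuning for Efficient Multi-Turn Optimization | DRIFT: decouples rollouts from importance-weighted fine-tuning to make multi-turn interaction optimization more efficient. | reinforcement learning, large language model | |
| 19 | Generalized Intention Modeling in Multi-Agent Reinforcement Learning | Proposes a task-adaptive hybrid intention modeling framework that improves multi-agent reinforcement learning performance. | reinforcement learning | |
| 20 | Trust-Region Behavior Blending for On-Policy Distillation | Proposes Trust-Region Behavior Blending, which improves early-stage training in on-policy distillation. | distillation | |
| 21 | De-attribute to Forget for LLM Unlearning | Proposes the DareU framework, which achieves effective LLM unlearning via reinforcement learning with data-attribution rewards. | reinforcement learning, large language model | |
| 22 | DARTS: Distribution-Aware Active Rollout Trajectory Shaping for Accelerating LLM Reinforcement Learning | DARTS: accelerates LLM reinforcement learning training through distribution-aware active rollout trajectory shaping. | reinforcement learning | |
| 23 | Efficient and Uncertainty-Aware Diffusion Framework for Offline-to-Online Reinforcement Learning | DUAL: an efficient, uncertainty-aware diffusion framework for offline-to-online reinforcement learning. | reinforcement learning | |
| 24 | When are LLMs Sufficient Policy Optimizers for Sequential RL Tasks? | PromptPO: uses an LLM as a black-box optimizer to solve sequential RL tasks. | reinforcement learning, large language model | |
| 25 | Positional versus Symbolic Attention Heads: Learning Dynamics, RoPE Geometry, and Length Generalization | Studies the learning dynamics of Transformer attention heads, revealing differences in generalization between positional encoding and symbolic reasoning. | world model, world models | |
| 26 | Memory by Design: Probabilistic Sequence Layers | Proposes a model-design framework that derives efficient recurrent sequence maps from explicit memory assumptions. | Mamba, linear attention | |
| 27 | Learning Hyperspherical Time-Frequency Representations for Time-Series Out-of-Distribution Detection | Proposes a time-series out-of-distribution detection method based on hyperspherical time-frequency representations. | representation learning, contrastive learning | |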

🔬 Pillar 9: Embodied Foundation Models (15 papers)

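| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 28 | When Are Multimodal Predictions Biologically Supported? A Diagnostic Evaluation Framework | DECAT: a diagnostic evaluation framework for the biological plausibility of multimodal tumor predictions. | foundation model, multimodal | |
| 29 | Geometry-based Schrödinger Bridges for Trustworthy Multimodal Fusion | Proposes a geometry-based Schrödinger bridge method for multimodal fusion that improves robustness on low-quality data. | multimodal | |
| 30 | Best-Arm Identification-Based Trust Region Selection for Bayesian Optimization on Multimodal Functions | Proposes a Bayesian optimization method with trust region selection based on best-arm identification, for optimizing multimodal functions. | multimodal | |
| 31 | Chain-of-Thought and Compressed Looped Transformers: A Memory-Budget Separation | Compares chain-of-thought with looped Transformers, showing how the memory budget limits reasoning ability. | chain-of-thought | |
| 32 | Diversity Matters: Revisiting Test-Time Compute in Vision-Language Models | Proposes an entropy-based test-time compute method that improves vision-language model ensemble performance. | large language model | |
| 33 | The Dynamic-Probabilistic Consistency Gap in Chaotic Surrogate Modeling | Proposes the KAFFEE framework to address the dynamic-probabilistic consistency gap in surrogate models of chaotic systems. | foundation model | |
| 34 | Assign and Add: A Mechanistic Study of Compositional Arithmetic | Studies the mechanism of compositional generalization in Transformers on variable assignment and modular addition. | large language model | |
| 35 | Balanced LoRA: Removing Parameter Invariance to Accelerate Convergence | Proposes BaLoRA, which removes parameter invariance to accelerate LoRA convergence and improve fine-tuning performance. | large language model | |
| 36 | Spectral Reach: Understanding Neural Scaling as Progress into the Spectral Tail | Proposes a "spectral position" measure, revealing how neural networks learn the spectral tail as training scales. | foundation model | |
| 37 | TabCausal: Pretraining Across Causal Environments for Tabular Causal Discovery | TabCausal: improves tabular causal discovery through pretraining across causal environments. | foundation model | |
| 38 | Free energy Estimation on Any State Space | Proposes a generalized neural transport learning method for free energy estimation on arbitrary state spaces. | multimodal | |
| 39 | HetCCL: Enabling Collective Communication For Mixed-Vendor Heterogeneous Clusters | HetCCL: enables efficient collective communication for mixed heterogeneous clusters. | large language model | |
| 40 | Eigenvectors of Experts are Training-free Non-collapsing Routers | Proposes SSMoE, a training-free routing method based on spectral decomposition of expert weights, addressing expert collapse in SMoE models. | large language model | |
| 41 | Cross-Layer Subspace Coupling for LLM Compression: A Unifying Framework and Its Empirical Limits | Unifies LLM compression frameworks and exposes their limits: a rethinking of cross-layer subspace coupling. | large language model | |
| 42 | OrcaRouter: A Production-Oriented LLM Router with Hybrid Offline-Online Learning | OrcaRouter: a production-oriented LLM routing method with hybrid offline-online learning. | large language model | |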

🔬 Pillar 1: Robot Control (4 papers)

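| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 43 | Survival Reinforcement Learning: Toward Scalable Self-Supervised RL | Proposes Survival Reinforcement Learning (SRL), addressing the uniformity-tolerance dilemma of contrastive RL in long-horizon tasks. | locomotion, manipulation, reinforcement learning | |
| 44 | Constrained Multi-Objective Reinforcement Learning with Max-Min Criterion | Proposes a constrained max-min multi-objective RL framework that addresses fairness and constraint satisfaction. | locomotion, reinforcement learning | |
| 45 | Graphical einops: bridging tensor networks and computation graphs | Proposes Graphical einops, bridging the gap between tensor networks and computation graphs. | manipulation | |
| 46 | Unsupervised Diffusion Solver for Combinatorial Optimization via Combinatorial Adjoint Matching | Proposes Combinatorial Adjoint Matching (CAM), for diffusion models that solve combinatorial optimization problems without supervision. | trajectory optimization | |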

🔬 Pillar 8: Physics-based Animation (1 paper)

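| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 47 | Lightweight CNN-Based Anomaly Detection for High Voltage Converter Modulators in the Spallation Neutron Source | A lightweight CNN-based anomaly detection method for the high voltage converter modulators of the Spallation Neutron Source. | PULSE | |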
