cs.LG (2026-03-13)

📊 28 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (11, 🔗 5) · Pillar 9: Embodied Foundation Models (10) · Pillar 1: Robot Control (4) · Pillar 8: Physics-based Animation (3)

🔬 Pillar 2: RL Algorithms & Architecture (11 papers)

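| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 1 | LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from Pixels | LeWorldModel: proposes a stable end-to-end joint-embedding predictive architecture that learns world models from pixels. | world model, world models, JEPA |
| 2 | Representation Learning for Spatiotemporal Physical Systems | Proposes a representation learning framework for spatiotemporal physical systems and evaluates how effective self-supervised methods are at physical parameter estimation. | representation learning, spatiotemporal |
| 3 | Anchored Alignment: Preventing Positional Collapse in Multimodal Recommender Systems | Proposes AnchorRec, which uses anchored alignment to address positional collapse in multimodal recommender systems. | representation learning, multimodal |
| 4 | Swap-guided Preference Learning for Personalized Reinforcement Learning from Human Feedback | Proposes Swap-guided Preference Learning (SPL) to address posterior collapse in personalized RLHF. | reinforcement learning, preference learning, RLHF |
| 5 | A Multi-task Large Reasoning Model for Molecular Science | Proposes a multi-task large model that combines reasoning with molecular knowledge to improve performance on molecular science tasks. | reinforcement learning, foundation model, chain-of-thought |
| 6 | Disentangled Latent Dynamics Manifold Fusion for Solving Parameterized PDEs | Proposes DLDMF to address generalization and temporal extrapolation for parameterized PDEs. | latent dynamics, spatiotemporal |
| 7 | Maximizing Incremental Information Entropy for Contrastive Learning | Proposes IE-CL, which maximizes incremental information entropy to improve contrastive representation learning at small batch sizes. | representation learning, contrastive learning |
| 8 | PISmith: Reinforcement Learning-based Red Teaming for Prompt Injection Defenses | PISmith: a reinforcement learning-based red-teaming framework for evaluating prompt injection defenses. | reinforcement learning |
| 9 | Enhanced Drug-drug Interaction Prediction Using Adaptive Knowledge Integration | Proposes an adaptive knowledge integration framework that improves LLM accuracy in drug-drug interaction prediction. | reinforcement learning, large language model |
| 10 | Reinforcement Learning for Diffusion LLMs with Entropy-Guided Step Selection and Stepwise Advantages | Proposes a reinforcement learning method with entropy-guided step selection and stepwise advantages for post-training diffusion language models. | reinforcement learning |
| 11 | A Spectral Revisit of the Distributional Bellman Operator under the Cramér Metric | Proposes a new spectral analysis of the distributional Bellman operator under the Cramér metric, providing a theoretical foundation for DRL. | reinforcement learning, DRL |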

🔬 Pillar 9: Embodied Foundation Models (10 papers)

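| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 12 | Cost-Efficient Multimodal LLM Inference via Cross-Tier GPU Heterogeneity | Proposes using cross-tier GPU heterogeneity to reduce the cost of multimodal LLM inference. | large language model, multimodal |
| 13 | Exact Federated Continual Unlearning for Ridge Heads on Frozen Foundation Models | Proposes an exact federated continual unlearning method for ridge regression heads on frozen foundation models. | foundation model |
| 14 | Accelerating materials discovery using foundation model based In-context active learning | Proposes ICAL, an in-context active learning method based on pretrained models, to accelerate materials discovery. | foundation model |
| 15 | TERMINATOR: Learning Optimal Exit Points for Early Stopping in Chain-of-Thought Reasoning | TERMINATOR: learns optimal exit points for early stopping in chain-of-thought reasoning, reducing overthinking. | chain-of-thought |
| 16 | Taming the Long Tail: Efficient Item-wise Sharpness-Aware Minimization for LLM-based Recommender Systems | Proposes EISAM to address the long-tail problem in LLM-based recommender systems and improve recommendation performance for tail items. | large language model, instruction following |
| 17 | When Drafts Evolve: Speculative Decoding Meets Online Learning | Proposes OnlineSpec, which continually improves the draft model through online learning to accelerate speculative decoding. | large language model, foundation model |
| 18 | BoSS: A Best-of-Strategies Selector as an Oracle for Deep Active Learning | Proposes BoSS, a best-of-strategies selector that serves as an oracle for deep active learning and improves performance on large-scale datasets. | foundation model |
| 19 | Breaking the Tuning Barrier: Zero-Hyperparameters Yield Multi-Corner Analysis Via Learned Priors | Proposes a zero-hyperparameter method based on learned priors to address the tuning difficulty of multi-corner circuit analysis. | foundation model |
| 20 | Design-Specification Tiling for ICL-based CAD Code Generation | Proposes Design-Specification Tiling (DST) to improve ICL performance in CAD code generation. | large language model |
| 21 | LightMoE: Reducing Mixture-of-Experts Redundancy through Expert Replacing | LightMoE: reduces redundancy in MoE models through expert replacing, achieving efficient compression. | large language model |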

🔬 Pillar 1: Robot Control (4 papers)

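| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 22 | PhysMoDPO: Physically-Plausible Humanoid Motion with Preference Optimization | PhysMoDPO: physically plausible humanoid motion generation based on preference optimization. | humanoid, humanoid robot, whole-body control |
| 23 | FastDSAC: Unlocking the Potential of Maximum Entropy RL in High-Dimensional Humanoid Control | FastDSAC: unlocks the potential of maximum entropy RL in high-dimensional humanoid control. | humanoid, humanoid control, reinforcement learning |
| 24 | CALF: Communication-Aware Learning Framework for Distributed Reinforcement Learning | Proposes the CALF framework to address communication latency in distributed reinforcement learning. | sim-to-real, reinforcement learning |
| 25 | Influence Malleability in Linearized Attention: Dual Implications of Non-Convergent NTK Dynamics | Reveals the non-convergent NTK dynamics of linearized attention and the dual implications of its influence malleability. | manipulation |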

🔬 Pillar 8: Physics-based Animation (3 papers)

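| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 26 | Graph In-Context Operator Networks for Generalizable Spatiotemporal Prediction | Proposes GICON, which uses graph neural networks and in-context learning to improve generalization in spatiotemporal prediction. | spatiotemporal |
| 27 | Competition-Aware CPC Forecasting with Near-Market Coverage | Proposes a competition-aware CPC forecasting method to address market volatility. | spatiotemporal, foundation model |
| 28 | Adaptive Diffusion Posterior Sampling for Data and Model Fusion of Complex Nonlinear Dynamical Systems | Proposes an adaptive diffusion posterior sampling method for data and model fusion of complex nonlinear dynamical systems. | spatiotemporal |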
