cs.LG (2026-06-08)

📊 33 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (19 🔗1) | Pillar 9: Embodied Foundation Models (10 🔗2) | Pillar 1: Robot Control (2) | Pillar 8: Physics-based Animation (2)

🔬 Pillar 2: RL & Architecture (19 papers)

# | Title | One-line summary | Tags | 🔗
1 | C$^3$ache: Accelerating World Action Models with Cross Inference Chunk Cache | Proposes C$^3$ache to accelerate world action model inference | world action model world action models vision-language-action
2 | Toward Compiler World Models: Learning Latent Dynamics for Efficient Tensor Program Search | Proposes a world-model-inspired evaluator to optimize tensor program search | world model world models latent dynamics
3 | From Hazard Functions to Language Space: Cox-Supervised Distillation of Survival Risk into a Large Language Model | Proposes Cox-supervised distillation to transfer survival risk into a language model | distillation large language model
4 | From Shortcuts to Reasoning: Robust Post-Training of Theory of Mind with Reinforcement Learning | Proposes Thinking-RFT to address the shortcut problem in Theory of Mind (ToM) models | reinforcement learning foundation model multimodal
5 | Stabilizing On-Policy Distillation for MLLM Reasoning with Global Normalization | Proposes globally normalized distillation policy optimization to address gradient instability | reinforcement learning distillation multimodal
6 | Breaking the Tokenizer Barrier: On-Policy Distillation across Model Families | Proposes on-policy distillation across model families to overcome tokenizer limitations | teacher-student distillation large language model
7 | PBSD: Privileged Bayesian Self-Distillation for Long-Horizon Credit Assignment | Proposes PBSD to address long-horizon credit assignment | reinforcement learning policy learning distillation
8 | Rethinking the Divergence Regularization in LLM RL | Proposes DRPO to address trust-region optimization in LLM RL | reinforcement learning PPO large language model
9 | Addressing Market Regime Changes and Heavy-Tailed Returns in Portfolio Optimization via Bayesian VAR and Elliptical Black-Litterman | Proposes the BAVAR-BLED algorithm to address market regime changes and heavy-tailed returns in portfolio optimization | reinforcement learning deep reinforcement learning DRL
10 | A Unifying Lens on Reward Uncertainty in RLHF | Proposes distributional reward models to mitigate reward hacking in RLHF | reinforcement learning RLHF
11 | Escaping the KL Agreement Trap in On-Policy Distillation | Proposes KAT to address the low-KL agreement trap in on-policy distillation | distillation
12 | Graph Mamba Operator: A Latent Simulator for Interacting Particle Systems | Proposes Graph Mamba Operator to address particle-system modeling | Mamba
13 | Distilling Safe LLM Systems via Soft Prompts for On Device Settings | Proposes soft-prompt distillation to address safe LLM deployment on edge devices | distillation large language model
14 | Reasoning Arena: Trace Tournaments When Verifiable Rewards Fall Short | Proposes Reasoning Arena to address cases where verifiable rewards fall short | reinforcement learning large language model
15 | Heterophily-Aware Adaptive Knowledge Distillation for Hypergraph Neural Networks | Proposes HADES to address heterophily in hypergraph neural networks | distillation
16 | Safe-RULE: Safe Reinforcement UnLEarning | Proposes Safe-RULE to address data poisoning in offline safe reinforcement learning | reinforcement learning policy learning
17 | Stage-1 Controls the Entropy Regime, Not the Outcome | Studies how Stage-1 controls the entropy regime rather than the outcome | reinforcement learning distillation
18 | Zero Touch Predictive Orchestration: Automating Time-Series Models for the Cloud-Edge Continuum | Proposes an automated time-series forecasting architecture to address the cold-start problem in cloud-edge computing | predictive model MAE
19 | Counterfactual Transport Flows for Offline Conservative Trajectory Refinement | Proposes counterfactual transport flows to address trajectory optimization in offline reinforcement learning | reinforcement learning offline reinforcement learning

🔬 Pillar 9: Embodied Foundation Models (10 papers)

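# | Title | One-line summary | Tags | 🔗
20 | Discovering Functionally Selective Brain Regions with a Deep Topographic Multimodal Model | Proposes the Topo-Omni model to study the functional selectivity of brain regions | foundation model multimodal
21 | BUDDY: BUdget-Driven DYnamic Depth Routing for Adaptive Large Language Model Inference | Proposes the BUDDY framework to address budget control in large language model inference | large language model
22 | Benchmarking Empirical Privacy Protection for Adaptations of Large Language Models | Proposes an empirical privacy-protection benchmark to improve large language model adaptation | large language model
23 | LargeMonitor: Monitoring Online Task-Free Continual Learning via Large Pretrained Models | Proposes LargeMonitor to address data drift in online task-free continual learning | foundation model multimodal
24 | Tight Sample Complexity of Transformers | Tightly characterizes the sample complexity of Transformers to improve learning efficiency | chain-of-thought
25 | Muon Learns More Robust and Transferable Features than Adam | Shows that the Muon optimizer learns more robust and transferable features than Adam | large language model
26 | Now You (Still) See Me: Detecting Evasive Steganographic Payloads in LLMs | Proposes a detection method for covert information extraction in LLMs to address the shortcomings of existing defenses | large language model
27 | PRISM: Topology-Aware Cross-Modal Imputation for Modality-Deficient Federated Graph Learning | Proposes PRISM to address missing modalities in multimodal federated graph learning | multimodal
28 | Internalizing Geometric Law: Learning from Solver Residuals for Precision-Critical Generation | Proposes SAR to address the loss of learning signal in geometric generation | large language model
29 | Beyond FLOPs: Benchmarking Real Inference Acceleration of LLM Pruning under a GEMM-Centric Taxonomy | Proposes a GEMM-centric taxonomy to optimize the acceleration gained from large language model pruning | large language model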

🔬 Pillar 1: Robot Control (2 papers)

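# | Title | One-line summary | Tags | 🔗
30 | BrainSurgery: Reproducible and Reliable Declarative Weight Manipulations for Model Editing and Upcycling | Proposes BrainSurgery to address the difficulty of managing deep learning model weights | manipulation
31 | What the Eyes See, the LLMs Miss: Exploiting Human Perception for Adversarial Text Attacks | Proposes human-perception-driven adversarial text attacks to improve the effectiveness of content moderation systems | manipulation large language model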

🔬 Pillar 8: Physics-based Animation (2 papers)

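# | Title | One-line summary | Tags | 🔗
32 | A Universal Dense Football Event Representation Based on TabTransformer | Proposes a universal dense football event representation based on TabTransformer to improve analysis accuracy | spatiotemporal
33 | Intention Driven Identification of In-Possession Match Phases in Association Football through Temporal Graph Learning | Proposes a temporal graph learning framework to identify in-possession phases in football matches | spatiotemporal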
