cs.LG (2026-06-08)

📊 33 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (19 🔗1) Pillar 9: Embodied Foundation Models (10 🔗2) Pillar 1: Robot Control (2) Pillar 8: Physics-based Animation (2)

🔬 Pillar 2: RL Algorithms & Architecture (19 papers)

#	Title	One-line summary	Tags	🔗	⭐
1	C$^3$ache: Accelerating World Action Models with Cross Inference Chunk Cache	Proposes C$^3$ache to accelerate world action model inference	world action model world action models vision-language-action
2	Toward Compiler World Models: Learning Latent Dynamics for Efficient Tensor Program Search	Proposes a world-model-inspired evaluator to optimize tensor program search	world model world models latent dynamics
3	From Hazard Functions to Language Space: Cox-Supervised Distillation of Survival Risk into a Large Language Model	Proposes Cox-supervised distillation to transfer survival risk into a language model	distillation large language model
4	From Shortcuts to Reasoning: Robust Post-Training of Theory of Mind with Reinforcement Learning	Proposes Thinking-RFT to address shortcut learning in Theory of Mind (ToM) models	reinforcement learning foundation model multimodal
5	Stabilizing On-Policy Distillation for MLLM Reasoning with Global Normalization	Proposes globally normalized distillation policy optimization to address gradient instability	reinforcement learning distillation multimodal	✅
6	Breaking the Tokenizer Barrier: On-Policy Distillation across Model Families	Proposes an on-policy distillation method across model families to overcome tokenizer limitations	teacher-student distillation large language model
7	PBSD: Privileged Bayesian Self-Distillation for Long-Horizon Credit Assignment	Proposes PBSD to address long-horizon credit assignment	reinforcement learning policy learning distillation
8	Rethinking the Divergence Regularization in LLM RL	Proposes DRPO to address trust-region optimization in LLM RL	reinforcement learning PPO large language model
9	Addressing Market Regime Changes and Heavy-Tailed Returns in Portfolio Optimization via Bayesian VAR and Elliptical Black-Litterman	Proposes the BAVAR-BLED algorithm to address market regime changes and heavy-tailed returns in portfolio optimization	reinforcement learning deep reinforcement learning DRL
10	A Unifying Lens on Reward Uncertainty in RLHF	Proposes distributional reward models to mitigate reward hacking in RLHF	reinforcement learning RLHF
11	Escaping the KL Agreement Trap in On-Policy Distillation	Proposes KAT to escape the low-KL agreement trap in on-policy distillation	distillation
12	Graph Mamba Operator: A Latent Simulator for Interacting Particle Systems	Proposes Graph Mamba Operator for modeling particle systems	Mamba
13	Distilling Safe LLM Systems via Soft Prompts for On Device Settings	Proposes soft-prompt distillation for deploying safe LLMs on edge devices	distillation large language model
14	Reasoning Arena: Trace Tournaments When Verifiable Rewards Fall Short	Proposes Reasoning Arena for settings where verifiable rewards fall short	reinforcement learning large language model
15	Heterophily-Aware Adaptive Knowledge Distillation for Hypergraph Neural Networks	Proposes HADES to address heterophily in hypergraph neural networks	distillation
16	Safe-RULE: Safe Reinforcement UnLEarning	Proposes Safe-RULE to address data poisoning in offline safe reinforcement learning	reinforcement learning policy learning
17	Stage-1 Controls the Entropy Regime, Not the Outcome	Studies how Stage-1 controls the entropy regime rather than the outcome	reinforcement learning distillation
18	Zero Touch Predictive Orchestration: Automating Time-Series Models for the Cloud-Edge Continuum	Proposes an automated time-series forecasting architecture to address the cold-start problem in cloud-edge computing	predictive model MAE
19	Counterfactual Transport Flows for Offline Conservative Trajectory Refinement	Proposes counterfactual transport flows for trajectory optimization in offline reinforcement learning	reinforcement learning offline reinforcement learning

🔬 Pillar 9: Embodied Foundation Models (10 papers)

#	Title	One-line summary	Tags	🔗	⭐
20	Discovering Functionally Selective Brain Regions with a Deep Topographic Multimodal Model	Proposes the Topo-Omni model for studying the functional selectivity of brain regions	foundation model multimodal
21	BUDDY: BUdget-Driven DYnamic Depth Routing for Adaptive Large Language Model Inference	Proposes the BUDDY framework to address budget control in large language model inference	large language model
22	Benchmarking Empirical Privacy Protection for Adaptations of Large Language Models	Proposes an empirical privacy-protection benchmark for adaptations of large language models	large language model
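23	LargeMonitor: Monitoring Online Task-Free Continual Learning via Large Pretrained Models	Proposes LargeMonitor to address data drift in online task-free continual learning	foundation model multimodal
24	Tight Sample Complexity of Transformers	Tightly characterizes the sample complexity of Transformers to improve learning efficiency	chain-of-thought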
25	Muon Learns More Robust and Transferable Features than Adam	Shows that the Muon optimizer learns more robust and transferable features	large language model
26	Now You (Still) See Me: Detecting Evasive Steganographic Payloads in LLMs	Proposes a detection method for covert information extraction in LLMs, addressing gaps in existing defenses	large language model
27	PRISM: Topology-Aware Cross-Modal Imputation for Modality-Deficient Federated Graph Learning	Proposes PRISM to address missing modalities in multimodal federated graph learning	multimodal
28	Internalizing Geometric Law: Learning from Solver Residuals for Precision-Critical Generation	Proposes SAR to address the loss of learning signal in geometric generation	large language model	✅
29	Beyond FLOPs: Benchmarking Real Inference Acceleration of LLM Pruning under a GEMM-Centric Taxonomy	Proposes a GEMM-centric taxonomy to optimize inference acceleration from large language model pruning	large language model	✅

🔬 Pillar 1: Robot Control (2 papers)

#	Title	One-line summary	Tags	🔗	⭐
30	BrainSurgery: Reproducible and Reliable Declarative Weight Manipulations for Model Editing and Upcycling	Proposes BrainSurgery to address the difficulty of managing deep learning model weights	manipulation
31	What the Eyes See, the LLMs Miss: Exploiting Human Perception for Adversarial Text Attacks	Proposes human-perception-driven adversarial text attacks to improve the effectiveness of content moderation systems	manipulation large language model

🔬 Pillar 8: Physics-based Animation (2 papers)

#	Title	One-line summary	Tags	🔗	⭐
32	A Universal Dense Football Event Representation Based on TabTransformer	Proposes a universal dense football event representation based on TabTransformer to improve analysis accuracy	spatiotemporal
33	Intention Driven Identification of In-Possession Match Phases in Association Football through Temporal Graph Learning	Proposes a temporal graph learning framework to identify in-possession phases in football matches	spatiotemporal
