cs.LG (2025-12-19)

📊 9 papers in total

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (4) · Pillar 8: Physics-based Animation (3) · Pillar 9: Embodied Foundation Models (2)

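🔬 Pillar 2: RL Algorithms & Architecture (4 papers)

#	Title	One-line summary	Tags
1	Trust-Region Adaptive Policy Optimization	Proposes TRAPO, which interleaves SFT and RL to optimize LLM reasoning and significantly improves mathematical reasoning performance.	reinforcement learning, large language model
2	Assessing Long-Term Electricity Market Design for Ambitious Decarbonization Targets using Multi-Agent Reinforcement Learning	Proposes a multi-agent reinforcement learning framework for evaluating long-term electricity market design in support of deep decarbonization targets.	reinforcement learning
3	AdvJudge-Zero: Binary Decision Flips in LLM-as-a-Judge via Adversarial Control Tokens	AdvJudge-Zero flips the binary decisions of LLM judges via adversarial control tokens.	RLHF, DPO
4	A Theoretical Analysis of State Similarity Between Markov Decision Processes	Proposes a generalized bisimulation metric (GBSM) for assessing state similarity between Markov decision processes.	reinforcement learning, representation learning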

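🔬 Pillar 8: Physics-based Animation (3 papers)

#	Title	One-line summary	Tags
5	MINPO: Memory-Informed Neural Pseudo-Operator to Resolve Nonlocal Spatiotemporal Dynamics	Proposes MINPO, a memory-informed neural pseudo-operator for resolving nonlocal spatiotemporal dynamics.	spatiotemporal
6	Perfect reconstruction of sparse signals using nonconvexity control and one-step RSB message passing	Proposes a method for perfect reconstruction of sparse signals based on nonconvexity control and one-step RSB message passing.	AMP
7	Learning solution operator of dynamical systems with diffusion maps kernel ridge regression	Proposes a diffusion maps kernel ridge regression (DM-KRR) method for learning solution operators of dynamical systems, improving long-term prediction accuracy.	spatiotemporal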

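🔬 Pillar 9: Embodied Foundation Models (2 papers)

#	Title	One-line summary	Tags
8	Enabling Disaggregated Multi-Stage MLLM Inference via GPU-Internal Scheduling and Resource Sharing	Proposes FlashCodec and UnifiedServe, which accelerate multi-stage MLLM inference through GPU-internal scheduling and resource sharing.	large language model, multimodal
9	Weighted Stochastic Differential Equation to Implement Wasserstein-Fisher-Rao Gradient Flow	Proposes a weighted stochastic differential equation method for implementing Wasserstein-Fisher-Rao gradient flow, improving the sampling efficiency of generative models.	multimodal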
