cs.LG (2025-12-29)

📊 23 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (12 🔗1) · Pillar 9: Embodied Foundation Models (8 🔗1) · Pillar 1: Robot Control (2) · Pillar 4: Generative Motion (1)

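🔬 Pillar 2: RL & Architecture (12 papers)

#	Title	One-Sentence Summary	Tags	🔗	⭐
1	Max-Entropy Reinforcement Learning with Flow Matching and A Case Study on LQR	Proposes a maximum-entropy reinforcement learning algorithm based on Flow Matching, improving policy expressiveness and robustness	reinforcement learning SAC flow matching
2	Splitwise: Collaborative Edge-Cloud Inference for LLMs via Lyapunov-Assisted DRL	Splitwise: DRL with Lyapunov optimization for adaptive partitioning of LLMs in collaborative edge-cloud inference	reinforcement learning deep reinforcement learning DRL
3	Stochastic Siamese MAE Pretraining for Longitudinal Medical Images	Proposes STAMP, a stochastic Siamese MAE pretraining framework for longitudinal medical images	representation learning MAE foundation model
4	MS-SSM: A Multi-Scale State Space Model for Efficient Sequence Modeling	Proposes MS-SSM, a multi-scale state space model for efficient sequence modeling	SSM state space model
5	Bellman Calibration for V-Learning in Offline Reinforcement Learning	Proposes an iterative Bellman calibration method for calibrating V-function predictions in offline reinforcement learning	reinforcement learning offline reinforcement learning
6	Joint Link Adaptation and Device Scheduling Approach for URLLC Industrial IoT Network: A DRL-based Method with Bayesian Optimization	For URLLC industrial IoT, proposes a DRL method with Bayesian optimization for joint link adaptation and device scheduling	DRL TD3
7	Eliminating Inductive Bias in Reward Models with Information-Theoretic Guidance	Proposes DIR, which removes inductive bias from reward models through information-theoretic optimization, improving RLHF performance	reinforcement learning RLHF large language model	✅
8	On the Inverse Flow Matching Problem in the One-Dimensional and Gaussian Cases	Studies the inverse flow matching problem in the one-dimensional and Gaussian cases, providing a theoretical basis for distilling flow matching models	flow matching distillation
9	Diffusion-based Decentralized Federated Multi-Task Representation Learning	Proposes a diffusion-based decentralized federated multi-task representation learning algorithm for feature extraction in data-scarce settings	representation learning
10	Efficient Deep Learning for Short-Term Solar Irradiance Time Series Forecasting: A Benchmark Study in Ho Chi Minh City	For short-term solar irradiance forecasting, proposes a Transformer model combined with knowledge distillation for efficient deployment	Mamba MAE distillation
11	Flow Matching Neural Processes	Proposes a neural process model based on Flow Matching, improving the efficiency and accuracy of conditional distribution sampling	flow matching
12	SB-TRPO: Towards Safe Reinforcement Learning with Hard Constraints	SB-TRPO: safe reinforcement learning under hard constraints that dynamically balances cost reduction and reward improvement	reinforcement learning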

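🔬 Pillar 9: Embodied Foundation Models (8 papers)

#	Title	One-Sentence Summary	Tags	🔗	⭐
13	The Law of Multi-Model Collaboration: Scaling Limits of Model Ensembling for Large Language Models	Proposes a law of multi-model collaboration, revealing the scaling limits of ensembling large language models	large language model
14	Post-Training Quantization of OpenPangu Models for Efficient Deployment on Atlas A2	Proposes a post-training quantization scheme for OpenPangu models targeting Ascend A2, enabling efficient deployment	large language model chain-of-thought
15	BOAD: Discovering Hierarchical Software Engineering Agents via Bandit Optimization	BOAD: discovers hierarchical software engineering agents via bandit optimization, improving generalization on complex tasks	large language model	✅
16	VL-RouterBench: A Benchmark for Vision-Language Model Routing	VL-RouterBench: a systematic, reproducible benchmark for evaluating vision-language model routing	multimodal
17	Trustworthy Machine Learning under Distribution Shifts	Studies robustness, explainability, and adaptability for trustworthy machine learning under distribution shifts	large language model
18	Discrete Semantic States and Hamiltonian Dynamics in LLM Embedding Spaces	Analyzes LLM embedding spaces using the Hamiltonian formalism, exploring discrete semantic states	large language model
19	FRoD: Full-Rank Efficient Fine-Tuning with Rotational Degrees for Fast Convergence	FRoD: full-rank efficient fine-tuning using rotational degrees of freedom, accelerating model convergence	foundation model
20	Theoretical Foundations of Scaling Law in Familial Models	Extends scaling laws to Familial models by introducing a granularity variable, enabling "train once, deploy many"	large language model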

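🔬 Pillar 1: Robot Control (2 papers)

#	Title	One-Sentence Summary	Tags	🔗	⭐
21	Multi-Agent Framework for Threat Mitigation and Resilience in AI-Based Systems	Proposes a multi-agent framework to strengthen threat mitigation and resilience in AI systems	manipulation foundation model multimodal
22	Quantum Intelligence Meets BD-RIS-Enabled AmBC: Challenges, Opportunities, and Practical Insights	Explores the integration of quantum intelligence with BD-RIS-enabled AmBC, addressing challenges and improving 6G communication performance	manipulation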

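🔬 Pillar 4: Generative Motion (1 paper)

#	Title	One-Sentence Summary	Tags	🔗	⭐
23	SE-MLP Model for Predicting Prior Acceleration Features in Penetration Signals	Proposes the SE-MLP model for fast prediction of prior acceleration features in penetration signals, addressing the high computational cost of traditional methods	penetration PULSE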
