cs.LG (2025-12-29)

📊 23 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (12 🔗1) · Pillar 9: Embodied Foundation Models (8 🔗1) · Pillar 1: Robot Control (2) · Pillar 4: Generative Motion (1)

🔬 Pillar 2: RL & Architecture (12 papers)

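| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | Max-Entropy Reinforcement Learning with Flow Matching and A Case Study on LQR | Proposes a maximum-entropy reinforcement learning algorithm based on flow matching that improves policy expressiveness and robustness. | reinforcement learning, SAC, flow matching | |
| 2 | Splitwise: Collaborative Edge-Cloud Inference for LLMs via Lyapunov-Assisted DRL | Splitwise: uses DRL with Lyapunov optimization to adaptively partition LLMs for collaborative edge-cloud inference. | reinforcement learning, deep reinforcement learning, DRL | |
| 3 | Stochastic Siamese MAE Pretraining for Longitudinal Medical Images | Proposes STAMP, a stochastic Siamese MAE pretraining framework for longitudinal medical images. | representation learning, MAE, foundation model | |
| 4 | MS-SSM: A Multi-Scale State Space Model for Efficient Sequence Modeling | Proposes MS-SSM, a multi-scale state space model for efficient sequence modeling. | SSM, state space model | |
| 5 | Bellman Calibration for V-Learning in Offline Reinforcement Learning | Proposes an iterative Bellman calibration method for calibrating V-function predictions in offline reinforcement learning. | reinforcement learning, offline reinforcement learning | |
| 6 | Joint Link Adaptation and Device Scheduling Approach for URLLC Industrial IoT Network: A DRL-based Method with Bayesian Optimization | Proposes a DRL method with Bayesian optimization for joint link adaptation and device scheduling in URLLC industrial IoT networks. | DRL, TD3 | |
| 7 | Eliminating Inductive Bias in Reward Models with Information-Theoretic Guidance | Proposes DIR, which uses information-theoretic optimization to eliminate inductive bias in reward models and improve RLHF performance. | reinforcement learning, RLHF, large language model | |
| 8 | On the Inverse Flow Matching Problem in the One-Dimensional and Gaussian Cases | Studies the inverse flow matching problem in the one-dimensional and Gaussian cases, providing a theoretical basis for distilling flow matching models. | flow matching, distillation | |
| 9 | Diffusion-based Decentralized Federated Multi-Task Representation Learning | Proposes a diffusion-based decentralized federated multi-task representation learning algorithm for feature extraction in data-scarce settings. | representation learning | |
| 10 | Efficient Deep Learning for Short-Term Solar Irradiance Time Series Forecasting: A Benchmark Study in Ho Chi Minh City | For short-term solar irradiance forecasting, proposes a Transformer model combined with knowledge distillation for efficient deployment. | Mamba, MAE, distillation | |
| 11 | Flow Matching Neural Processes | Proposes a neural process model based on flow matching that improves the efficiency and accuracy of conditional distribution sampling. | flow matching | |
| 12 | SB-TRPO: Towards Safe Reinforcement Learning with Hard Constraints | SB-TRPO: targets safe reinforcement learning with hard constraints, dynamically balancing cost reduction against reward improvement. | reinforcement learning | |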

🔬 Pillar 9: Embodied Foundation Models (8 papers)

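| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 13 | The Law of Multi-Model Collaboration: Scaling Limits of Model Ensembling for Large Language Models | Proposes a law of multi-model collaboration that reveals the scaling limits of ensemble performance for large language models. | large language model | |
| 14 | Post-Training Quantization of OpenPangu Models for Efficient Deployment on Atlas A2 | Proposes a post-training quantization scheme for OpenPangu models targeting the Ascend A2, enabling efficient deployment. | large language model, chain-of-thought | |
| 15 | BOAD: Discovering Hierarchical Software Engineering Agents via Bandit Optimization | BOAD: discovers hierarchical software engineering agents via bandit optimization, improving generalization on complex tasks. | large language model | |
| 16 | VL-RouterBench: A Benchmark for Vision-Language Model Routing | VL-RouterBench: a systematic, reproducible benchmark for evaluating vision-language model routing. | multimodal | |
| 17 | Trustworthy Machine Learning under Distribution Shifts | Studies robustness, explainability, and adaptability for trustworthy machine learning under distribution shifts. | large language model | |
| 18 | Discrete Semantic States and Hamiltonian Dynamics in LLM Embedding Spaces | Analyzes LLM embedding spaces with the Hamiltonian formalism to explore discrete semantic states. | large language model | |
| 19 | FRoD: Full-Rank Efficient Fine-Tuning with Rotational Degrees for Fast Convergence | FRoD: uses rotational degrees of freedom to achieve full-rank efficient fine-tuning and faster convergence. | foundation model | |
| 20 | Theoretical Foundations of Scaling Law in Familial Models | Extends scaling laws to familial models by introducing a granularity variable, enabling "train once, deploy many". | large language model | |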

🔬 Pillar 1: Robot Control (2 papers)

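| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 21 | Multi-Agent Framework for Threat Mitigation and Resilience in AI-Based Systems | Proposes a multi-agent framework that strengthens threat mitigation and resilience in AI systems. | manipulation, foundation model, multimodal | |
| 22 | Quantum Intelligence Meets BD-RIS-Enabled AmBC: Challenges, Opportunities, and Practical Insights | Explores combining quantum intelligence with BD-RIS-enabled AmBC to address challenges and improve 6G communication performance. | manipulation | |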

🔬 Pillar 4: Generative Motion (1 paper)

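| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 23 | SE-MLP Model for Predicting Prior Acceleration Features in Penetration Signals | Proposes the SE-MLP model for fast prediction of prior acceleration features in penetration signals, addressing the high computational cost of traditional methods. | penetration, PULSE | |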
