cs.LG (2026-03-23)

📊 27 papers in total | 🔗 3 with code


🔬 Pillar 9: Embodied Foundation Models (15 papers)

| # | Title | One-line takeaway | Tags | 🔗 |
|---|-------|-------------------|------|----|
| 1 | SSAM: Singular Subspace Alignment for Merging Multimodal Large Language Models | Proposes SSAM, training-free merging of multimodal large language models via singular subspace alignment | large language model, multimodal | |
| 2 | Riemannian Geometry Speaks Louder Than Words: From Graph Foundation Model to Next-Generation Graph Intelligence | Proposes the Riemannian Foundation Model (RFM), using Riemannian geometry to build next-generation graph intelligence | large language model, foundation model | |
| 3 | AdditiveLLM2: A Multi-modal Large Language Model for Additive Manufacturing | Proposes AdditiveLLM2, a multimodal large language model for additive manufacturing, specialized via domain-adaptive pretraining | large language model | |
| 4 | Extending Precipitation Nowcasting Horizons via Spectral Fusion of Radar Observations and Foundation Model Priors | PW-FouCast: extends precipitation nowcasting horizons via spectral fusion of radar observations and weather foundation-model priors | foundation model | |
| 5 | Towards Multimodal Time Series Anomaly Detection with Semantic Alignment and Condensed Interaction | Proposes the MindTS model, achieving multimodal time-series anomaly detection via semantic alignment and condensed interaction | multimodal | |
| 6 | ROM: Real-time Overthinking Mitigation via Streaming Detection and Intervention | Proposes ROM, mitigating overthinking in large reasoning models via streaming detection and intervention | large language model, chain-of-thought | |
| 7 | Data-Free Layer-Adaptive Merging via Fisher Information for Long-to-Short Reasoning LLMs | Proposes a Fisher-information-based layer-adaptive model merging method that improves long-chain-reasoning LLMs | large language model, chain-of-thought | |
| 8 | Noise Titration: Exact Distributional Benchmarking for Probabilistic Time Series Forecasting | Proposes noise titration to address evaluation problems in probabilistic time-series forecasting | foundation model | |
| 9 | SPA: A Simple but Tough-to-Beat Baseline for Knowledge Injection | SPA: a simple but highly effective baseline method for knowledge injection | large language model | |
| 10 | Revisiting Quantum Code Generation: Where Should Domain Knowledge Live? | Improves LLM performance on quantum code generation via inference-time augmentation, without domain fine-tuning | large language model | |
| 11 | Causal Evidence that Language Models use Confidence to Drive Behavior | Shows that large language models use confidence to drive behavioral decisions, laying groundwork for autonomous agents | large language model | |
| 12 | Holistic Scaling Laws for Optimal Mixture-of-Experts Architecture Optimization | Proposes an MoE architecture optimization framework that finds the optimal configuration under any compute budget via joint constraints and reduced-dimension search | large language model | |
| 13 | Thinking Deeper, Not Longer: Depth-Recurrent Transformers for Compositional Generalization | Proposes depth-recurrent Transformers to address Transformers' limited computational depth in compositional generalization | chain-of-thought | |
| 14 | Kolmogorov Complexity Bounds for LLM Steganography and a Perplexity-Based Detection Proxy | Proposes Kolmogorov-complexity-based theoretical bounds for LLM steganography and a perplexity-based detection proxy | large language model | |
| 15 | Generalization Limits of In-Context Operator Networks for Higher-Order Partial Differential Equations | Extends ICON models to higher-order partial differential equations while preserving solution dynamics | foundation model | |

🔬 Pillar 2: RL Algorithms & Architecture (10 papers)

| # | Title | One-line takeaway | Tags | 🔗 |
|---|-------|-------------------|------|----|
| 16 | Multimodal Survival Analysis with Locally Deployable Large Language Models | Proposes a multimodal survival analysis method based on locally deployed LLMs, improving predictive accuracy and privacy | teacher-student distillation, large language model | |
| 17 | CoRA: Boosting Time Series Foundation Models for Multivariate Forecasting through Correlation-aware Adapter | Proposes CoRA, a correlation-aware adapter that improves multivariate forecasting with time-series foundation models | contrastive learning, foundation model | |
| 18 | Demystifying Reinforcement Learning for Long-Horizon Tool-Using Agents: A Comprehensive Recipe | Proposes a systematic reinforcement-learning recipe for long-horizon tool-using agents, significantly improving TravelPlanner performance | reinforcement learning, reward shaping, large language model | |
| 19 | Deep Reinforcement Learning and The Tale of Two Temporal Difference Errors | Reveals the divergence between two interpretations of temporal-difference error in deep RL and its impact on algorithm performance | reinforcement learning, deep reinforcement learning | |
| 20 | Cluster-Specific Predictive Modeling: A Scalable Solution for Resource-Constrained Wi-Fi Controllers | Proposes cluster-specific predictive modeling for resource-constrained Wi-Fi controllers | predictive model, MAE | |
| 21 | What Do World Models Learn in RL? Probing Latent Representations in Learned Environment Simulators | Uses interpretability analysis to show that world models in RL learn linear representations of environment state | reinforcement learning, world model | |
| 22 | On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation | Proposes an update-direction-based RLVR method to improve large language model reasoning | reinforcement learning, large language model | |
| 23 | TREX: Trajectory Explanations for Multi-Objective Reinforcement Learning | TREX: a trajectory-attribution interpretability framework for multi-objective reinforcement learning | reinforcement learning | |
| 24 | P^2O: Joint Policy and Prompt Optimization | Proposes the P^2O framework, jointly optimizing policy and prompts to improve LLM reasoning on hard samples | reinforcement learning, large language model | |
| 25 | Proximal Policy Optimization in Path Space: A Schrödinger Bridge Perspective | Proposes GSB-PPO, a path-space proximal policy optimization method based on generalized Schrödinger bridges, for training generative policies | reinforcement learning, PPO | |

🔬 Pillar 1: Robot Control (2 papers)

| # | Title | One-line takeaway | Tags | 🔗 |
|---|-------|-------------------|------|----|
| 26 | Model Predictive Control with Differentiable World Models for Offline Reinforcement Learning | Proposes model predictive control with differentiable world models for offline reinforcement learning | locomotion, MPC, model predictive control | |
| 27 | Decoupling Exploration and Policy Optimization: Uncertainty Guided Tree Search for Hard Exploration | Proposes an uncertainty-guided tree search that decouples exploration from policy optimization, addressing hard-exploration problems in reinforcement learning | manipulation, dexterous manipulation, reinforcement learning | |
