cs.LG (2026-03-24)

📊 15 papers total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (7) | Pillar 2: RL & Architecture (5 🔗1) | Pillar 1: Robot Control (2) | Pillar 3: Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (7 papers)

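| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | Caterpillar of Thoughts: The Optimal Test-Time Algorithm for Large Language Models | Proposes Caterpillar of Thoughts (CaT) to optimize test-time compute for large language models and improve efficiency. | large language model, chain-of-thought | |
| 2 | Assessing the Robustness of Climate Foundation Models under No-Analog Distribution Shifts | Evaluates the robustness of climate foundation models under no-analog distribution shifts. | foundation model | |
| 3 | Can Graph Foundation Models Generalize Over Architecture? | Proposes an adaptive mixture-of-graph-operators framework to improve the generalization of graph foundation models on heterogeneous-architecture tasks. | foundation model | |
| 4 | TreeTeaming: Autonomous Red-Teaming of Vision-Language Models via Hierarchical Strategy Exploration | TreeTeaming: autonomous red-teaming of vision-language models via hierarchical strategy exploration. | large language model, multimodal | |
| 5 | Sparser, Faster, Lighter Transformer Language Models | Proposes sparse Transformer language models that improve inference and training efficiency and reduce resource consumption. | large language model, foundation model | |
| 6 | Post-Selection Distributional Model Evaluation | Proposes the PS-DME framework for reliable evaluation of KPI distributions after model pre-selection, addressing post-selection bias. | large language model | |
| 7 | Multitask-Informed Prior for In-Context Learning on Tabular Data: Application to Steel Property Prediction | Proposes a TabPFN with a multitask-learning prior for in-context learning in steel property prediction. | foundation model | |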

🔬 Pillar 2: RL & Architecture (5 papers)

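| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 8 | Off-Policy Value-Based Reinforcement Learning for Large Language Models | Proposes ReVal, an off-policy value-based reinforcement learning method for large language models that improves data utilization. | reinforcement learning, policy learning, large language model | |
| 9 | GEM: Guided Expectation-Maximization for Behavior-Normalized Candidate Action Selection in Offline RL | GEM: behavior-normalized candidate action selection in offline reinforcement learning via guided expectation-maximization. | reinforcement learning, offline RL, offline reinforcement learning | |
| 10 | SortedRL: Accelerating RL Training for LLMs through Online Length-Aware Scheduling | SortedRL accelerates reinforcement learning training for LLMs through online length-aware scheduling. | reinforcement learning, large language model, chain-of-thought | |
| 11 | Neural ODE and SDE Models for Adaptation and Planning in Model-Based Reinforcement Learning | Proposes neural ODE and SDE based models for adaptation and planning in model-based reinforcement learning, improving sample efficiency. | reinforcement learning | |
| 12 | Non-Adversarial Imitation Learning Provably Free of Compounding Errors: The Role of Bellman Constraints | Proposes Dual Q-DM, a non-adversarial imitation learning method with a theoretical guarantee of eliminating compounding errors. | imitation learning | |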

🔬 Pillar 1: Robot Control (2 papers)

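| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 13 | VLGOR: Visual-Language Knowledge Guided Offline Reinforcement Learning for Generalizable Agents | VLGOR: visual-language knowledge guided offline reinforcement learning that improves the performance of general-purpose agents. | manipulation, reinforcement learning, offline RL | |
| 14 | Byzantine-Robust and Differentially Private Federated Optimization under Weaker Assumptions | Proposes the Byz-Clip21-SGD2M algorithm to address Byzantine-robust and differentially private federated optimization. | manipulation | |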

🔬 Pillar 3: Perception & Semantics (1 paper)

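| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 15 | TorR: Towards Brain-Inspired Task-Oriented Reasoning via Cache-Oriented Algorithm-Architecture Co-design | TorR: brain-inspired algorithm-architecture co-design for real-time object detection on edge devices. | open-vocabulary, open vocabulary | |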
