cs.LG (2026-02-27)

📊 27 papers total | 🔗 6 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (12 🔗5) · Pillar 2: RL & Architecture (11 🔗1) · Pillar 1: Robot Control (2) · Pillar 6: Video Extraction & Matching (1) · Pillar 4: Generative Motion (1)

🔬 Pillar 9: Embodied Foundation Models (12 papers)

# | Title | One-Line Summary | Tags | 🔗
1 | MINT: Multimodal Imaging-to-Speech Knowledge Transfer for Early Alzheimer's Screening | MINT: multimodal imaging-to-speech knowledge transfer for early Alzheimer's screening. | multimodal
2 | Time Series Foundation Models as Strong Baselines in Transportation Forecasting: A Large-Scale Benchmark Analysis | Uses the time-series foundation model Chronos-2 to establish strong zero-shot baselines for transportation forecasting. | foundation model
3 | ULW-SleepNet: An Ultra-Lightweight Network for Multimodal Sleep Stage Scoring | Proposes ULW-SleepNet, an ultra-lightweight network for multimodal sleep stage scoring. | multimodal
4 | MPU: Towards Secure and Privacy-Preserving Knowledge Unlearning for Large Language Models | Proposes the MPU framework to address privacy preservation in LLM knowledge unlearning. | large language model
5 | TradeFM: A Generative Foundation Model for Trade-flow and Market Microstructure | TradeFM: a generative foundation model for trade flow and market microstructure. | foundation model
6 | When Does Multimodal Learning Help in Healthcare? A Benchmark on EHR and Chest X-Ray Fusion | CareBench: systematically evaluates the effectiveness, robustness, and fairness of EHR and chest X-ray fusion in healthcare settings. | multimodal
7 | Taming Momentum: Rethinking Optimizer States Through Low-Rank Approximation | LoRA-Pre: low-rank approximation of optimizer states to improve the efficiency of large-model pretraining and fine-tuning. | large language model
8 | The Subjectivity of Monoculture | Revisits monoculture: the subjectivity and context dependence of model-agreement evaluation. | large language model
9 | RewardUQ: A Unified Framework for Uncertainty-Aware Reward Models | RewardUQ: a unified framework for uncertainty quantification in reward models. | large language model
10 | LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding | Proposes LK losses that directly optimize the acceptance rate in speculative decoding, speeding up LLM inference. | large language model
11 | FedRot-LoRA: Mitigating Rotational Misalignment in Federated LoRA | Proposes FedRot-LoRA to resolve rotational misalignment in federated LoRA. | large language model
12 | VaSST: Variational Inference for Symbolic Regression using Soft Symbolic Trees | VaSST: symbolic regression via variational inference and soft symbolic trees. | multimodal

🔬 Pillar 2: RL & Architecture (11 papers)

# | Title | One-Line Summary | Tags | 🔗
13 | MAGE: Multi-scale Autoregressive Generation for Offline Reinforcement Learning | MAGE: a multi-scale autoregressive generation method for offline RL, targeting long-horizon sparse-reward tasks. | reinforcement learning, offline RL
14 | Multi-Objective Reinforcement Learning for Large-Scale Tote Allocation in Human-Robot Collaborative Fulfillment Centers | Proposes multi-objective RL for large-scale tote allocation, optimizing human-robot collaborative fulfillment centers. | reinforcement learning, policy learning
15 | Foundation World Models for Agents that Learn, Verify, and Adapt Reliably Beyond Static Environments | Proposes foundation world models for trustworthy, adaptive agents in open-world environments. | reinforcement learning, world model
16 | Beyond State-Wise Mirror Descent: Offline Policy Optimization with Parameteric Policies | Proposes offline policy optimization with parametric policies, scaling to large action spaces. | reinforcement learning, offline RL
17 | Bridging Dynamics Gaps via Diffusion Schrödinger Bridge for Cross-Domain Reinforcement Learning | Proposes BDGxRL, built on the diffusion Schrödinger bridge, to address dynamics gaps in cross-domain RL. | reinforcement learning, policy learning
18 | Disentangled Mode-Specific Representations for Tensor Time Series via Contrastive Learning | Proposes MoST, which disentangles mode-specific representations of tensor time series via contrastive learning, improving classification and forecasting accuracy. | representation learning, contrastive learning
19 | CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation | Proposes CUDA Agent, which generates high-performance CUDA kernels via large-scale agentic RL. | reinforcement learning, large language model
20 | Adaptive Correlation-Weighted Intrinsic Rewards for Reinforcement Learning | Proposes the adaptive correlation-weighted intrinsic reward (ACWI) framework to improve exploration efficiency in sparse-reward RL. | reinforcement learning
21 | General Bayesian Policy Learning | Proposes a general Bayesian policy learning framework for policy optimization in decision problems. | policy learning
22 | Flowette: Flow Matching with Graphette Priors for Graph Generation | Flowette: a flow-matching graph generation model with graphette priors. | flow matching
23 | InfoNCE Induces Gaussian Distribution | Proves that the InfoNCE loss induces Gaussian-distributed representations in contrastive learning. | representation learning, contrastive learning

🔬 Pillar 1: Robot Control (2 papers)

# | Title | One-Line Summary | Tags | 🔗
24 | Actor-Critic Pretraining for Proximal Policy Optimization | Proposes actor-critic pretraining to improve PPO's sample efficiency in robot control. | locomotion, manipulation, reinforcement learning
25 | OPTIAGENT: A Physics-Driven Agentic Framework for Automated Optical Design | OPTIAGENT: a physics-driven agentic framework for automated optical design. | manipulation, large language model

🔬 Pillar 6: Video Extraction & Matching (1 paper)

# | Title | One-Line Summary | Tags | 🔗
26 | BLISSNet: Deep Operator Learning for Fast and Accurate Flow Reconstruction from Sparse Sensor Measurements | Proposes BLISSNet for fast, accurate flow-field reconstruction from sparse sensor measurements. | sparse sensors

🔬 Pillar 4: Generative Motion (1 paper)

# | Title | One-Line Summary | Tags | 🔗
27 | Learning Generation Orders for Masked Discrete Diffusion Models via Variational Inference | Proposes a variational-inference framework to optimize parallel generation orders for masked discrete diffusion models. | MDM

⬅️ Back to the cs.LG index · 🏠 Back to home