cs.LG (2026-04-28)

📊 19 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms and Architecture (11 🔗2) | Pillar 9: Embodied Foundation Models (5 🔗1) | Pillar 1: Robot Control (2 🔗1) | Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL Algorithms and Architecture (11 papers)

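# | Title | One-sentence summary | Tags | 🔗
1 | Prior-Aligned Data Cleaning for Tabular Foundation Models | Proposes the L2C2 framework, which uses reinforcement learning for prior-aligned data cleaning to improve the performance of tabular foundation models. | reinforcement learning, reward design, foundation model
2 | DGLight: DQN-Guided GRPO Fine-Tuning of Large Language Models for Traffic Signal Control | Proposes DGLight to optimize large language models for traffic signal control. | reinforcement learning, large language model
3 | Conditional Flow Matching for Probabilistic Downscaling of Maximum 3-day Snowfall in Alaska | Proposes WxFlow, which uses conditional flow matching for probabilistic downscaling of maximum 3-day snowfall in Alaska and improves spectral fidelity. | flow matching, physically plausible
4 | Diverse Image Priors for Black-box Data-free Knowledge Distillation | Proposes DIP-KD to address insufficient data diversity in black-box data-free knowledge distillation. | contrastive learning, distillation
5 | Biased Dreams: Limitations to Epistemic Uncertainty Quantification in Latent Space Models | Reveals the limitations of epistemic uncertainty quantification in latent space models: the "dreams" are biased. | reinforcement learning, dreamer, latent dynamics
6 | Zero Shot Coordination for Sparse Reward Tasks with Diverse Reward Shapings | Proposes a zero-shot coordination method based on an ensemble of random reward shapings to address cooperation in sparse-reward tasks. | reinforcement learning, reward shaping
7 | When Errors Can Be Beneficial: A Categorization of Imperfect Rewards for Policy Gradient | Proposes a categorization of imperfect rewards for policy gradient that accounts for beneficial errors, applied to language model training. | reinforcement learning, RLHF, reward design
8 | Sustained Gradient Alignment Mediates Subliminal Learning in a Multi-Step Setting: Evidence from MNIST Auxiliary Logit Distillation Experiment | Shows that sustained gradient alignment drives subliminal learning in MNIST auxiliary logit distillation. | distillation
9 | Dyna-Style Safety Augmented Reinforcement Learning: Staying Safe in the Face of Uncertainty | Proposes the Dyna-SAuR algorithm, which improves reinforcement learning safety by learning a dynamics model and a safety filter. | reinforcement learning
10 | Knowledge Distillation Must Account for What It Loses | Argues that knowledge distillation must account for information loss and attend to the reliability of model capabilities. | distillation
11 | Elite-Driven Support Vector Machines for Classification | Proposes Elite-Driven SVM, which improves classification performance through elite-sample guidance and incorporates prior knowledge. | teacher-student, distillation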

🔬 Pillar 9: Embodied Foundation Models (5 papers)

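# | Title | One-sentence summary | Tags | 🔗
12 | VLM Judges Can Rank but Cannot Score: Task-Dependent Uncertainty in Multimodal Evaluation | Proposes a conformal-prediction-based method for analyzing the uncertainty of VLM judges, revealing task dependence and identifying a decoupling between ranking and scoring. | multimodal
13 | Knowledge-Data Dually Driven Paradigm for Accurate Landslide Susceptibility Prediction under Data-Scarce Conditions Using Geomorphic Priors and Tabular Foundation Model | Proposes a knowledge-data dually driven paradigm that uses geomorphic priors and a tabular foundation model for landslide susceptibility prediction under data scarcity. | foundation model
14 | Accurate and Robust Generative Approach for Overcoming Data Sparsity and Imbalance in Landslide Modeling with A Tabular Foundation Model | Proposes a generative method based on a tabular foundation model to address data sparsity and imbalance in landslide modeling. | foundation model
15 | Carbon-Taxed Transformers: A Green Compression Pipeline for Overgrown Language Models | Proposes Carbon-Taxed Transformers (CTT) to compress large language models, reducing computational cost and carbon emissions. | large language model
16 | Barriers to Universal Reasoning With Transformers (And How to Overcome Them) | Reveals the length-generalization limits of Transformers in universal reasoning and proposes a solution based on Signpost Tokens. | chain-of-thought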

🔬 Pillar 1: Robot Control (2 papers)

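# | Title | One-sentence summary | Tags | 🔗
17 | TSN-Affinity: Similarity-Driven Parameter Reuse for Continual Offline Reinforcement Learning | Proposes TSN-Affinity, which addresses catastrophic forgetting in continual offline reinforcement learning through similarity-driven parameter reuse. | manipulation, reinforcement learning, offline reinforcement learning
18 | Evaluation without Generation: Non-Generative Assessment of Harmful Model Specialization with Applications to CSAM | Proposes a Gaussian probing method that assesses a model's specialization in harmful domains such as CSAM without requiring generation. | manipulation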

🔬 Pillar 8: Physics-based Animation (1 paper)

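# | Title | One-sentence summary | Tags | 🔗
19 | Adaptable phase retrieval for coherent transition radiation spectroscopy based on differentiable physics information | Proposes an adaptable phase retrieval method for coherent transition radiation spectroscopy based on differentiable physics information. | AMP, multimodal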
