cs.LG (2026-04-30)

📊 26 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (13 🔗1) · Pillar 9: Embodied Foundation Models (9 🔗1) · Pillar 8: Physics-based Animation (2) · Pillar 5: Interaction & Reaction (1) · Pillar 4: Generative Motion (1)

🔬 Pillar 2: RL Algorithms & Architecture (13 papers)

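# | Title | One-line summary | Tags | 🔗
1 | BrainDINO: A Brain MRI Foundation Model for Generalizable Clinical Representation Learning | BrainDINO: a brain MRI foundation model for generalizable clinical representation learning. | representation learning, foundation model |
2 | Mind the Gap: Structure-Aware Consistency in Preference Learning | Proposes structure-aware DPO (SA-DPO) to address the inconsistency of standard surrogate losses in LLM preference learning. | preference learning, DPO, direct preference optimization |
3 | Kernelized Advantage Estimation: From Nonparametric Statistics to LLM Reasoning | Proposes a kernelized advantage estimation method to improve policy-learning efficiency for LLM reasoning in resource-constrained settings. | reinforcement learning, policy learning, large language model |
4 | Detecting is Easy, Adapting is Hard: Local Expert Growth for Visual Model-Based Reinforcement Learning under Distribution Shift | Proposes JEPA-Indexed Local Expert Growth to address the difficulty of adapting visual MBRL under distribution shift. | reinforcement learning, JEPA |
5 | Exploration Hacking: Can LLMs Learn to Resist RL Training? | Finds that LLMs may resist reinforcement learning training by manipulating their exploration behavior. | reinforcement learning, large language model |
6 | A Unified Framework of Hyperbolic Graph Representation Learning Methods | Proposes a unified framework for hyperbolic graph representation learning to ease method comparison and reproduction. | representation learning |
7 | CastFlow: Learning Role-Specialized Agentic Workflows for Time Series Forecasting | Proposes CastFlow, a role-specialized agentic workflow for time series forecasting. | reinforcement learning, large language model |
8 | FiLMMeD: Feature-wise Linear Modulation for Cross-Problem Multi-Depot Vehicle Routing | Proposes FiLMMeD, which uses feature-wise linear modulation to solve cross-problem multi-depot vehicle routing. | reinforcement learning, curriculum learning |
9 | Exponential families from a single KL identity | Proposes a new KL divergence identity that simplifies derivations for exponential families. | reinforcement learning, RLHF |
10 | Wasserstein Distributionally Robust Regret Optimization for Reinforcement Learning from Human Feedback | Proposes Wasserstein distributionally robust regret optimization to address reward over-optimization in RLHF. | reinforcement learning, PPO, RLHF |
11 | Caracal: Causal Architecture via Spectral Mixing | Proposes Caracal to address the attention computation bottleneck in long-sequence modeling. | Mamba, SSM, large language model |
12 | SPLICE: Latent Diffusion over JEPA Embeddings for Conformal Time-Series Inpainting | SPLICE: a latent diffusion model over JEPA embeddings for conformal time-series inpainting. | flow matching, JEPA |
13 | Fair Dataset Distillation via Cross-Group Barycenter Alignment | Proposes a fair dataset distillation method based on cross-group barycenter alignment to address subgroup performance disparities. | distillation |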

🔬 Pillar 9: Embodied Foundation Models (9 papers)

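# | Title | One-line summary | Tags | 🔗
14 | FMCL: Class-Aware Client Clustering with Foundation Model Representations for Heterogeneous Federated Learning | FMCL: class-aware client clustering with foundation model representations for heterogeneous federated learning. | foundation model |
15 | Explainable Load Forecasting with Covariate-Informed Time Series Foundation Models | Proposes an explainable load forecasting method based on covariate-informed time series foundation models. | foundation model |
16 | Physical Foundation Models: Fixed hardware implementations of large-scale neural networks | Proposes physical foundation models, which use dedicated hardware to implement large-scale neural networks efficiently. | foundation model |
17 | ChipLingo: A Systematic Training Framework for Large Language Models in EDA | ChipLingo: a systematic training framework for large language models in the EDA domain. | large language model |
18 | One Pass, Any Order: Position-Invariant Listwise Reranking for LLM-Based Recommendation | InvariRank: a position-invariant listwise reranking framework that addresses order sensitivity in LLM-based recommendation ranking. | large language model |
19 | Low Rank Adaptation for Adversarial Perturbation | Uses low-rank adaptation to improve the efficiency and effectiveness of adversarial perturbation attacks. | large language model |
20 | REBENCH: A Procedural, Fair-by-Construction Benchmark for LLMs on Stripped-Binary Types and Names (Extended Version) | REBench: a fair, procedural benchmark for LLMs on recovering types and names from stripped binaries. | large language model |
21 | Diversity in Large Language Models under Supervised Fine-Tuning | Proposes the Tempered Focal (TOFU) loss to improve the generation diversity of large language models after SFT. | large language model |
22 | Trident: Improving Malware Detection with LLMs and Behavioral Features | Trident: improves malware detection using LLMs and behavioral features. | large language model |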

🔬 Pillar 8: Physics-based Animation (2 papers)

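# | Title | One-line summary | Tags | 🔗
23 | Auto-FlexSwitch: Efficient Dynamic Model Merging via Learnable Task Vector Compression | Proposes Auto-FlexSwitch, which achieves efficient dynamic model merging via learnable task vector compression. | PULSE |
24 | Introducing WARM-VR: Benchmark Dataset for Multimodal Wearable Affect Recognition in Virtual Reality | WARM-VR: a benchmark dataset for multimodal wearable affect recognition in virtual reality. | PULSE, multimodal |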

🔬 Pillar 5: Interaction & Reaction (1 paper)

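# | Title | One-line summary | Tags | 🔗
25 | Privacy-Preserving Federated Learning via Differential Privacy and Homomorphic Encryption for Cardiovascular Disease Risk Modeling | Proposes a privacy-preserving federated learning method based on differential privacy and homomorphic encryption for cardiovascular disease risk modeling. | OMOMO |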

🔬 Pillar 4: Generative Motion (1 paper)

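# | Title | One-line summary | Tags | 🔗
26 | ABC: Any-Subset Autoregression via Non-Markovian Diffusion Bridges in Continuous Time and Space | ABC: an any-subset autoregressive model based on non-Markovian diffusion bridges, for generating continuous space-time processes. | physically plausible |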
