cs.LG (2026-04-30)

📊 26 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (13 🔗1) · Pillar 9: Embodied Foundation Models (9 🔗1) · Pillar 8: Physics-based Animation (2) · Pillar 5: Interaction & Reaction (1) · Pillar 4: Generative Motion (1)

🔬 Pillar 2: RL & Architecture (13 papers)

#	Title	Key point	Tags	🔗	⭐
1	BrainDINO: A Brain MRI Foundation Model for Generalizable Clinical Representation Learning	BrainDINO: a brain MRI foundation model for generalizable clinical representation learning.	representation learning foundation model
2	Mind the Gap: Structure-Aware Consistency in Preference Learning	Proposes structure-aware DPO (SA-DPO) to address the inconsistency of standard surrogate losses in LLM preference learning.	preference learning DPO direct preference optimization
3	Kernelized Advantage Estimation: From Nonparametric Statistics to LLM Reasoning	Proposes kernelized advantage estimation to improve policy-learning efficiency for LLM reasoning in resource-constrained settings.	reinforcement learning policy learning large language model
4	Detecting is Easy, Adapting is Hard: Local Expert Growth for Visual Model-Based Reinforcement Learning under Distribution Shift	Proposes JEPA-Indexed Local Expert Growth to address the difficulty of adapting visual MBRL under distribution shift.	reinforcement learning JEPA
5	Exploration Hacking: Can LLMs Learn to Resist RL Training?	Finds that LLMs may resist reinforcement learning training by manipulating their exploration behavior.	reinforcement learning large language model
6	A Unified Framework of Hyperbolic Graph Representation Learning Methods	Proposes a unified framework for hyperbolic graph representation learning that eases method comparison and reproduction.	representation learning
7	CastFlow: Learning Role-Specialized Agentic Workflows for Time Series Forecasting	Proposes CastFlow, a role-specialized agentic workflow for time series forecasting.	reinforcement learning large language model
8	FiLMMeD: Feature-wise Linear Modulation for Cross-Problem Multi-Depot Vehicle Routing	Proposes FiLMMeD, which uses feature-wise linear modulation to solve cross-problem multi-depot vehicle routing problems.	reinforcement learning curriculum learning	✅
9	Exponential families from a single KL identity	Proposes a new KL divergence identity that simplifies the derivation of exponential family distributions.	reinforcement learning RLHF
10	Wasserstein Distributionally Robust Regret Optimization for Reinforcement Learning from Human Feedback	Proposes Wasserstein distributionally robust regret optimization to address reward over-optimization in RLHF.	reinforcement learning PPO RLHF
11	Caracal: Causal Architecture via Spectral Mixing	Proposes Caracal to address the attention computation bottleneck in long-sequence modeling.	Mamba SSM large language model
12	SPLICE: Latent Diffusion over JEPA Embeddings for Conformal Time-Series Inpainting	SPLICE: a latent diffusion model over JEPA embeddings for conformal time-series inpainting.	flow matching JEPA
13	Fair Dataset Distillation via Cross-Group Barycenter Alignment	Proposes a fair dataset distillation method based on cross-group barycenter alignment to address performance disparities across subgroups.	distillation

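🔬 Pillar 9: Embodied Foundation Models (9 papers)

#	Title	Key point	Tags	🔗	⭐
14	FMCL: Class-Aware Client Clustering with Foundation Model Representations for Heterogeneous Federated Learning	FMCL: class-aware client clustering with foundation model representations for heterogeneous federated learning.	foundation model
15	Explainable Load Forecasting with Covariate-Informed Time Series Foundation Models	Proposes an explainable load forecasting method based on covariate-informed time series foundation models.	foundation model
16	Physical Foundation Models: Fixed hardware implementations of large-scale neural networks	Proposes physical foundation models, which use dedicated hardware to implement large-scale neural networks efficiently.	foundation model
17	ChipLingo: A Systematic Training Framework for Large Language Models in EDA	ChipLingo: a systematic training framework for large language models in the EDA domain.	large language model
18	One Pass, Any Order: Position-Invariant Listwise Reranking for LLM-Based Recommendation	InvariRank: a position-invariant listwise reranking framework that addresses order sensitivity in LLM-based recommendation ranking.	large language model	✅
19	Low Rank Adaptation for Adversarial Perturbation	Uses low-rank adaptation to improve the efficiency and effectiveness of adversarial perturbation attacks.	large language model
20	REBENCH: A Procedural, Fair-by-Construction Benchmark for LLMs on Stripped-Binary Types and Names (Extended Version)	REBench: a fair, procedural benchmark for LLMs on recovering types and names from stripped binaries.	large language model
21	Diversity in Large Language Models under Supervised Fine-Tuning	Proposes the Tempered Focal (TOFU) loss to improve the generation diversity of large language models after SFT.	large language model
22	Trident: Improving Malware Detection with LLMs and Behavioral Features	Trident: improves malware detection using LLMs and behavioral features.	large language model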

🔬 Pillar 8: Physics-based Animation (2 papers)

#	Title	Key point	Tags	🔗	⭐
23	Auto-FlexSwitch: Efficient Dynamic Model Merging via Learnable Task Vector Compression	Proposes Auto-FlexSwitch, which achieves efficient dynamic model merging through learnable task vector compression.	PULSE
24	Introducing WARM-VR: Benchmark Dataset for Multimodal Wearable Affect Recognition in Virtual Reality	WARM-VR: a benchmark dataset for multimodal wearable affect recognition in virtual reality.	PULSE multimodal

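🔬 Pillar 5: Interaction & Reaction (1 paper)

#	Title	Key point	Tags	🔗	⭐
25	Privacy-Preserving Federated Learning via Differential Privacy and Homomorphic Encryption for Cardiovascular Disease Risk Modeling	Proposes a privacy-preserving federated learning method based on differential privacy and homomorphic encryption for cardiovascular disease risk modeling.	OMOMO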

🔬 Pillar 4: Generative Motion (1 paper)

#	Title	Key point	Tags	🔗	⭐
26	ABC: Any-Subset Autoregression via Non-Markovian Diffusion Bridges in Continuous Time and Space	ABC: an any-subset autoregressive model based on non-Markovian diffusion bridges for generating continuous space-time processes.	physically plausible
