| 23 | LLM-ODDR: A Large Language Model Framework for Joint Order Dispatching and Driver Repositioning | Proposes LLM-ODDR, a large-language-model framework for jointly optimizing order dispatching and driver repositioning in ride-hailing. | reinforcement learning, spatiotemporal, large language model | |
| 24 | A Closer Look at Multimodal Representation Collapse | Reveals the mechanism behind multimodal representation collapse and proposes an explicit basis reallocation algorithm to improve multimodal fusion. | distillation, multimodal | ✅ |
| 25 | Scaling Offline RL via Efficient and Expressive Shortcut Models | Proposes SORL, which scales offline reinforcement learning with efficient and expressive shortcut models. | reinforcement learning, offline RL, offline reinforcement learning | |
| 26 | SOReL and TOReL: Two Methods for Fully Offline Reinforcement Learning | Proposes SOReL and TOReL to tackle hyperparameter tuning and performance estimation in fully offline reinforcement learning. | reinforcement learning, offline RL, offline reinforcement learning | ✅ |
| 27 | Estimating the Effects of Sample Training Orders for Large Language Models without Retraining | Proposes a retraining-free framework for estimating the effect of training-sample order on large language models. | curriculum learning, large language model | |
| 28 | Preference Learning with Response Time: Robust Losses and Guarantees | Proposes preference learning with response times, improving the sample efficiency of reward-model learning with theoretical guarantees. | preference learning, foundation model | |
| 29 | Skywork Open Reasoner 1 Technical Report | Skywork-OR1: improves the reasoning ability of long-CoT models via reinforcement learning, significantly outperforming models of the same scale. | reinforcement learning, large language model, chain-of-thought | |
| 30 | Reinforcement Learning for Out-of-Distribution Reasoning in LLMs: An Empirical Study on Diagnosis-Related Group Coding | Proposes DRG-Sapphire, using reinforcement learning to address out-of-distribution reasoning of LLMs in DRG coding. | reinforcement learning, large language model | |
| 31 | SDPO: Importance-Sampled Direct Preference Optimization for Stable Diffusion Training | Proposes SDPO to address bias and instability in diffusion-model training. | preference learning, DPO, direct preference optimization | |
| 32 | Scaling Reasoning without Attention | Proposes an attention-free language model to address inefficiency in reasoning. | Mamba, state space model, large language model | |
| 33 | Two-Stage Feature Generation with Transformer and Reinforcement Learning | Proposes a two-stage feature-generation framework based on Transformers and reinforcement learning, improving model performance and adaptability. | reinforcement learning, PPO | |
| 34 | A Provable Approach for End-to-End Safe Reinforcement Learning | Proposes PLS, a provable end-to-end safe reinforcement learning method that guarantees safety throughout learning and deployment. | reinforcement learning | |
| 35 | Contraction Actor-Critic: Contraction Metric-Guided Reinforcement Learning for Robust Path Tracking | Proposes the Contraction Actor-Critic algorithm for robust path tracking under unknown dynamics. | reinforcement learning | |
| 36 | The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models | Proposes an entropy-mechanism-based reinforcement learning method for reasoning language models, addressing policy entropy collapse. | reinforcement learning | |
| 37 | Physics-Informed Distillation of Diffusion Models for PDE-Constrained Generation | Proposes a physics-informed distillation method for PDE-constrained generation with diffusion models. | distillation | |
| 38 | When Does Neuroevolution Outcompete Reinforcement Learning in Transfer Learning Tasks? | Investigates when neuroevolution can outcompete reinforcement learning on transfer learning tasks. | reinforcement learning | |
| 39 | An Augmentation-Aware Theory for Self-Supervised Contrastive Learning | Proposes an augmentation-aware theoretical framework for self-supervised contrastive learning that explicitly models the effect of data augmentation. | contrastive learning | |
| 40 | Weakly-Supervised Contrastive Learning for Imprecise Class Labels | Proposes a graph-based weakly-supervised contrastive learning framework for representation learning under imprecise labels. | contrastive learning | ✅ |
| 41 | FNOPE: Simulation-based inference on function spaces with Fourier Neural Operators | FNOPE: simulation-based inference on function spaces with Fourier Neural Operators, improving the efficiency of spatiotemporal process modeling. | flow matching, spatiotemporal | |
| 42 | Revisiting Group Relative Policy Optimization: Insights into On-Policy and Off-Policy Training | Improves Group Relative Policy Optimization, exploring its use in on-policy and off-policy training. | reinforcement learning, PPO | |
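As background for entry 42: Group Relative Policy Optimization (GRPO) replaces PPO's learned value baseline with a per-prompt group baseline. Below is a minimal sketch of the standard group-relative advantage computation; the function name and group size are illustrative and not taken from the paper.

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """Standard GRPO-style advantages: for one prompt, normalize the
    rewards of a group of sampled completions by the group's own
    mean and standard deviation."""
    r = np.asarray(rewards, dtype=np.float64)  # shape: (group_size,)
    return (r - r.mean()) / (r.std() + eps)

# One prompt, a group of 4 sampled completions with scalar rewards:
adv = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
# Above-mean completions get positive advantage, below-mean negative;
# the advantages within a group sum to (approximately) zero.
```

These per-token-constant advantages then plug into a PPO-style clipped surrogate objective, which is why GRPO is usually discussed alongside on-policy vs. off-policy (importance-ratio) training.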