| # | Title | Summary | Keywords |
| --- | --- | --- | --- |
| 1 | Simple Recipe Works: Vision-Language-Action Models are Natural Continual Learners with Reinforcement Learning | A simple recipe works: vision-language-action models are natural continual learners when trained with reinforcement learning. | reinforcement learning, vision-language-action, VLA |
| 2 | Statistical and structural identifiability in representation learning | Introduces notions of statistical and structural identifiability to improve the stability and interpretability of representation learning models. | representation learning, MAE, foundation model |
| 3 | ARROW: Augmented Replay for RObust World models | ARROW improves world-model robustness via augmented replay, addressing catastrophic forgetting in continual reinforcement learning. | reinforcement learning, world model, dreamer |
| 4 | FlexRec: Adapting LLM-based Recommenders for Flexible Needs via Reinforcement Learning | Proposes FlexRec to address flexible needs in dynamic recommender systems. | reinforcement learning, instruction following |
| 5 | Hybrid Energy-Aware Reward Shaping: A Unified Lightweight Physics-Guided Methodology for Policy Optimization | Proposes Hybrid Energy-Aware Reward Shaping (H-EARS) to improve the efficiency and safety of reinforcement learning in continuous control. | reinforcement learning, deep reinforcement learning, reward shaping |
| 6 | IsoCompute Playbook: Optimally Scaling Sampling Compute for LLM RL | Proposes the IsoCompute Playbook, a strategy for optimally allocating sampling compute in LLM reinforcement learning. | reinforcement learning, large language model |
| 7 | AGMARL-DKS: An Adaptive Graph-Enhanced Multi-Agent Reinforcement Learning for Dynamic Kubernetes Scheduling | Proposes AGMARL-DKS for dynamic Kubernetes scheduling, improving resource utilization and fault tolerance. | reinforcement learning |
| 8 | Causal Representation Learning with Optimal Compression under Complex Treatments | Proposes a causal representation learning method based on optimal compression for estimating individual treatment effects under complex treatments. | representation learning |
| 9 | Disentangled Representation Learning through Unsupervised Symmetry Group Discovery | Proposes a disentangled representation learning method based on unsupervised symmetry group discovery. | representation learning |
| 10 | Entropy-Preserving Reinforcement Learning | Proposes the REPO and ADAPO algorithms to counter the loss of exploration diversity during policy-gradient training. | reinforcement learning |
| 11 | Separable neural architectures as a primitive for unified predictive and generative intelligence | Proposes Separable Neural Architectures (SNA) to unify predictive and generative intelligence across domains such as physics, language, and perception. | reinforcement learning, spatiotemporal |
| 12 | Temporal Straightening for Latent Planning | Proposes temporal straightening to improve latent-space planning in world models. | world model, representation learning |
| 13 | Automatic Generation of High-Performance RL Environments | Proposes a general method for automatically generating high-performance reinforcement learning environments, substantially reducing development cost and time. | reinforcement learning, PPO |
| 14 | SpectralGuard: Detecting Memory Collapse Attacks in State Space Models | Proposes SpectralGuard for detecting memory collapse attacks in state space models. | Mamba, SSM, state space model |
| 15 | Generalist Large Language Models for Molecular Property Prediction: Distilling Knowledge from Specialist Models | Proposes TreeKD, a knowledge distillation method that improves generalist large language models on molecular property prediction. | distillation, large language model |
| 16 | Thermodynamics of Reinforcement Learning Curricula | Uses non-equilibrium thermodynamics to build a geometric framework for reinforcement learning curricula and optimize task scheduling. | reinforcement learning, representation learning, curriculum learning |
| 17 | Spatial PDE-aware Selective State-space with Nested Memory for Mobile Traffic Grid Forecasting | Proposes NeST-S6, a selective state-space model with nested memory and spatial-PDE awareness for mobile traffic grid forecasting. | Mamba, SSM, MAE |
| 18 | Curriculum Sampling: A Two-Phase Curriculum for Efficient Training of Flow Matching | Proposes curriculum sampling, a two-phase training strategy that improves the training efficiency and sample quality of flow matching models. | flow matching |
| 19 | Probing Length Generalization in Mamba via Image Reconstruction | Probes the length-generalization limitations of Mamba via image reconstruction. | Mamba |
|
|
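Entry 10 names entropy preservation but, as a digest line, does not describe REPO or ADAPO themselves. For background only, the sketch below shows the standard entropy-regularized policy-gradient loss that such methods relate to; the function name and the `ent_coef` default are illustrative assumptions, not the papers' algorithms.

```python
import torch
import torch.nn.functional as F

def pg_loss_with_entropy_bonus(logits, actions, advantages, ent_coef=0.01):
    """Entropy-regularized policy-gradient loss (generic background sketch).

    logits:     (batch, n_actions) raw policy outputs
    actions:    (batch,) sampled action indices (long)
    advantages: (batch,) advantage estimates
    """
    log_probs = F.log_softmax(logits, dim=-1)
    act_log_probs = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    # REINFORCE-style surrogate: maximize E[A * log pi(a|s)]
    pg = -(advantages * act_log_probs).mean()
    # Policy entropy; subtracting ent_coef * entropy from the loss
    # penalizes collapse toward a near-deterministic policy.
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1).mean()
    return pg - ent_coef * entropy
```

A fixed bonus like this only slows entropy decay rather than preserving it; methods aimed at entropy preservation presumably adapt or constrain this term, but the digest gives no details.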
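Likewise, entry 18's two-phase curriculum for flow matching is not spelled out here. The sketch below pairs the standard conditional flow-matching objective (linear path `x_t = (1 - t) * x0 + t * x1` with constant target velocity `x1 - x0`) with a hypothetical two-phase timestep schedule; `phase_frac` and the sqrt-biased second phase are assumptions for illustration, not the paper's schedule.

```python
import torch

def flow_matching_loss(model, x1, step, total_steps, phase_frac=0.5):
    """Conditional flow-matching loss with a hypothetical two-phase
    timestep curriculum (illustrative only; not the paper's schedule)."""
    x0 = torch.randn_like(x1)                      # noise endpoint
    u = torch.rand(x1.size(0), device=x1.device)
    # Phase 1: t ~ Uniform(0, 1). Phase 2: t = sqrt(u), density 2t,
    # which biases samples toward t = 1 (near the data endpoint).
    t = u if step < phase_frac * total_steps else u.sqrt()
    t_ = t.view(-1, *([1] * (x1.dim() - 1)))       # broadcast over data dims
    xt = (1.0 - t_) * x0 + t_ * x1                 # linear probability path
    target_v = x1 - x0                             # constant target velocity
    pred_v = model(xt, t)                          # model predicts velocity
    return ((pred_v - target_v) ** 2).mean()
```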