cs.LG (2026-03-12)

📊 38 papers total | 🔗 1 with code

🎯 Navigate by Interest Area

Pillar 2: RL Algorithms & Architecture (19) · Pillar 9: Embodied Foundation Models (16) · Pillar 1: Robot Control (1 🔗1) · Pillar 6: Video Extraction (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL Algorithms & Architecture (19 papers)

# | Title | One-line takeaway | Tags | 🔗
1 | Simple Recipe Works: Vision-Language-Action Models are Natural Continual Learners with Reinforcement Learning | A simple recipe works: vision-language-action models are natural continual learners under reinforcement learning. | reinforcement learning, vision-language-action, VLA
2 | Statistical and structural identifiability in representation learning | Introduces statistical and structural identifiability notions to improve the stability and interpretability of representation learning models. | representation learning, MAE, foundation model
3 | ARROW: Augmented Replay for RObust World models | ARROW improves world-model robustness via augmented replay, addressing catastrophic forgetting in continual reinforcement learning. | reinforcement learning, world model, dreamer
4 | FlexRec: Adapting LLM-based Recommenders for Flexible Needs via Reinforcement Learning | Proposes FlexRec to adapt LLM-based recommenders to the flexible needs of dynamic recommendation systems. | reinforcement learning, instruction following
5 | Hybrid Energy-Aware Reward Shaping: A Unified Lightweight Physics-Guided Methodology for Policy Optimization | Proposes Hybrid Energy-Aware Reward Shaping (H-EARS) to improve the efficiency and safety of reinforcement learning in continuous control. | reinforcement learning, deep reinforcement learning, reward shaping
6 | IsoCompute Playbook: Optimally Scaling Sampling Compute for LLM RL | Proposes the IsoCompute Playbook to optimize how sampling compute is allocated in LLM reinforcement learning. | reinforcement learning, large language model
7 | AGMARL-DKS: An Adaptive Graph-Enhanced Multi-Agent Reinforcement Learning for Dynamic Kubernetes Scheduling | Proposes AGMARL-DKS for dynamic Kubernetes scheduling, improving resource utilization and fault tolerance. | reinforcement learning
8 | Causal Representation Learning with Optimal Compression under Complex Treatments | Proposes causal representation learning based on optimal compression for estimating individual treatment effects under complex treatments. | representation learning
9 | Disentangled Representation Learning through Unsupervised Symmetry Group Discovery | Proposes disentangled representation learning via unsupervised symmetry group discovery. | representation learning
10 | Entropy-Preserving Reinforcement Learning | Proposes the REPO and ADAPO algorithms to counter the loss of exploration diversity during policy-gradient training. | reinforcement learning
11 | Separable neural architectures as a primitive for unified predictive and generative intelligence | Proposes separable neural architectures (SNAs) that unify predictive and generative intelligence across domains such as physics, language, and perception. | reinforcement learning, spatiotemporal
12 | Temporal Straightening for Latent Planning | Proposes temporal straightening to improve latent-space planning in world models. | world model, representation learning
13 | Automatic Generation of High-Performance RL Environments | Proposes a general method for automatically generating high-performance reinforcement learning environments, significantly reducing development cost and time. | reinforcement learning, PPO
14 | SpectralGuard: Detecting Memory Collapse Attacks in State Space Models | Proposes SpectralGuard for detecting memory collapse attacks in state space models. | Mamba, SSM, state space model
15 | Generalist Large Language Models for Molecular Property Prediction: Distilling Knowledge from Specialist Models | Proposes the TreeKD knowledge distillation method to improve generalist LLM performance on molecular property prediction. | distillation, large language model
16 | Thermodynamics of Reinforcement Learning Curricula | Uses non-equilibrium thermodynamics to build a geometric framework for reinforcement learning curricula, optimizing task scheduling. | reinforcement learning, representation learning, curriculum learning
17 | Spatial PDE-aware Selective State-space with Nested Memory for Mobile Traffic Grid Forecasting | Proposes NeST-S6, a selective state-space model with nested memory and spatial PDE awareness, for mobile traffic grid forecasting. | Mamba, SSM, MAE
18 | Curriculum Sampling: A Two-Phase Curriculum for Efficient Training of Flow Matching | Proposes curriculum sampling, a two-phase training strategy that improves the training efficiency and generation quality of flow matching models. | flow matching
19 | Probing Length Generalization in Mamba via Image Reconstruction | Probes the length-generalization limits of Mamba models via image reconstruction. | Mamba

🔬 Pillar 9: Embodied Foundation Models (16 papers)

# | Title | One-line takeaway | Tags | 🔗
20 | Chem4DLLM: 4D Multimodal LLMs for Chemical Dynamics Understanding | Proposes Chem4DLLM, a 4D multimodal large language model for understanding chemical dynamics. | large language model, multimodal
21 | Cornserve: A Distributed Serving System for Any-to-Any Multimodal Models | Proposes Cornserve to serve any-to-any multimodal models at scale. | multimodal
22 | Exhaustive Circuit Mapping of a Single-Cell Foundation Model Reveals Massive Redundancy, Heavy-Tailed Hub Architecture, and Layer-Dependent Differentiation Control | Exhaustive circuit mapping of a single-cell foundation model reveals massive redundancy, a heavy-tailed hub architecture, and layer-dependent differentiation control. | foundation model
23 | Wasserstein Gradient Flows for Batch Bayesian Optimal Experimental Design | Proposes a Wasserstein-gradient-flow method for batch Bayesian optimal experimental design, improving efficiency in high-dimensional non-convex optimization. | multimodal
24 | Resource-Efficient Iterative LLM-Based NAS with Feedback Memory | Proposes an iterative LLM-driven NAS method with feedback memory that performs resource-efficient architecture search on a single GPU. | large language model
25 | Frequentist Consistency of Prior-Data Fitted Networks for Causal Inference | Proposes a calibration method for PFN-based ATE estimation in causal inference, addressing prior-induced confounding bias. | foundation model
26 | MobileKernelBench: Can LLMs Write Efficient Kernels for Mobile Devices? | MobileKernelBench evaluates LLMs' ability to generate efficient kernels for mobile devices and proposes MoKA to address current LLMs' shortcomings. | large language model
27 | Language Generation with Replay: A Learning-Theoretic View of Model Collapse | Analyzes collapse in generative language models from a learning-theoretic perspective and proposes a replay-based adversarial learning framework. | large language model
28 | KEPo: Knowledge Evolution Poison on Graph-based Retrieval-Augmented Generation | Proposes KEPo, a knowledge-evolution poisoning attack on graph-based retrieval-augmented generation. | large language model
29 | Deep Learning Network-Temporal Models For Traffic Prediction | Proposes deep learning network-temporal models for traffic prediction, improving accuracy on complex network data. | large language model
30 | Learning Pore-scale Multiphase Flow from 4D Velocimetry | Proposes a multimodal learning framework for predicting multiphase flow in porous media. | multimodal
31 | TaxBreak: Unmasking the Hidden Costs of LLM Inference Through Overhead Decomposition | TaxBreak uncovers the hidden costs of LLM inference through overhead decomposition, optimizing the host-device balance. | large language model
32 | Overcoming the Modality Gap in Context-Aided Forecasting | Proposes semi-synthetic data augmentation to bridge the modality gap in context-aided forecasting. | multimodal
33 | KernelFoundry: Hardware-aware evolutionary GPU kernel optimization | KernelFoundry is a hardware-aware evolutionary GPU kernel optimization framework. | large language model
34 | NeuroLoRA: Context-Aware Neuromodulation for Parameter-Efficient Multi-Task Adaptation | NeuroLoRA uses context-aware neuromodulation for parameter-efficient multi-task adaptation. | large language model
35 | Global Evolutionary Steering: Refining Activation Steering Control via Cross-Layer Consistency | Proposes GER-steer, which refines activation steering vectors via cross-layer consistency to make LLM steering control more reliable. | large language model

🔬 Pillar 1: Robot Control (1 paper)

# | Title | One-line takeaway | Tags | 🔗
36 | Cross-Domain Policy Optimization via Bellman Consistency and Hybrid Critics | Proposes cross-domain policy optimization based on Bellman consistency and hybrid critics, improving reinforcement learning data efficiency. | locomotion, manipulation, reinforcement learning | 🔗

🔬 Pillar 6: Video Extraction (1 paper)

# | Title | One-line takeaway | Tags | 🔗
37 | Matching Features, Not Tokens: Energy-Based Fine-Tuning of Language Models | Proposes Energy-Based Fine-Tuning (EBFT), which optimizes sequence-level language-model behavior via feature matching. | feature matching

🔬 Pillar 8: Physics-based Animation (1 paper)

# | Title | One-line takeaway | Tags | 🔗
38 | Uncovering Locally Low-dimensional Structure in Networks by Locally Optimal Spectral Embedding | Proposes locally optimal spectral embedding (LASE) to resolve the local geometric structure that global spectral embeddings blur in complex networks. | ASE
