cs.LG (2026-06-09)

📊 35 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (18 🔗3) · Pillar 9: Embodied Foundation Models (14) · Pillar 8: Physics-based Animation (2) · Pillar 1: Robot Control (1)

🔬 Pillar 2: RL Algorithms & Architecture (18 papers)

#	Title	Key point	Tags	🔗	⭐
1	Fast and Highly Expressive Policy Learning for Offline Reinforcement Learning via Bootstrapped Flow Q-Learning	Proposes Bootstrapped Flow Q-Learning to address computational complexity in offline reinforcement learning	reinforcement learning policy learning offline reinforcement learning
2	When to Align, When to Predict: A Phase Diagram for Multimodal Learning	Proposes a unified framework to optimize alignment and prediction in multimodal learning	representation learning multimodal	✅
3	Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning	Proposes the QGF algorithm to address policy optimization in reinforcement learning	reinforcement learning policy learning offline RL
4	TRACE: A Unified Rollout Budget Allocation Framework for Efficient Agentic Reinforcement Learning	Proposes the TRACE framework to address budget allocation in multi-turn reinforcement learning	reinforcement learning large language model
5	AuRA: Internalizing Audio Understanding into LLMs as LoRA	Proposes AuRA to address the efficiency of combining audio understanding with large language models	distillation large language model multimodal
6	Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models	Proposes Flow-DPPO to address policy optimization for flow matching models	reinforcement learning PPO flow matching	✅
7	Mitigating Bias in Low-SNR Financial Reinforcement Learning via Quantum Representations	Proposes FPQC-SAC to address bias in low-SNR financial reinforcement learning	reinforcement learning deep reinforcement learning SAC	✅
8	Beyond Absolute Imitation: Anchored Residual Guidance for Privileged On-Policy Distillation	Proposes AR-OPD to address the inference mismatch between teacher and student models	teacher-student distillation privileged information
9	Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning	Proposes CPPO to address the trust-region problem in LLM reinforcement learning	reinforcement learning PPO
10	Event-Driven Reinforcement Learning Enables Long-Horizon Control in Semiconductor Fabrication	Proposes a deep reinforcement learning framework for long-horizon control in semiconductor fabrication	reinforcement learning deep reinforcement learning
11	One Step Closer to Ground Truth: A Multi-Scale Residual-Aware Representation Learning Pipeline for Predicting Time Series Data	Proposes a multi-scale residual-aware representation learning pipeline to improve time series forecasting	representation learning MAE
12	Machine Learning Methods for Studying Latent Neural Activity Dynamics	Surveys machine learning methods for latent neural activity dynamics	latent dynamics contrastive learning foundation model
13	On-sky demonstration of reinforcement learning for adaptive optics control	Proposes PO4AO to address real-time optimization in adaptive optics control	reinforcement learning
14	Geometry-Aware Reinforcement Learning for 2D Irregular Nesting	Proposes geometry-aware reinforcement learning to solve the 2D irregular nesting problem	reinforcement learning
15	How Does Reasoning Flow? Tracing Attention-Induced Information Flow for Targeted RL in LLMs	Proposes FlowTracer to address credit assignment in reinforcement learning for large language models	reinforcement learning large language model
16	Revisiting Positive Samples in Graph Contrastive Learning: From the Perspective of Message Passing	Proposes SPGCL to address the underuse of positive samples in graph contrastive learning	contrastive learning
17	Flexible Flows for Biological Sequence Design	Proposes a flexible flow model to optimize biological sequence design	flow matching classifier-free guidance
18	Representation-Aware Advantage Estimation: Your Reward Model Provides More Than A Scalar Output	Proposes representation-based advantage estimation to improve reinforcement learning from human feedback	reinforcement learning RLHF

🔬 Pillar 9: Embodied Foundation Models (14 papers)

#	Title	Key point	Tags	🔗	⭐
19	CITRAS-FM: Tiny Time Series Foundation Model for Covariate-Informed Zero-Shot Forecasting	Proposes CITRAS-FM to address the computational cost of zero-shot time series forecasting	foundation model
20	MemVenom: Triggered Poisoning of Multimodal Memories in Web Agents	Proposes MemVenom to address malicious content injection in multimodal memories	multimodal
21	Towards Diverse Scientific Hypothesis Search with Large Language Models	Proposes an LLM-based method for diverse scientific hypothesis search	large language model
22	A Unified Adaptive Feature Composition Framework for Multi-Task Generalization in Wireless Foundation Models	Proposes a unified adaptive feature composition framework to address multi-task generalization in wireless foundation models	foundation model
23	Baseline-Free Policy Optimization for Neural Combinatorial Optimization	Proposes baseline-free policy optimization for neural combinatorial optimization problems	large language model
24	CLP: Collocation-Length Prediction for Zero-Loss Adaptive Multi-Token Inference	Proposes CLP to address multi-token inference in autoregressive decoding	large language model
25	Optimal Post-Training Quantization Scales and Where to Find Them	Proposes the PiSO algorithm to optimize weight scaling factors for post-training quantization	large language model
26	N-GRPO: Embedding-Level Neighbor Mixing for Enhanced Policy Optimization	Proposes N-GRPO to address diversity and consistency in mathematical reasoning	large language model
27	Unifying Data, Memory, and Compute Efficiency in LLM training: A Survey	Surveys efficient LLM training methods under resource constraints, covering data, memory, and compute bottlenecks	large language model
28	Do LLMs Make Neural Distinguishers Wise?	Proposes LLM-based neural distinguishers to strengthen cryptanalysis	large language model
29	Causal Ensemble Agent: Hierarchical Causal Discovery with LLM-guided Expert Reweighting	Proposes Causal Ensemble Agent to address inconsistency in causal discovery	large language model
30	Stop Early, Spend Less: Hidden-State Probes as a Practical Recipe for Streaming Moderation of LLM Outputs	Proposes hidden-state probes for efficient streaming safety filtering of LLM outputs	large language model
31	Advancing the State-of-the-Art in Empirical Privacy Auditing	Proposes synthetic examples to make privacy auditing more effective	large language model
32	ERAlign: Energy-based Representation Alignment of GNNs and LLMs on Text-attributed Graphs	Proposes the ERAlign framework to address representation alignment between GNNs and LLMs on text-attributed graphs	large language model

🔬 Pillar 8: Physics-based Animation (2 papers)

#	Title	Key point	Tags	🔗	⭐
33	Transformer Based Model for Spatiotemporal Feature Learning in EEG Emotion Recognition	Proposes EEG-TransNet to improve the accuracy of EEG emotion recognition	spatiotemporal
34	MoE Enhanced Federated Learning for Spatiotemporal Prediction	Proposes MoE-FedTP to address data scarcity in cross-city traffic prediction	spatiotemporal

🔬 Pillar 1: Robot Control (1 paper)

#	Title	Key point	Tags	🔗	⭐
35	MODIP: Efficient Model-Based Optimization for Diffusion Policies	Proposes the MODIP framework for efficient online fine-tuning of diffusion policies	MPC model predictive control reinforcement learning
