cs.LG (2026-06-09)

📊 35 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (18 🔗3) · Pillar 9: Embodied Foundation Models (14) · Pillar 8: Physics-based Animation (2) · Pillar 1: Robot Control (1)

🔬 Pillar 2: RL & Architecture (18 papers)

# | Title | One-line summary | Tags | 🔗
1 | Fast and Highly Expressive Policy Learning for Offline Reinforcement Learning via Bootstrapped Flow Q-Learning | Proposes Bootstrapped Flow Q-Learning to address computational complexity in offline reinforcement learning | reinforcement learning, policy learning, offline reinforcement learning
2 | When to Align, When to Predict: A Phase Diagram for Multimodal Learning | Proposes a unified framework for optimizing alignment and prediction in multimodal learning | representation learning, multimodal
3 | Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning | Proposes the QGF algorithm to address policy optimization in reinforcement learning | reinforcement learning, policy learning, offline RL
4 | TRACE: A Unified Rollout Budget Allocation Framework for Efficient Agentic Reinforcement Learning | Proposes the TRACE framework to address budget allocation in multi-turn reinforcement learning | reinforcement learning, large language model
5 | AuRA: Internalizing Audio Understanding into LLMs as LoRA | Proposes AuRA to address the efficiency of combining audio understanding with large language models | distillation, large language model, multimodal
6 | Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models | Proposes Flow-DPPO to address policy optimization for flow matching models | reinforcement learning, PPO, flow matching
7 | Mitigating Bias in Low-SNR Financial Reinforcement Learning via Quantum Representations | Proposes FPQC-SAC to address bias in low-SNR financial reinforcement learning | reinforcement learning, deep reinforcement learning, SAC
8 | Beyond Absolute Imitation: Anchored Residual Guidance for Privileged On-Policy Distillation | Proposes AR-OPD to address the reasoning mismatch between teacher and student models | teacher-student, distillation, privileged information
9 | Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning | Proposes CPPO to address the trust-region problem in LLM reinforcement learning | reinforcement learning, PPO
10 | Event-Driven Reinforcement Learning Enables Long-Horizon Control in Semiconductor Fabrication | Proposes a deep reinforcement learning framework to optimize long-horizon control in semiconductor fabrication | reinforcement learning, deep reinforcement learning
11 | One Step Closer to Ground Truth: A Multi-Scale Residual-Aware Representation Learning Pipeline for Predicting Time Series Data | Proposes a multi-scale residual-aware representation learning pipeline to improve time series forecasting | representation learning, MAE
12 | Machine Learning Methods for Studying Latent Neural Activity Dynamics | Surveys machine learning methods for latent neural activity dynamics | latent dynamics, contrastive learning, foundation model
13 | On-sky demonstration of reinforcement learning for adaptive optics control | Proposes PO4AO to address real-time optimization in adaptive optics control | reinforcement learning
14 | Geometry-Aware Reinforcement Learning for 2D Irregular Nesting | Proposes geometry-aware reinforcement learning to address the 2D irregular nesting problem | reinforcement learning
15 | How Does Reasoning Flow? Tracing Attention-Induced Information Flow for Targeted RL in LLMs | Proposes FlowTracer to address reinforcement learning credit assignment in large language models | reinforcement learning, large language model
16 | Revisiting Positive Samples in Graph Contrastive Learning: From the Perspective of Message Passing | Proposes SPGCL to address the underuse of positive samples in graph contrastive learning | contrastive learning
17 | Flexible Flows for Biological Sequence Design | Proposes flexible flow models to optimize biological sequence design | flow matching, classifier-free guidance
18 | Representation-Aware Advantage Estimation: Your Reward Model Provides More Than A Scalar Output | Proposes representation-based advantage estimation to improve reinforcement learning from human feedback | reinforcement learning, RLHF

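🔬 Pillar 9: Embodied Foundation Models (14 papers)

# | Title | One-line summary | Tags | 🔗
19 | CITRAS-FM: Tiny Time Series Foundation Model for Covariate-Informed Zero-Shot Forecasting | Proposes CITRAS-FM to address computational cost in zero-shot time series forecasting | foundation model
20 | MemVenom: Triggered Poisoning of Multimodal Memories in Web Agents | Proposes MemVenom to address malicious content injection into multimodal memories | multimodal
21 | Towards Diverse Scientific Hypothesis Search with Large Language Models | Proposes a large-language-model-based method for diverse scientific hypothesis search | large language model
22 | A Unified Adaptive Feature Composition Framework for Multi-Task Generalization in Wireless Foundation Models | Proposes a unified adaptive feature composition framework to address multi-task generalization in wireless foundation models | foundation model
23 | Baseline-Free Policy Optimization for Neural Combinatorial Optimization | Proposes baseline-free policy optimization to address neural combinatorial optimization | large language model
24 | CLP: Collocation-Length Prediction for Zero-Loss Adaptive Multi-Token Inference | Proposes CLP to address multi-token inference in autoregressive decoding | large language model
25 | Optimal Post-Training Quantization Scales and Where to Find Them | Proposes the PiSO algorithm to optimize weight scaling factors in post-training quantization | large language model
26 | N-GRPO: Embedding-Level Neighbor Mixing for Enhanced Policy Optimization | Proposes N-GRPO to address diversity and consistency in mathematical reasoning | large language model
27 | Unifying Data, Memory, and Compute Efficiency in LLM training: A Survey | Proposes efficient LLM training methods under resource constraints to address data, memory, and compute bottlenecks | large language model
28 | Do LLMs Make Neural Distinguishers Wise? | Proposes large-language-model-based neural distinguishers to strengthen cryptanalysis | large language model
29 | Causal Ensemble Agent: Hierarchical Causal Discovery with LLM-guided Expert Reweighting | Proposes Causal Ensemble Agent to address inconsistency in causal discovery | large language model
30 | Stop Early, Spend Less: Hidden-State Probes as a Practical Recipe for Streaming Moderation of LLM Outputs | Proposes hidden-state probes for efficient streaming safety filtering of LLM outputs | large language model
31 | Advancing the State-of-the-Art in Empirical Privacy Auditing | Proposes synthetic examples to improve the effectiveness of privacy auditing | large language model
32 | ERAlign: Energy-based Representation Alignment of GNNs and LLMs on Text-attributed Graphs | Proposes the ERAlign framework to address representation alignment between GNNs and LLMs on text-attributed graphs | large language model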

🔬 Pillar 8: Physics-based Animation (2 papers)

# | Title | One-line summary | Tags | 🔗
33 | Transformer Based Model for Spatiotemporal Feature Learning in EEG Emotion Recognition | Proposes EEG-TransNet to improve the accuracy of EEG emotion recognition | spatiotemporal
34 | MoE Enhanced Federated Learning for Spatiotemporal Prediction | Proposes MoE-FedTP to address data scarcity in cross-city traffic prediction | spatiotemporal

🔬 Pillar 1: Robot Control (1 paper)

# | Title | One-line summary | Tags | 🔗
35 | MODIP: Efficient Model-Based Optimization for Diffusion Policies | Proposes the MODIP framework for efficient online fine-tuning of diffusion policies | MPC, model predictive control, reinforcement learning
