cs.LG(2026-01-21)

📊 22 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (10 🔗2) · Pillar 9: Embodied Foundation Models (10) · Pillar 4: Generative Motion (1) · Pillar 1: Robot Control (1)

🔬 Pillar 2: RL Algorithms & Architecture (10 papers)

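# | Title | One-sentence summary | Tags | 🔗
1 | PCL-Reasoner-V1.5: Advancing Math Reasoning with Offline Reinforcement Learning | PCL-Reasoner-V1.5: improves mathematical reasoning with offline reinforcement learning. | reinforcement learning, offline RL, offline reinforcement learning |
2 | A Curriculum-Based Deep Reinforcement Learning Framework for the Electric Vehicle Routing Problem | Proposes a curriculum-learning-based deep reinforcement learning framework for the electric vehicle routing problem. | reinforcement learning, deep reinforcement learning, DRL |
3 | Multimodal Rumor Detection Enhanced by External Evidence and Forgery Features | Proposes a multimodal rumor detection model that fuses external evidence with forgery features to improve rumor identification accuracy on social media. | contrastive learning, multimodal |
4 | CLEANER: Self-Purified Trajectories Boost Agentic Reinforcement Learning | CLEANER: self-purified trajectories improve agentic reinforcement learning performance. | reinforcement learning, large language model |
5 | Plug-and-Play Benchmarking of Reinforcement Learning Algorithms for Large-Scale Flow Control | FluidGym: the first fully differentiable reinforcement learning benchmark platform for flow control. | reinforcement learning, PPO, SAC |
6 | CoScale-RL: Efficient Post-Training by Co-Scaling Data and Computation | CoScale-RL: efficient post-training of large models by co-scaling data and computation. | reinforcement learning, distillation, foundation model |
7 | Outcome-Based RL Provably Leads Transformers to Reason, but Only With the Right Data | Outcome-based reinforcement learning can lead Transformers to reason, but only with suitable data. | reinforcement learning, chain-of-thought |
8 | Memory Retention Is Not Enough to Master Memory Tasks in Reinforcement Learning | Proposes a memory-rewriting benchmark that reveals the limitations of existing reinforcement learning memory models. | reinforcement learning |
9 | Beyond Error-Based Optimization: Experience-Driven Symbolic Regression with Goal-Conditioned Reinforcement Learning | Proposes the EGRL-SR framework to address search efficiency in symbolic regression. | reinforcement learning |
10 | What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study | Proposes Reasoning-QAT, an efficient low-bit quantization-aware training method for reasoning LLMs. | reinforcement learning, distillation |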

🔬 Pillar 9: Embodied Foundation Models (10 papers)

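# | Title | One-sentence summary | Tags | 🔗
11 | Overcoming In-Memory Bottlenecks in Graph Foundation Models via Retrieval-Augmented Generation | Proposes RAG-GFM, which overcomes in-memory bottlenecks in graph foundation models through retrieval-augmented generation. | foundation model |
12 | InstructTime++: Time Series Classification with Multimodal Language Modeling via Implicit Feature Enhancement | InstructTime++: time series classification with multimodal language models via implicit feature enhancement. | multimodal |
13 | Counterfactual Modeling with Fine-Tuned LLMs for Health Intervention Design and Sensor Data Augmentation | Counterfactual modeling with fine-tuned LLMs for health intervention design and sensor data augmentation. | large language model, multimodal |
14 | Mixture-of-Experts Models in Vision: Routing, Optimization, and Generalization | Studies routing, optimization, and generalization in vision MoE models and characterizes their behavior in image classification. | large language model |
15 | SmartOracle - An Agentic Approach to Mitigate Noise in Differential Oracles | SmartOracle: an agent-based approach to mitigating noise in differential oracles. | large language model |
16 | Fine-Grained Traceability for Transparent ML Pipelines | FG-Trac: a verifiable, fine-grained traceability framework for machine learning pipelines. | multimodal |
17 | Tailoring Adverse Event Prediction in Type 1 Diabetes with Patient-Specific Deep Learning Models | Proposes deep learning models built on patient-specific data to improve adverse event prediction in type 1 diabetes. | multimodal |
18 | Adaptive Exponential Integration for Stable Gaussian Mixture Black-Box Variational Inference | Proposes an adaptive exponential integration method for stable, efficient Gaussian mixture black-box variational inference. | multimodal |
19 | Reflecting in the Reflection: Integrating a Socratic Questioning Framework into Automated AI-Based Question Generation | Proposes a reflection-in-reflection framework that uses Socratic questioning to automatically generate high-quality reflective questions. | large language model |
20 | Variance-Adaptive Muon: Accelerating LLM Pretraining with NSR-Modulated and Variance-Scaled Momentum | Proposes a variance-adaptive Muon optimizer that accelerates LLM pretraining and lowers validation loss. | large language model |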

🔬 Pillar 4: Generative Motion (1 paper)

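# | Title | One-sentence summary | Tags | 🔗
21 | Mechanism Shift During Post-training from Autoregressive to Masked Diffusion Language Models | Reveals the mechanism shift that occurs during post-training from autoregressive to masked diffusion language models. | MDM |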

🔬 Pillar 1: Robot Control (1 paper)

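# | Title | One-sentence summary | Tags | 🔗
22 | HyperNet-Adaptation for Diffusion-Based Test Case Generation | Proposes HyNeA, which adapts diffusion models via hypernetworks to efficiently generate deep learning test cases. | manipulation |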
