cs.LG(2026-02-15)

📊 23 papers in total

🎯 Interest Area Navigation

Pillar 2: RL Algorithms and Architecture (11) · Pillar 9: Embodied Foundation Models (10) · Pillar 1: Robot Control (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL Algorithms and Architecture (11 papers)

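# | Title | Key point | Tags
1 | You Can Learn Tokenization End-to-End with Reinforcement Learning | Proposes an end-to-end tokenization method based on reinforcement learning to improve large language model performance | reinforcement learning, large language model
2 | Zero-Shot Instruction Following in RL via Structured LTL Representations | Proposes a zero-shot instruction-following method for reinforcement learning based on structured LTL representations | reinforcement learning, instruction following
3 | EIDOS: Latent-Space Predictive Learning for Time Series Foundation Models | EIDOS: a latent-space predictive learning framework for time series foundation models | latent dynamics, foundation model
4 | Deep Dense Exploration for LLM Reinforcement Learning via Pivot-Driven Resampling | Proposes Deep Dense Exploration to address the exploration problem in reinforcement learning for large language models | reinforcement learning, policy learning, large language model
5 | Train Less, Learn More: Adaptive Efficient Rollout Optimization for Group-Based Reinforcement Learning | AERO: adaptive efficient rollout optimization that improves the efficiency of LLM fine-tuning with group-based reinforcement learning | reinforcement learning, large language model
6 | DeepFusion: Accelerating MoE Training via Federated Knowledge Distillation from Heterogeneous Edge Devices | DeepFusion: accelerates MoE model training via federated knowledge distillation from heterogeneous edge devices | distillation, large language model
7 | KernelBlaster: Continual Cross-Task CUDA Optimization via Memory-Augmented In-Context Reinforcement Learning | KernelBlaster: continual cross-task CUDA optimization via memory-augmented in-context reinforcement learning | reinforcement learning, large language model
8 | QuRL: Efficient Reinforcement Learning with Quantized Rollout | QuRL: accelerates training for reinforcement learning with verifiable rewards via quantized rollouts | reinforcement learning, large language model
9 | Conformal Signal Temporal Logic for Robust Reinforcement Learning Control: A Case Study | Proposes a robust reinforcement learning control method based on a conformal STL shield to improve flight control reliability | reinforcement learning, PPO
10 | Radial-VCReg: More Informative Representation Learning Through Radial Gaussianization | Proposes Radial-VCReg, which learns more informative self-supervised representations through radial Gaussianization | representation learning
11 | Experiential Reinforcement Learning | Proposes Experiential Reinforcement Learning (ERL), which improves language model learning efficiency in sparse-reward environments through an explicit experience-reflection loop | reinforcement learning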

🔬 Pillar 9: Embodied Foundation Models (10 papers)

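# | Title | Key point | Tags
12 | A Theoretical Framework for LLM Fine-tuning Using Early Stopping for Non-random Initialization | Proposes an early-stopping-based theoretical framework for LLM fine-tuning under non-random initialization | large language model
13 | A Multi-Agent Framework for Code-Guided, Modular, and Verifiable Automated Machine Learning | iML: a code-guided, modular, and verifiable multi-agent framework for automated machine learning | large language model
14 | S2SServiceBench: A Multimodal Benchmark for Last-Mile S2S Climate Services | S2SServiceBench: a multimodal benchmark for last-mile S2S climate services | large language model, multimodal
15 | Multi-Agent Debate: A Unified Agentic Framework for Tabular Anomaly Detection | Proposes MAD, a multi-agent debate framework that improves the robustness and interpretability of tabular anomaly detection | large language model, foundation model
16 | Floe: Federated Specialization for Real-Time LLM-SLM Inference | Floe: a federated specialization framework for real-time LLM-SLM inference | large language model
17 | Machine Learning as a Tool (MLAT): A Framework for Integrating Statistical ML Models as Callable Tools within LLM Agent Workflows | Proposes the MLAT framework, which exposes pretrained ML models as callable tools within LLM agent workflows to enable contextual reasoning | large language model
18 | Whom to Query for What: Adaptive Group Elicitation via Multi-Turn LLM Interactions | Proposes an adaptive group elicitation method based on multi-turn LLM interactions for inferring group attributes under budget constraints | large language model
19 | Fast Catch-Up, Late Switching: Optimal Batch Size Scheduling via Functional Scaling Laws | Optimizes batch size scheduling using functional scaling laws, yielding fast catch-up and late switching | large language model
20 | ROAST: Rollout-based On-distribution Activation Steering Technique | Proposes ROAST, an LLM activation steering technique based on the model's own rollouts, to improve task performance | large language model
21 | MC$^2$Mark: Distortion-Free Multi-Bit Watermarking for Long Messages | Proposes MC$^2$Mark to address quality and strength issues in watermark embedding for long messages | large language model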

🔬 Pillar 1: Robot Control (1 paper)

# | Title | Key point | Tags
22 | WIMLE: Uncertainty-Aware World Models with IMLE for Sample-Efficient Continuous Control | WIMLE: an uncertainty-aware world model based on IMLE that improves sample efficiency in continuous control | humanoid, reinforcement learning, world model

🔬 Pillar 8: Physics-based Animation (1 paper)

# | Title | Key point | Tags
23 | KoopGen: Koopman Generator Networks for Representing and Predicting Dynamical Systems with Continuous Spectra | Proposes KoopGen to address the prediction of high-dimensional dynamical systems | spatiotemporal
