cs.LG (2025-05-19)
📊 19 papers in total | 🔗 2 with code
🎯 Interest Area Navigation
🔬 Pillar 9: Embodied Foundation Models (10 papers)
🔬 Pillar 2: RL Algorithms & Architecture (9 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 11 | Policy-Driven World Model Adaptation for Robust Offline Model-based Reinforcement Learning | Proposes a policy-driven world model adaptation method that improves the robustness of offline MBRL in noisy environments. | reinforcement learning policy learning offline reinforcement learning | ||
| 12 | Modular Diffusion Policy Training: Decoupling and Recombining Guidance and Diffusion for Offline RL | Proposes modular diffusion policy training that decouples the guidance and diffusion models, improving offline RL performance. | reinforcement learning offline RL diffusion policy | ||
| 13 | Temporal Distance-aware Transition Augmentation for Offline Model-based Reinforcement Learning | Proposes TempDATA, a temporal distance-aware transition augmentation method, to address sparse-reward, long-horizon tasks in offline MBRL. | reinforcement learning offline reinforcement learning model-based RL | ||
| 14 | HR-VILAGE-3K3M: A Human Respiratory Viral Immunization Longitudinal Gene Expression Dataset for Systems Immunity | Builds HR-VILAGE-3K3M, a longitudinal gene expression dataset on human respiratory viral immunization, for AI-driven systems immunity research. | predictive model foundation model multimodal | ||
| 15 | 4Hammer: a board-game reinforcement learning environment for the hour long time frame | Proposes the 4Hammer environment for evaluating reinforcement learning and LLMs on long-horizon, complex board games. | reinforcement learning large language model | ||
| 16 | Mean Flows for One-step Generative Modeling | Proposes MeanFlow, which models average velocity to enable efficient one-step generative modeling and significantly improves image generation quality. | flow matching curriculum learning distillation | ||
| 17 | RL in Name Only? Analyzing the Structural Assumptions in RL post-training for LLMs | Analyzes the structural assumptions behind RL fine-tuning of LLMs, revealing that it degenerates into supervised learning. | reinforcement learning large language model | ||
| 18 | Optimizing Anytime Reasoning via Budget Relative Policy Optimization | Proposes AnytimeReasoner, which uses budget relative policy optimization to improve LLM reasoning performance under varying compute budgets. | reinforcement learning large language model | ||
| 19 | One-Step Offline Distillation of Diffusion-based Models via Koopman Modeling | Proposes KDM, a Koopman-theory-based one-step offline distillation method for diffusion models that accelerates generation. | distillation |