cs.LG (2025-09-18)
📊 13 papers in total | 🔗 2 with code
🎯 Interest Area Navigation
Pillar 2: RL Algorithms and Architecture (7 🔗2)
Pillar 9: Embodied Foundation Models (3)
Pillar 8: Physics-based Animation (2)
Pillar 1: Robot Control (1)
🔬 Pillar 2: RL Algorithms and Architecture (7 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Self-Improving Embodied Foundation Models | Proposes a self-improving embodied foundation model approach for autonomous robot skill learning and generalization. | reinforcement learning, imitation learning, large language model | | |
| 2 | Exploring multimodal implicit behavior learning for vehicle navigation in simulated cities | Proposes data-augmented implicit behavior cloning to address multimodal decision-making in urban vehicle navigation. | behavior cloning, multimodal | | |
| 3 | Fleming-R1: Toward Expert-Level Medical Reasoning via Reinforcement Learning | Fleming-R1: expert-level medical reasoning via reinforcement learning. | reinforcement learning, large language model, chain-of-thought | | |
| 4 | FlowRL: Matching Reward Distributions for LLM Reasoning | FlowRL: improves LLM reasoning by matching reward distributions, addressing over-optimization. | reinforcement learning, PPO, large language model | | |
| 5 | ToolSample: Dual Dynamic Sampling Methods with Curriculum Learning for RL-based Tool Learning | Proposes the DSCL framework, which improves the efficiency of RL-based tool learning through dual dynamic sampling and curriculum learning. | reinforcement learning, curriculum learning | | |
| 6 | Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation | EVOL-RL: a label-free self-evolving language model framework that improves models through majority-vote selection and novelty-driven variation. | reinforcement learning, large language model | ✅ | |
| 7 | Mind the Gap: Data Rewriting for Stable Off-Policy Supervised Fine-Tuning | Proposes a data rewriting framework to address the distribution shift caused by off-policy learning in SFT. | policy learning, large language model | ✅ | |
🔬 Pillar 9: Embodied Foundation Models (3 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 8 | Temporal Reasoning with Large Language Models Augmented by Evolving Knowledge Graphs | Proposes EvoReasoner and EvoKG to strengthen LLM reasoning over temporal knowledge graphs. | large language model | | |
| 9 | CoopQ: Cooperative Game Inspired Layerwise Mixed Precision Quantization for LLMs | Proposes CoopQ to address mixed-precision quantization for low-resource LLM deployment. | large language model | | |
| 10 | Predicting Language Models' Success at Zero-Shot Probabilistic Prediction | Studies LLM performance in zero-shot probabilistic prediction and proposes label-free metrics for predicting how an LLM will perform on a specific task. | large language model | | |
🔬 Pillar 8: Physics-based Animation (2 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 11 | Solar Forecasting with Causality: A Graph-Transformer Approach to Spatiotemporal Dependencies | SolarCAST: forecasts solar radiation with a causal graph Transformer, improving renewable energy management. | spatiotemporal, multimodal | | |
| 12 | Accurate typhoon intensity forecasts using a non-iterative spatiotemporal transformer model | Proposes TIFNet, a non-iterative spatiotemporal Transformer model that significantly improves typhoon intensity forecast accuracy. | spatiotemporal | | |
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 13 | Diffusion-Based Scenario Tree Generation for Multivariate Time Series Prediction and Multistage Stochastic Optimization | Proposes DST, a diffusion-based scenario tree generation framework for multivariate time series prediction and multistage stochastic optimization. | MPC, reinforcement learning | | |