cs.LG (2026-04-27)

📊 24 papers in total | 🔗 2 with code

🎯 Navigation by interest area

Pillar 9: Embodied Foundation Models (10 🔗1) · Pillar 2: RL & Architecture (8) · Pillar 1: Robot Control (3 🔗1) · Pillar 8: Physics-based Animation (3)

🔬 Pillar 9: Embodied Foundation Models (10 papers)

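# | Title | One-sentence summary | Tags | 🔗
1 | AgenticCache: Cache-Driven Asynchronous Planning for Embodied AI Agents | AgenticCache: a cache-driven asynchronous planning framework for embodied AI agents | embodied AI, large language model
2 | A Limit Theory of Foundation Models: A Mathematical Approach to Understanding Emergent Intelligence and Scaling Laws | Proposes a limit theory for understanding emergent intelligence in foundation models | foundation model
3 | FlashOverlap: Minimizing Tail Latency in Communication Overlap for Distributed LLM Training | FlashOverlap: optimizes communication overlap in distributed LLM training by minimizing tail latency | large language model
4 | Learning to Think from Multiple Thinkers | Studies learning from multiple chains of thought (CoT) and proposes an efficient active learning algorithm to handle differences among thinkers | chain-of-thought
5 | Few-Shot Cross-Device Transfer for Quantum Noise Modeling on Real Hardware | Proposes a few-shot cross-device transfer learning method for quantum noise modeling and error mitigation | zero-shot transfer
6 | Meta-Aligner: Bidirectional Preference-Policy Optimization for Multi-Objective LLMs Alignment | Proposes Meta-Aligner, which achieves multi-objective LLM alignment through bidirectional preference-policy optimization | large language model
7 | Explaining Temporal Graph Predictions With Shapley Values | Proposes a Shapley-value-based explainability method for temporal graph neural networks | TAMP
8 | Coverage-Based Calibration for Post-Training Quantization via Weighted Set Cover over Outlier Channels | Proposes COVERCAL, which improves post-training quantization calibration via weighted set cover over outlier channels | large language model
9 | Fix Initial Codes and Iteratively Refine Textual Directions Toward Safe Multi-Turn Code Correction | Proposes IRTD, a method that iteratively refines textual directions, for safe multi-turn code correction | large language model
10 | Continual Calibration: Coverage Can Collapse Before Accuracy in Lifelong LLM Fine-Tuning | Proposes calibration replay to address coverage collapsing before accuracy in lifelong LLM fine-tuning | large language model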

🔬 Pillar 2: RL & Architecture (8 papers)

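# | Title | One-sentence summary | Tags | 🔗
11 | BitRL: Reinforcement Learning with 1-bit Quantized Language Models for Resource-Constrained Edge Deployment | BitRL: reinforcement learning with 1-bit quantized language models on resource-constrained edge devices | reinforcement learning, large language model
12 | A Reward-Free Viewpoint on Multi-Objective Reinforcement Learning | Proposes a multi-objective reinforcement learning method based on reward-free learning, improving the efficiency and performance of policy learning | reinforcement learning, policy learning
13 | An Aircraft Upset Recovery System with Reinforcement Learning | Proposes a reinforcement-learning-based aircraft upset recovery system to improve flight safety | reinforcement learning, SAC
14 | An Automatic Ground Collision Avoidance System with Reinforcement Learning | Proposes a reinforcement-learning-based automatic ground collision avoidance system that improves the safety and combat effectiveness of advanced trainer aircraft | reinforcement learning
15 | Perfecting Aircraft Maneuvers with Reinforcement Learning | Uses reinforcement learning to optimize aircraft aerobatic maneuvers and support pilot training | reinforcement learning
16 | TCOD: Exploring Temporal Curriculum in On-Policy Distillation for Multi-turn Autonomous Agents | Proposes TCOD, which uses a temporal curriculum to address KL instability in on-policy distillation for multi-turn autonomous agents | distillation
17 | GradMAP: Gradient-Based Multi-Agent Proximal Learning for Grid-Edge Flexibility | Proposes GradMAP, which achieves grid-edge flexibility control through gradient-based multi-agent proximal learning | reinforcement learning, PPO
18 | Model-Free Inference of Investor Preferences: A Relative Entropy IRL Approach | Proposes a relative-entropy inverse reinforcement learning method for inferring investor preferences without known transition probabilities | reinforcement learning, inverse reinforcement learning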

🔬 Pillar 1: Robot Control (3 papers)

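# | Title | One-sentence summary | Tags | 🔗
19 | Dual Control of Linear Systems from Bilinear Observations with Belief Space Model Predictive Control | Proposes a dual control method, based on belief-space model predictive control, for linear systems with bilinear observations | MPC, model predictive control
20 | SpecRLBench: A Benchmark for Generalization in Specification-Guided Reinforcement Learning | SpecRLBench: a benchmark for evaluating the generalization of reinforcement learning guided by linear temporal logic | manipulation, reinforcement learning
21 | Leveraging Human Feedback for Semantically-Relevant Skill Discovery | Proposes Semantically-Relevant Skill Discovery (SRSD), which uses human feedback to improve the semantic diversity and relevance of skills discovered in reinforcement learning | locomotion, reinforcement learning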

🔬 Pillar 8: Physics-based Animation (3 papers)

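# | Title | One-sentence summary | Tags | 🔗
22 | Task-guided Spatiotemporal Network with Diffusion Augmentation for EEG-based Dementia Diagnosis and MMSE Prediction | Proposes a task-guided spatiotemporal network with diffusion augmentation for EEG-based dementia diagnosis and MMSE prediction | spatiotemporal
23 | IMPA-Net: Meteorology-Aware Multi-Scale Attention and Dynamic Loss for Extreme Convective Radar Nowcasting | IMPA-Net: meteorology-aware multi-scale attention and dynamic loss for extreme convective radar nowcasting | spatiotemporal
24 | Multi-scale Dynamic Wake Modeling of Floating Offshore Wind Turbines via Fourier Neural Operators and Physics-Informed Neural Networks | Uses Fourier neural operators to predict the multi-scale dynamic wakes of floating offshore wind turbines, enabling real-time control and optimization | spatiotemporal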
