cs.LG (2026-01-20)

📊 24 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (14 🔗3) | Pillar 9: Embodied Foundation Models (8 🔗1) | Pillar 1: Robot Control (2)

🔬 Pillar 2: RL Algorithms & Architecture (14 papers)

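| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 1 | LLMOrbit: A Circular Taxonomy of Large Language Models - From Scaling Walls to Agentic AI Systems | LLMOrbit: a circular taxonomy of large language models, addressing scaling walls and moving toward agentic AI systems | PPO, RLHF, DPO |
| 2 | Attention-Based Offline Reinforcement Learning and Clustering for Interpretable Sepsis Treatment | Proposes an attention-based offline reinforcement learning and clustering method for interpretable sepsis treatment decision support | reinforcement learning, offline reinforcement learning, large language model |
| 3 | Spatiotemporal Wildfire Prediction and Reinforcement Learning for Helitack Suppression | FireCastRL: a proactive wildfire suppression framework combining spatiotemporal prediction and reinforcement learning | reinforcement learning, spatiotemporal |
| 4 | RL-BioAug: Label-Efficient Reinforcement Learning for Self-Supervised EEG Representation Learning | Proposes RL-BioAug, which uses reinforcement learning for self-supervised EEG representation learning to improve data augmentation | reinforcement learning, representation learning, contrastive learning |
| 5 | KAGE-Bench: Fast Known-Axis Visual Generalization Evaluation for Reinforcement Learning | Proposes KAGE-Bench for fast evaluation of known-axis visual generalization in reinforcement learning | reinforcement learning, PPO, latent dynamics |
| 6 | Jet-RL: Enabling On-Policy FP8 Reinforcement Learning with Unified Training and Rollout Precision Flow | Jet-RL: enables on-policy FP8 reinforcement learning through a unified training and rollout precision flow | reinforcement learning, large language model |
| 7 | InT: Self-Proposed Interventions Enable Credit Assignment in LLM Reasoning | InT: enables credit assignment in LLM reasoning through self-proposed interventions | reinforcement learning, IMoS, large language model |
| 8 | VJEPA: Variational Joint Embedding Predictive Architectures as Probabilistic World Models | VJEPA: a probabilistic world model that achieves robust, uncertainty-aware planning through variational joint embedding predictive architectures | world model, representation learning |
| 9 | Differentiated Pickup Point Offering for Emission Reduction in Last-Mile Delivery | Proposes a differentiated pickup point offering strategy to reduce carbon emissions in last-mile delivery | reinforcement learning, DPO, spatial relationship |
| 10 | Q-learning with Adjoint Matching | Proposes Q-learning with Adjoint Matching (QAM), which efficiently optimizes diffusion policies in continuous action spaces | reinforcement learning, diffusion policy, flow matching |
| 11 | Report for NSF Workshop on AI for Electronic Design Automation | Explores AI-enabled electronic design automation: challenges and future opportunities | reinforcement learning, large language model |
| 12 | Reinforcement Learning for Opportunistic Routing in Software-Defined LEO-Terrestrial Systems | Proposes reinforcement-learning-based opportunistic routing to address data transmission latency in LEO networks | reinforcement learning |
| 13 | Sample Complexity of Average-Reward Q-Learning: From Single-agent to Federated Reinforcement Learning | Proposes an average-reward Q-learning algorithm to address the sample complexity problem | reinforcement learning |
| 14 | GeoDynamics: A Geometric State-Space Neural Network for Understanding Brain Dynamics on Riemannian Manifolds | GeoDynamics: a geometric state-space neural network for understanding brain dynamics on Riemannian manifolds | SSM, spatiotemporal |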

🔬 Pillar 9: Embodied Foundation Models (8 papers)

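| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 15 | A Unified Variational Imputation Framework for Electric Vehicle Charging Data Using Retrieval-Augmented Language Model | Proposes the PRAIM framework, which uses a retrieval-augmented language model to address missing electric vehicle charging data | large language model, multimodal |
| 16 | Preconditioning Benefits of Spectral Orthogonalization in Muon | Through spectral orthogonalization, the Muon optimizer achieves condition-number-independent linear convergence in matrix factorization and in-context learning | large language model |
| 17 | Multi-Objective Hierarchical Optimization with Large Language Models | Proposes an LLM-based multi-objective hierarchical optimization method to improve efficiency in solving complex problems | large language model |
| 18 | Layer-adaptive Expert Pruning for Pre-Training of Mixture-of-Experts Large Language Models | Proposes a layer-adaptive expert pruning algorithm to improve the pre-training efficiency of MoE large language models | large language model |
| 19 | SCG With Your Phone: Diagnosis of Rhythmic Spectrum Disorders in Field Conditions | Proposes a deep learning framework based on smartphone SCG signals for diagnosing rhythmic spectrum disorders | multimodal, TAMP |
| 20 | Search over Self-Edit Strategies for LLM Adaptation | Proposes an adaptation framework based on searching over LLM self-edit strategies, improving model performance on knowledge-integration tasks | foundation model |
| 21 | LLM Security and Safety: Insights from Homotopy-Inspired Prompt Obfuscation | Proposes a homotopy-inspired prompt obfuscation framework to deepen understanding of LLM security vulnerabilities | large language model |
| 22 | ELSA: Efficient LLM-Centric Split Aggregation for Privacy-Aware Hierarchical Federated Learning over Resource-Constrained Edge Networks | ELSA: efficient LLM-centric aggregation for privacy-aware hierarchical federated learning over resource-constrained edge networks | large language model |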

🔬 Pillar 1: Robot Control (2 papers)

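| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 23 | Uncovering and Understanding FPR Manipulation Attack in Industrial IoT Networks | Uncovers and explains the MQTT-based FPR manipulation attack in industrial IoT networks | manipulation |
| 24 | Cosmo-FOLD: Fast generation and upscaling of field-level cosmological maps with overlap latent diffusion | Cosmo-FOLD: fast generation and upscaling of field-level cosmological maps using overlap latent diffusion | MPC |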
