cs.LG(2026-02-13)

📊 27 papers in total | 🔗 1 with code

🎯 Navigation by interest area

Pillar 2: RL & Architecture (12, 🔗 1) · Pillar 9: Embodied Foundation Models (9) · Pillar 8: Physics-based Animation (2) · Pillar 1: Robot Control (2) · Pillar 4: Generative Motion (1) · Pillar 3: Perception & Semantics (1)

🔬 Pillar 2: RL & Architecture (12 papers)

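| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | On Robustness and Chain-of-Thought Consistency of RL-Finetuned VLMs | Reveals the fragility of RL-finetuned vision-language models in reasoning consistency and robustness, and proposes directions for improvement. | reinforcement learning, large language model, multimodal | |
| 2 | Constraint-Rectified Training for Efficient Chain-of-Thought | Proposes Constraint-Rectified Training (CRT), which improves chain-of-thought (CoT) reasoning efficiency and controls reasoning length. | reinforcement learning, reward design, large language model | |
| 3 | Amortized Reasoning Tree Search: Decoupling Proposal and Decision in Large Language Models | Proposes Amortized Reasoning Tree Search (ARTS), which decouples proposal from decision in large language models. | reinforcement learning, flow matching, large language model | |
| 4 | Flow-Factory: A Unified Framework for Reinforcement Learning in Flow-Matching Models | Flow-Factory: a unified reinforcement learning framework that accelerates aligning flow-matching models with human preferences. | reinforcement learning, flow matching | |
| 5 | Order Matters in Retrosynthesis: Structure-aware Generation via Reaction-Center-Guided Discrete Flow Matching | Proposes RetroDiT, a reaction-center-guided discrete flow matching method for structure-aware retrosynthesis generation. | flow matching, foundation model | |
| 6 | Multi-Agent Model-Based Reinforcement Learning with Joint State-Action Learned Embeddings | Proposes a multi-agent model-based reinforcement learning framework built on joint state-action learned embeddings to improve cooperation efficiency. | reinforcement learning, world model, representation learning | |
| 7 | X-VORTEX: Spatio-Temporal Contrastive Learning for Wake Vortex Trajectory Forecasting | X-VORTEX: spatio-temporal contrastive learning for wake vortex trajectory forecasting. | contrastive learning | |
| 8 | SLA2: Sparse-Linear Attention with Learnable Routing and QAT | SLA2: sparse-linear attention that combines learnable routing with quantization-aware training (QAT) to accelerate video diffusion models. | linear attention | |
| 9 | Look Inward to Explore Outward: Learning Temperature Policy from LLM Internal States via Hierarchical RL | Proposes Introspective LLM, which uses hierarchical reinforcement learning to learn a temperature policy from the LLM's internal states. | reinforcement learning, large language model | |
| 10 | Flow Matching from Viewpoint of Proximal Operators | Reformulates conditional flow matching from the viewpoint of proximal operators to improve generative model performance. | flow matching | |
| 11 | VI-CuRL: Stabilizing Verifier-Independent RL Reasoning via Confidence-Guided Variance Reduction | VI-CuRL: stabilizes verifier-independent RL reasoning through confidence-guided variance reduction. | reinforcement learning, large language model | |
| 12 | FLAC: Maximum Entropy RL via Kinetic Energy Regularized Bridge Matching | FLAC: maximum entropy reinforcement learning via kinetic-energy-regularized bridge matching. | reinforcement learning, flow matching | |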

🔬 Pillar 9: Embodied Foundation Models (9 papers)

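| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 13 | Quantization-Aware Collaborative Inference for Large Embodied AI Models | Proposes quantization-aware collaborative inference to optimize the inference performance of large embodied AI models at the edge. | embodied AI | |
| 14 | Preventing Rank Collapse in Federated Low-Rank Adaptation with Client Heterogeneity | Proposes raFLoRA to address rank collapse caused by client heterogeneity in federated low-rank adaptation. | foundation model | |
| 15 | Quantization-Robust LLM Unlearning via Low-Rank Adaptation | Proposes a LoRA-based, quantization-robust LLM unlearning method that addresses low-bit quantization masking the unlearning updates. | large language model | |
| 16 | LCSB: Layer-Cyclic Selective Backpropagation for Memory-Efficient On-Device LLM Fine-Tuning | Proposes Layer-Cyclic Selective Backpropagation (LCSB) for efficient LLM fine-tuning on low-memory devices. | large language model | |
| 17 | Memory-Efficient Structured Backpropagation for On-Device LLM Fine-Tuning | Proposes memory-efficient structured backpropagation for on-device LLM fine-tuning. | large language model | |
| 18 | GPTZero: Robust Detection of LLM-Generated Texts | GPTZero: a robust detector of LLM-generated text with improved robustness to adversarial attacks and paraphrasing. | large language model | |
| 19 | Annealing in variational inference mitigates mode collapse: A theoretical study on Gaussian mixtures | Proposes an annealing-based variational inference method to mitigate mode collapse on Gaussian mixture models. | multimodal | |
| 20 | Closing the Loop: A Control-Theoretic Framework for Provably Stable Time Series Forecasting with LLMs | Proposes F-LLM, a control-theoretic closed-loop stability framework for time series forecasting with LLMs. | large language model | |
| 21 | SD-MoE: Spectral Decomposition for Effective Expert Specialization | SD-MoE: uses spectral decomposition to achieve effective expert specialization and improve MoE model performance. | large language model | |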

🔬 Pillar 8: Physics-based Animation (2 papers)

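| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 22 | AMPS: Adaptive Modality Preference Steering via Functional Entropy | Proposes AMPS, which adaptively steers the modality preference of multimodal large language models via functional entropy. | AMP, large language model, multimodal | |
| 23 | Selection of CMIP6 Models for Regional Precipitation Projection and Climate Change Assessment in the Jhelum and Chenab River Basins | Proposes a machine-learning-based GCM selection method to improve water resource management. | spatiotemporal | |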

🔬 Pillar 1: Robot Control (2 papers)

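| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 24 | Unifying Model-Free Efficiency and Model-Based Representations via Latent Dynamics | ULD: unifies model-free efficiency and model-based representations via latent dynamics for cross-domain reinforcement learning. | locomotion, reinforcement learning, latent dynamics | |
| 25 | Dual-Granularity Contrastive Reward via Generated Episodic Guidance for Efficient Embodied RL | Proposes a dual-granularity contrastive reward method based on generated episodic guidance to improve the efficiency of embodied reinforcement learning. | manipulation, reinforcement learning | |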

🔬 Pillar 4: Generative Motion (1 paper)

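| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 26 | EXCODER: EXplainable Classification Of DiscretE time series Representations | EXCODER: uses discrete time series representations to improve the explainability of time series classification. | VQ-VAE | |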

🔬 Pillar 3: Perception & Semantics (1 paper)

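| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 27 | SWING: Unlocking Implicit Graph Representations for Graph Random Features | SWING: a graph random feature computation method that unlocks implicit graph representations. | implicit representation | |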
