cs.LG (2026-04-13)

📊 22 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (10 🔗2) · Pillar 9: Embodied Foundation Models (8 🔗2) · Pillar 1: Robot Control (2 🔗1) · Pillar 5: Interaction & Reaction (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL & Architecture (10 papers)

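# | Title | One-sentence summary | Tags | 🔗
1 | Eliciting Medical Reasoning with Knowledge-enhanced Data Synthesis: A Semi-Supervised Reinforcement Learning Approach | Proposes the MedSSR framework, which uses knowledge-enhanced data synthesis and semi-supervised reinforcement learning to improve medical reasoning. | reinforcement learning, distillation, large language model
2 | The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping | Proposes the MEDS framework, which uses memory-enhanced dynamic reward shaping to increase LLM sampling diversity and reduce repeated errors. | reinforcement learning, reward design, reward shaping
3 | Probabilistic Prediction of Neural Dynamics via Autoregressive Flow Matching | Proposes a probabilistic prediction framework for neural dynamics based on autoregressive flow matching, improving the accuracy of brain-activity prediction. | flow matching, multimodal
4 | DDO-RM for LLM Preference Optimization: A Minimal Held-Out Benchmark against DPO | DDO-RM: a minimal held-out benchmark against DPO for LLM preference optimization. | DPO, direct preference optimization
5 | DIB-OD: Preserving the Invariant Core for Robust Heterogeneous Graph Adaptation via Decoupled Information Bottleneck and Online Distillation | DIB-OD: robust heterogeneous graph adaptation via a decoupled information bottleneck and online distillation. | teacher-student distillation
6 | Autonomous Diffractometry Enabled by Visual Reinforcement Learning | Proposes an autonomous diffractometry system based on visual reinforcement learning that aligns crystals automatically without crystallographic knowledge. | reinforcement learning
7 | Quantum-Gated Task-interaction Knowledge Distillation for Pre-trained Model-based Class-Incremental Learning | Proposes a quantum-gated task-interaction knowledge distillation framework to address catastrophic forgetting in class-incremental learning with pre-trained models. | distillation
8 | Physics-Informed State Space Models for Reliable Solar Irradiance Forecasting in Off-Grid Systems | Proposes a Thermodynamic Liquid Manifold Network for reliable solar irradiance forecasting in off-grid systems. | state space model
9 | Low-rank Optimization Trajectories Modeling for LLM RLVR Acceleration | Proposes the NExt framework, which accelerates RLVR training of LLMs by nonlinearly extrapolating low-rank trajectories. | reinforcement learning, large language model
10 | Rethinking Token-Level Credit Assignment in RLVR: A Polarity-Entropy Analysis | Proposes Entropy-Aware Policy Optimization (EAPO) to address token-level credit assignment in RLVR. | reinforcement learning, large language model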

🔬 Pillar 9: Embodied Foundation Models (8 papers)

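# | Title | One-sentence summary | Tags | 🔗
11 | Bottleneck Tokens for Unified Multimodal Retrieval | Proposes Bottleneck Tokens (BToks) for unified multimodal retrieval, addressing information aggregation and token-level guidance. | large language model, multimodal
12 | SCNO: Spiking Compositional Neural Operator -- Towards a Neuromorphic Foundation Model for Nuclear PDE Solving | Proposes SCNO, a neuromorphic foundation model for solving nuclear PDEs, offering modularity, low power consumption, and zero-forgetting extension. | foundation model
13 | CausalGaze: Unveiling Hallucinations via Counterfactual Graph Intervention in Large Language Models | CausalGaze: reveals hallucinations in large language models through counterfactual graph intervention. | large language model
14 | Human Centered Non Intrusive Driver State Modeling Using Personalized Physiological Signals in Real World Automated Driving | Proposes non-intrusive driver state modeling based on personalized physiological signals to improve automated-driving safety. | multimodal
15 | A Mechanistic Analysis of Looped Reasoning Language Models | A mechanistic analysis showing how intra-layer fixed points in looped reasoning language models correspond to reasoning stages. | large language model
16 | TempusBench: An Evaluation Framework for Time-Series Forecasting | Proposes TempusBench, an evaluation framework for time-series forecasting that addresses shortcomings of existing evaluation practice. | foundation model
17 | Flow-Controlled Scheduling for LLM Inference with Provable Stability Guarantees | Proposes a flow-control-based scheduling algorithm for LLM inference that guarantees system stability and improves throughput. | large language model
18 | UniPROT: Uniform Prototype Selection via Partial Optimal Transport with Submodular Guarantees | UniPROT: uniform prototype selection via partial optimal transport with submodular guarantees. | large language model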

🔬 Pillar 1: Robot Control (2 papers)

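# | Title | One-sentence summary | Tags | 🔗
19 | Solving Physics Olympiad via Reinforcement Learning on Physics Simulators | Uses physics simulators and reinforcement learning to solve Physics Olympiad problems. | sim-to-real, reinforcement learning
20 | Robust Adversarial Policy Optimization Under Dynamics Uncertainty | Proposes robust adversarial policy optimization to handle dynamics uncertainty. | domain randomization, reinforcement learning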

🔬 Pillar 5: Interaction & Reaction (1 paper)

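# | Title | One-sentence summary | Tags | 🔗
21 | GPU Acceleration of Sparse Fully Homomorphic Encrypted DNNs | Proposes a GPU-accelerated optimization of matrix multiplication for sparse fully homomorphic encrypted DNNs. | OMOMO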

🔬 Pillar 8: Physics-based Animation (1 paper)

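# | Title | One-sentence summary | Tags | 🔗
22 | Universality of first-order methods on random and deterministic matrices | Uses analysis of traffic distributions to extend the universality of first-order methods across random and deterministic matrices. | AMP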
