cs.LG (2026-04-13)

📊 22 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (10 🔗2) Pillar 9: Embodied Foundation Models (8 🔗2) Pillar 1: Robot Control (2 🔗1) Pillar 5: Interaction & Reaction (1) Pillar 8: Physics-based Animation (1)

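🔬 Pillar 2: RL & Architecture (10 papers)

#	Title	One-line summary	Tags	🔗	⭐
1	Eliciting Medical Reasoning with Knowledge-enhanced Data Synthesis: A Semi-Supervised Reinforcement Learning Approach	Proposes the MedSSR framework, which uses knowledge-enhanced data synthesis and semi-supervised reinforcement learning to improve medical reasoning.	reinforcement learning distillation large language model	✅
2	The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping	Proposes the MEDS framework, which uses memory-enhanced dynamic reward shaping to increase LLM sampling diversity and reduce repeated errors.	reinforcement learning reward design reward shaping
3	Probabilistic Prediction of Neural Dynamics via Autoregressive Flow Matching	Proposes a probabilistic prediction framework for neural dynamics based on autoregressive flow matching, improving the accuracy of brain activity prediction.	flow matching multimodal
4	DDO-RM for LLM Preference Optimization: A Minimal Held-Out Benchmark against DPO	DDO-RM: a minimal held-out benchmark for LLM preference optimization, compared against DPO.	DPO direct preference optimization
5	DIB-OD: Preserving the Invariant Core for Robust Heterogeneous Graph Adaptation via Decoupled Information Bottleneck and Online Distillation	DIB-OD: robust heterogeneous graph adaptation via a decoupled information bottleneck and online distillation.	teacher-student distillation
6	Autonomous Diffractometry Enabled by Visual Reinforcement Learning	Proposes an autonomous diffractometry system based on visual reinforcement learning that aligns crystals automatically without crystallographic knowledge.	reinforcement learning
7	Quantum-Gated Task-interaction Knowledge Distillation for Pre-trained Model-based Class-Incremental Learning	Proposes a quantum-gated task-interaction knowledge distillation framework to address catastrophic forgetting in class-incremental learning with pre-trained models.	distillation
8	Physics-Informed State Space Models for Reliable Solar Irradiance Forecasting in Off-Grid Systems	Proposes the Thermodynamic Liquid Manifold Network for reliable solar irradiance forecasting in off-grid systems.	state space model
9	Low-rank Optimization Trajectories Modeling for LLM RLVR Acceleration	Proposes the NExt framework, which accelerates RLVR training of LLMs by nonlinearly extrapolating low-rank trajectories.	reinforcement learning large language model	✅
10	Rethinking Token-Level Credit Assignment in RLVR: A Polarity-Entropy Analysis	Proposes entropy-aware policy optimization (EAPO) to address token-level credit assignment in RLVR.	reinforcement learning large language model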

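🔬 Pillar 9: Embodied Foundation Models (8 papers)

#	Title	One-line summary	Tags	🔗	⭐
11	Bottleneck Tokens for Unified Multimodal Retrieval	Proposes Bottleneck Tokens (BToks) for unified multimodal retrieval, addressing information aggregation and token-level guidance.	large language model multimodal
12	SCNO: Spiking Compositional Neural Operator -- Towards a Neuromorphic Foundation Model for Nuclear PDE Solving	Proposes SCNO, a neuromorphic foundation model for solving nuclear PDEs, offering modularity, low power consumption, and zero-forgetting extension.	foundation model
13	CausalGaze: Unveiling Hallucinations via Counterfactual Graph Intervention in Large Language Models	CausalGaze: unveils hallucinations in large language models via counterfactual graph intervention.	large language model
14	Human Centered Non Intrusive Driver State Modeling Using Personalized Physiological Signals in Real World Automated Driving	Proposes non-intrusive driver state modeling based on personalized physiological signals to improve the safety of automated driving.	multimodal
15	A Mechanistic Analysis of Looped Reasoning Language Models	A mechanistic analysis reveals a correspondence between intra-layer fixed points and inference stages in looped reasoning language models.	large language model
16	TempusBench: An Evaluation Framework for Time-Series Forecasting	Proposes TempusBench, an evaluation framework for time-series forecasting that addresses shortcomings of existing evaluation practice.	foundation model	✅
17	Flow-Controlled Scheduling for LLM Inference with Provable Stability Guarantees	Proposes a flow-controlled scheduling algorithm for LLM inference that guarantees system stability and improves throughput.	large language model
18	UniPROT: Uniform Prototype Selection via Partial Optimal Transport with Submodular Guarantees	UniPROT: uniform prototype selection via partial optimal transport with submodular guarantees.	large language model	✅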

🔬 Pillar 1: Robot Control (2 papers)

#	Title	One-line summary	Tags	🔗	⭐
19	Solving Physics Olympiad via Reinforcement Learning on Physics Simulators	Uses physics simulators and reinforcement learning to solve Physics Olympiad problems.	sim-to-real reinforcement learning	✅
20	Robust Adversarial Policy Optimization Under Dynamics Uncertainty	Proposes robust adversarial policy optimization to handle dynamics uncertainty.	domain randomization reinforcement learning

🔬 Pillar 5: Interaction & Reaction (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
21	GPU Acceleration of Sparse Fully Homomorphic Encrypted DNNs	Proposes a GPU-accelerated optimization of matrix multiplication for sparse fully homomorphic encrypted DNNs.	OMOMO

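🔬 Pillar 8: Physics-based Animation (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
22	Universality of first-order methods on random and deterministic matrices	Analyzes traffic distributions to extend the universality of first-order methods across random and deterministic matrices.	AMP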
