cs.LG (2025-12-04)

📊 10 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (9, 🔗 1) · Pillar 9: Embodied Foundation Models (1)

🔬 Pillar 2: RL Algorithms & Architecture (9 papers)

| # | Title | One-line Takeaway | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Coefficient of Variation Masking: A Volatility-Aware Strategy for EHR Foundation Models | Proposes Coefficient-of-Variation Masking (CV-Masking) to improve how EHR foundation models represent volatile biomarkers. | masked autoencoder, MAE, foundation model | |
| 2 | MemLoRA: Distilling Expert Adapters for On-Device Memory Systems | MemLoRA distills expert adapters for on-device memory systems, enabling efficient local deployment. | distillation, large language model, multimodal | |
| 3 | Rethinking Decoupled Knowledge Distillation: A Predictive Distribution Perspective | Proposes Generalized Decoupled Knowledge Distillation (GDKD), improving knowledge transfer from a predictive-distribution perspective. | distillation, multimodal | |
| 4 | Natural Language Actor-Critic: Scalable Off-Policy Learning in Language Space | Proposes a Natural Language Actor-Critic algorithm for scalable off-policy learning of LLM agents in language space. | policy learning, large language model | |
| 5 | SHAP-Guided Kernel Actor-Critic for Explainable Reinforcement Learning | Proposes a SHAP-guided kernel actor-critic algorithm, improving both the explainability and the performance of reinforcement learning. | reinforcement learning | |
| 6 | One-Step Diffusion Samplers via Self-Distillation and Deterministic Flow | Proposes one-step diffusion samplers based on self-distillation and deterministic flow, accelerating sampling and stabilizing evidence estimation. | distillation | |
| 7 | Hierarchical Reinforcement Learning for the Dynamic VNE with Alternatives Problem | Proposes HRL-VNEAP, using hierarchical reinforcement learning to solve the dynamic virtual network embedding (VNE) problem with alternative topologies. | reinforcement learning | |
| 8 | CARL: Focusing Agentic Reinforcement Learning on Critical Actions | CARL focuses agentic reinforcement learning on critical actions, improving long-horizon reasoning performance. | reinforcement learning | |
| 9 | Enhancing Deep Deterministic Policy Gradients on Continuous Control Tasks with Decoupled Prioritized Experience Replay | Proposes Decoupled Prioritized Experience Replay (DPER), improving DDPG performance on continuous control tasks. | reinforcement learning, deep reinforcement learning | |
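Entry 9 above builds on prioritized experience replay. As background only, here is a minimal sketch of the standard proportional PER scheme (Schaul et al., 2016) — not the paper's decoupled (DPER) variant; class and parameter names are illustrative:

```python
import random

class PrioritizedReplayBuffer:
    """Minimal proportional prioritized experience replay (illustrative)."""

    def __init__(self, capacity, alpha=0.6, beta=0.4):
        self.capacity = capacity
        self.alpha = alpha    # how strongly |TD error| skews sampling
        self.beta = beta      # importance-sampling correction strength
        self.buffer, self.priorities = [], []

    def add(self, transition, td_error=1.0):
        # Evict oldest transition when full; priority derived from |TD error|.
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(0)
            self.priorities.pop(0)
        self.buffer.append(transition)
        self.priorities.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, batch_size):
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idxs = random.choices(range(len(self.buffer)), weights=probs, k=batch_size)
        n = len(self.buffer)
        # Importance-sampling weights correct the bias of non-uniform sampling.
        weights = [(n * probs[i]) ** (-self.beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]  # normalize for stability
        return [self.buffer[i] for i in idxs], idxs, weights

    def update_priorities(self, idxs, td_errors):
        # Refresh priorities after a learning step recomputes TD errors.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = (abs(err) + 1e-6) ** self.alpha
```

DPER, per the summary above, decouples aspects of this prioritization for DDPG; see the paper for the actual algorithm.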

🔬 Pillar 9: Embodied Foundation Models (1 paper)

| # | Title | One-line Takeaway | Tags | 🔗 |
|---|---|---|---|---|
| 10 | David vs. Goliath: Can Small Models Win Big with Agentic AI in Hardware Design? | With an agentic AI framework, small models approach large-model performance on hardware design tasks. | large language model, foundation model | |
