cs.LG (2025-12-04)

📊 10 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (9, 🔗 1) · Pillar 9: Embodied Foundation Models (1)

🔬 Pillar 2: RL Algorithms & Architecture (9 papers)

| # | Title | One-line Takeaway | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Coefficient of Variation Masking: A Volatility-Aware Strategy for EHR Foundation Models | Proposes Coefficient-of-Variation Masking (CV-Masking) to improve how EHR foundation models represent volatile biomarkers. | masked autoencoder, MAE, foundation model | |
| 2 | MemLoRA: Distilling Expert Adapters for On-Device Memory Systems | MemLoRA distills expert adapters for on-device memory systems, enabling efficient local deployment. | distillation, large language model, multimodal | |
| 3 | Rethinking Decoupled Knowledge Distillation: A Predictive Distribution Perspective | Proposes Generalized Decoupled Knowledge Distillation (GDKD), improving knowledge transfer from a predictive-distribution perspective. | distillation, multimodal | |
| 4 | Natural Language Actor-Critic: Scalable Off-Policy Learning in Language Space | Proposes a Natural Language Actor-Critic algorithm for scalable off-policy learning of LLM agents in language space. | policy learning, large language model | |
| 5 | SHAP-Guided Kernel Actor-Critic for Explainable Reinforcement Learning | Proposes a SHAP-guided kernel actor-critic algorithm, improving both the explainability and the performance of reinforcement learning. | reinforcement learning | |
| 6 | One-Step Diffusion Samplers via Self-Distillation and Deterministic Flow | Proposes one-step diffusion samplers based on self-distillation and deterministic flow, accelerating sampling and stabilizing evidence estimation. | distillation | |
| 7 | Hierarchical Reinforcement Learning for the Dynamic VNE with Alternatives Problem | Proposes HRL-VNEAP, using hierarchical reinforcement learning to solve the dynamic virtual network embedding (VNE) problem with alternative topologies. | reinforcement learning | |
| 8 | CARL: Focusing Agentic Reinforcement Learning on Critical Actions | CARL focuses agentic reinforcement learning on critical actions, improving long-horizon reasoning performance. | reinforcement learning | |
| 9 | Enhancing Deep Deterministic Policy Gradients on Continuous Control Tasks with Decoupled Prioritized Experience Replay | Proposes Decoupled Prioritized Experience Replay (DPER), improving DDPG performance on continuous control tasks. | reinforcement learning, deep reinforcement learning | |
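Entry 9 above builds on prioritized experience replay. As background only, here is a minimal sketch of the standard proportional PER scheme (Schaul et al., 2016) — not the paper's decoupled (DPER) variant; class and parameter names are illustrative:

```python
import random

class PrioritizedReplayBuffer:
    """Minimal proportional prioritized experience replay (illustrative)."""

    def __init__(self, capacity, alpha=0.6, beta=0.4):
        self.capacity = capacity
        self.alpha = alpha    # how strongly |TD error| skews sampling
        self.beta = beta      # importance-sampling correction strength
        self.buffer, self.priorities = [], []

    def add(self, transition, td_error=1.0):
        # Evict oldest transition when full; priority derived from |TD error|.
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(0)
            self.priorities.pop(0)
        self.buffer.append(transition)
        self.priorities.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, batch_size):
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idxs = random.choices(range(len(self.buffer)), weights=probs, k=batch_size)
        n = len(self.buffer)
        # Importance-sampling weights correct the bias of non-uniform sampling.
        weights = [(n * probs[i]) ** (-self.beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]  # normalize for stability
        return [self.buffer[i] for i in idxs], idxs, weights

    def update_priorities(self, idxs, td_errors):
        # Refresh priorities after a learning step recomputes TD errors.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = (abs(err) + 1e-6) ** self.alpha
```

DPER, per the summary above, decouples aspects of this prioritization for DDPG; see the paper for the actual algorithm.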

🔬 Pillar 9: Embodied Foundation Models (1 paper)

| # | Title | One-line Takeaway | Tags | 🔗 |
|---|---|---|---|---|
| 10 | David vs. Goliath: Can Small Models Win Big with Agentic AI in Hardware Design? | With an agentic AI framework, small models approach large-model performance on hardware design tasks. | large language model, foundation model | |
