| # | Title | Summary | Keywords | Status |
|---|-------|---------|----------|--------|
| 1 | Mamba-Shedder: Post-Transformer Compression for Efficient Selective Structured State Space Models | Post-Transformer compression for efficient selective structured state space models. | Mamba, SSM, state space model | ✅ |
| 2 | Decoding Human Preferences in Alignment: An Improved Approach to Inverse Constitutional AI | Improves Inverse Constitutional AI to extract principles from preference datasets with higher accuracy and better generalization. | reinforcement learning, RLHF, DPO | |
| 3 | On the Interplay Between Sparsity and Training in Deep Reinforcement Learning | Studies the role of sparse architectures in deep reinforcement learning, improving performance on image-based tasks. | reinforcement learning, deep reinforcement learning | |
| 4 | Challenges in Ensuring AI Safety in DeepSeek-R1 Models: The Shortcomings of Reinforcement Learning Strategies | Analyzes the limitations of reinforcement learning for safety alignment in DeepSeek-R1 models and proposes a hybrid training scheme. | reinforcement learning, large language model | |
| 5 | Heterogeneity-aware Personalized Federated Learning via Adaptive Dual-Agent Reinforcement Learning | Proposes HAPFL, which uses adaptive dual-agent reinforcement learning to personalize federated learning in heterogeneous environments. | reinforcement learning, PPO, distillation | |
| 6 | TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models | Proposes TAID to address the capacity gap in language model distillation. | distillation, foundation model | |
| 7 | On Rollouts in Model-Based Reinforcement Learning | Proposes Infoprop, which separates out model uncertainty to improve rollout quality in model-based reinforcement learning. | reinforcement learning, policy learning | |
| 8 | Evidence on the Regularisation Properties of Maximum-Entropy Reinforcement Learning | Studies the generalization and robustness of maximum-entropy reinforcement learning on chaotic dynamical systems. | reinforcement learning | |
| 9 | Token-by-Token Regeneration and Domain Biases: A Benchmark of LLMs on Advanced Mathematical Problem-Solving | Evaluates LLMs' token-by-token regeneration ability and domain biases on advanced mathematical problem-solving. | Mamba, large language model | |
| 10 | Inducing, Detecting and Characterising Neural Modules: A Pipeline for Functional Interpretability in Reinforcement Learning | Proposes a functional-module-based interpretability pipeline for reinforcement learning. | reinforcement learning | |
| 11 | Flow Matching: Markov Kernels, Stochastic Processes and Transport Plans | Flow Matching: learning a generative model's velocity field via Markov kernels, stochastic processes, and transport plans. | flow matching | |
| 12 | Safe Reinforcement Learning for Real-World Engine Control | Proposes a reinforcement learning toolchain with safety monitoring for real-world engine control. | reinforcement learning | |