cs.LG (2025-01-28)

📊 20 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (12, 🔗 1) · Pillar 9: Embodied Foundation Models (6) · Pillar 1: Robot Control (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL & Architecture (12 papers)

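| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Mamba-Shedder: Post-Transformer Compression for Efficient Selective Structured State Space Models | Mamba-Shedder: post-Transformer compression for efficient selective structured state space models. | Mamba, SSM, state space model | |
| 2 | Decoding Human Preferences in Alignment: An Improved Approach to Inverse Constitutional AI | Improves the Inverse Constitutional AI method, raising the accuracy and generalizability of principles extracted from preference datasets. | reinforcement learning, RLHF, DPO | |
| 3 | On the Interplay Between Sparsity and Training in Deep Reinforcement Learning | Studies the role of sparse architectures in deep reinforcement learning, improving performance on image-domain tasks. | reinforcement learning, deep reinforcement learning | |
| 4 | Challenges in Ensuring AI Safety in DeepSeek-R1 Models: The Shortcomings of Reinforcement Learning Strategies | Analyzes the limitations of reinforcement learning for safety alignment in DeepSeek-R1 models and proposes a hybrid training scheme. | reinforcement learning, large language model | |
| 5 | Heterogeneity-aware Personalized Federated Learning via Adaptive Dual-Agent Reinforcement Learning | Proposes HAPFL, which uses adaptive dual-agent reinforcement learning for personalized federated learning in heterogeneous environments. | reinforcement learning, PPO, distillation | |
| 6 | TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models | Proposes TAID to address the capacity gap in language model distillation. | distillation, foundation model | |
| 7 | On Rollouts in Model-Based Reinforcement Learning | Proposes Infoprop, which separates model uncertainty to improve rollout quality in model-based reinforcement learning. | reinforcement learning, policy learning | |
| 8 | Evidence on the Regularisation Properties of Maximum-Entropy Reinforcement Learning | Studies the generalization and robustness of maximum-entropy reinforcement learning in chaotic dynamical systems. | reinforcement learning | |
| 9 | Token-by-Token Regeneration and Domain Biases: A Benchmark of LLMs on Advanced Mathematical Problem-Solving | Evaluates LLMs' token regeneration ability and domain biases in advanced mathematical problem-solving. | Mamba, large language model | |
| 10 | Inducing, Detecting and Characterising Neural Modules: A Pipeline for Functional Interpretability in Reinforcement Learning | Proposes a functional-module-based pipeline for interpretability analysis in reinforcement learning. | reinforcement learning | |
| 11 | Flow Matching: Markov Kernels, Stochastic Processes and Transport Plans | Flow Matching: learning the velocity field of generative models via Markov kernels, stochastic processes, and transport plans. | flow matching | |
| 12 | Safe Reinforcement Learning for Real-World Engine Control | Proposes a safety-monitoring-based reinforcement learning toolchain for real-world engine control. | reinforcement learning | |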

🔬 Pillar 9: Embodied Foundation Models (6 papers)

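| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 13 | Fine-Tuned Language Models as Space Systems Controllers | Uses fine-tuned language models as space systems controllers. | large language model, foundation model | |
| 14 | Optimizing Large Language Model Training Using FP4 Quantization | The first FP4 training framework for LLMs, balancing accuracy and efficiency through novel quantization methods. | large language model | |
| 15 | LLM Assisted Anomaly Detection Service for Site Reliability Engineers: Enhancing Cloud Infrastructure Resilience | Proposes an LLM-based anomaly detection service to improve the reliability of cloud infrastructure. | large language model, foundation model | |
| 16 | Deep-and-Wide Learning: Enhancing Data-Driven Inference via Synergistic Learning of Inter- and Intra-Data Representations | Proposes the Deep-and-Wide Learning (DWL) framework, which improves data-driven inference by jointly learning intra- and inter-data representations. | foundation model | |
| 17 | Exponential Family Attention | Proposes the Exponential Family Attention (EFA) model for high-dimensional sequence data with mixed data types. | large language model | |
| 18 | Sparse Autoencoders Trained on the Same Data Learn Different Features | Sparse autoencoders trained on the same data learn different feature representations. | large language model | |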

🔬 Pillar 1: Robot Control (1 paper)

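| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 19 | Increasing Information for Model Predictive Control with Semi-Markov Decision Processes | Uses semi-Markov decision processes to increase the information gain of model predictive control. | model predictive control | |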

🔬 Pillar 8: Physics-based Animation (1 paper)

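| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 20 | Nonlinear dynamics of localization in neural receptive fields | Reveals how nonlinear dynamics drive the emergence of localization in neural receptive fields. | spatiotemporal | |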
