cs.LG(2025-05-06)

📊 25 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (11 🔗1) · Pillar 9: Embodied Foundation Models (11 🔗4) · Pillar 1: Robot Control (2) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL Algorithms & Architecture (11 papers)

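| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Policy-labeled Preference Learning: Is Preference Enough for RLHF? | Proposes policy-labeled preference learning (PPL), which models regret to resolve likelihood mismatch in RLHF. | reinforcement learning, offline RL, preference learning | |
| 2 | DYSTIL: Dynamic Strategy Induction with Large Language Models for Reinforcement Learning | Proposes DYSTIL, which uses large language models to dynamically induce strategies, improving the generalization and efficiency of reinforcement learning. | reinforcement learning, large language model | |
| 3 | Sustainable Smart Farm Networks: Enhancing Resilience and Efficiency with Decision Theory-Guided Deep Reinforcement Learning | Proposes a decision theory-guided deep reinforcement learning method that improves the resilience and efficiency of smart farm networks under adversarial attacks and energy constraints. | reinforcement learning, deep reinforcement learning, DRL | |
| 4 | Joint Resource Management for Energy-efficient UAV-assisted SWIPT-MEC: A Deep Reinforcement Learning Approach | Proposes a deep reinforcement learning-based joint resource management scheme for UAV-assisted SWIPT-MEC that improves energy efficiency and end-device battery life. | reinforcement learning, deep reinforcement learning, SAC | |
| 5 | Interpretable Learning Dynamics in Unsupervised Reinforcement Learning | Proposes an interpretability framework for unsupervised RL (URL) agents that analyzes how intrinsic motivation affects agent behavior and representation learning. | reinforcement learning, PPO, representation learning | |
| 6 | Ergodic Generative Flows | Proposes Ergodic Generative Flows (EGFs) to address the training difficulties of generative flow networks in continuous environments and imitation learning. | reinforcement learning, imitation learning, flow matching | |
| 7 | A new membership inference attack that spots memorization in generative and predictive models: Loss-Based with Reference Model algorithm (LBRM) | Proposes the LBRM algorithm, which uses a reference model to improve the accuracy of detecting memorized training data in generative models. | predictive model | |
| 8 | Knowledge Distillation for Speech Denoising by Latent Representation Alignment with Cosine Distance | Proposes a knowledge distillation method for speech denoising that aligns latent representations using cosine distance. | distillation | |
| 9 | Absolute Zero: Reinforced Self-play Reasoning with Zero Data | Proposes Absolute Zero, a self-play reinforcement learning method for reasoning that requires no external data. | reinforcement learning, large language model | |
| 10 | Importance Analysis for Dynamic Control of Balancing Parameter in a Simple Knowledge Distillation Setting | Proposes a method for dynamically adjusting the balancing parameter in knowledge distillation, improving the training efficiency of the student network. | distillation | |
| 11 | Unraveling the Rainbow: can value-based methods schedule? | Explores the potential of value-based methods for scheduling: value-based learning algorithms perform strongly on the Job-Shop scheduling problem. | reinforcement learning, deep reinforcement learning | |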

🔬 Pillar 9: Embodied Foundation Models (11 papers)

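| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 12 | Knowledge Augmented Complex Problem Solving with Large Language Models: A Survey | Survey of knowledge-augmented large language models for complex problem solving. | large language model, chain-of-thought | |
| 13 | Geospatial Mechanistic Interpretability of Large Language Models | Proposes a geospatial mechanistic interpretability framework that analyzes how large language models process geographic information. | large language model, foundation model | |
| 14 | Task-Oriented Multimodal Token Transmission in Resource-Constrained Multiuser Networks | Proposes a task-oriented multimodal token transmission scheme that addresses bandwidth overhead in resource-constrained multiuser networks. | multimodal | |
| 15 | Automatic Calibration for Membership Inference Attack on Large Language Models | Proposes ACMIA, an automatically calibrated membership inference attack on LLMs that improves attack reliability. | large language model | |
| 16 | Adversarial Attacks in Multimodal Systems: A Practitioner's Survey | Surveys adversarial attacks in multimodal systems, filling the gap in practitioner-oriented coverage and supporting secure model deployment. | multimodal | |
| 17 | Revisiting Model Inversion Evaluation: From Misleading Standards to Reliable Privacy Assessment | Exposes misleading standards in model inversion evaluation and proposes an MLLM-based framework for reliable privacy assessment. | large language model, multimodal | |
| 18 | MARCO: Multi-Agent Code Optimization with Real-Time Knowledge Integration for High-Performance Computing | MARCO: a multi-agent code optimization framework for high-performance computing built on real-time knowledge integration. | large language model | |
| 19 | Mitigating mode collapse in normalizing flows by annealing with an adaptive schedule: Application to parameter estimation | Proposes a normalizing flow training method based on an adaptive annealing schedule that mitigates mode collapse and speeds up parameter estimation. | multimodal | |
| 20 | PARM: Multi-Objective Test-Time Alignment via Preference-Aware Autoregressive Reward Model | Proposes PARM, which achieves multi-objective test-time alignment via a preference-aware autoregressive reward model. | large language model | |
| 21 | SPAP: Structured Pruning via Alternating Optimization and Penalty Methods | Proposes the SPAP framework, which uses alternating optimization and penalty methods for efficient structured pruning of large language models. | large language model | |
| 22 | Plug-and-Play AMC: Context Is King in Training-Free, Open-Set Modulation with LLMs | Proposes a training-free, plug-and-play AMC method that uses LLMs to achieve strong performance in open-set modulation classification. | foundation model | |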

🔬 Pillar 1: Robot Control (2 papers)

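| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 23 | Null Counterfactual Factor Interactions for Goal-Conditioned Reinforcement Learning | Proposes a hindsight relabeling method based on null counterfactual interactions that improves the sample efficiency of goal-conditioned reinforcement learning in object-interaction settings. | locomotion, reinforcement learning | |
| 24 | Causal Intervention Framework for Variational Auto Encoder Mechanistic Interpretability | Proposes a causal intervention framework for VAEs for mechanistic interpretability analysis. | manipulation | |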

🔬 Pillar 8: Physics-based Animation (1 paper)

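| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 25 | Quantum Feature Space of a Qubit Coupled to an Arbitrary Bath | Proposes a quantum feature space that efficiently characterizes the noise process of a qubit coupled to an arbitrary environment, without requiring complex neural networks. | PULSE | |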
