cs.LG (2025-05-26)

📊 23 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms and Architecture (14 🔗1) · Pillar 9: Embodied Foundation Models (8 🔗2) · Pillar 3: Spatial Perception and Semantics (1)

🔬 Pillar 2: RL Algorithms and Architecture (14 papers)

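# | Title | One-line summary | Tags | 🔗
1 | Understanding the Performance Gap in Preference Learning: A Dichotomy of RLHF and DPO | Theoretically analyzes the performance gap between RLHF and DPO in preference learning, revealing the impact of model bias | reinforcement learning, preference learning, RLHF
2 | Alignment of large language models with constrained learning | Proposes a Lagrangian-duality-based LLM alignment method for reward maximization under constraints | RLHF, large language model
3 | Learning a Pessimistic Reward Model in RLHF | Proposes PET, a pessimistic reward model fine-tuning method that makes RLHF reward models more robust against reward hacking | reinforcement learning, offline reinforcement learning, RLHF
4 | Rotary Masked Autoencoders are Versatile Learners | Proposes the Rotary Masked Autoencoder (RoMAE) for multimodal data with continuous positional information | representation learning, masked autoencoder, MAE
5 | JEDI: Latent End-to-end Diffusion Mitigates Agent-Human Performance Asymmetry in Model-Based Reinforcement Learning | Proposes JEDI, an end-to-end latent diffusion model that mitigates the agent-human performance asymmetry in model-based reinforcement learning | reinforcement learning, world model
6 | The Limits of Preference Data for Post-Training | Reveals the limits of preference data for optimizing complex tasks in post-training | reinforcement learning, RLHF, large language model
7 | An Explainable Diagnostic Framework for Neurodegenerative Dementias via Reinforcement-Optimized LLM Reasoning | Proposes an explainable diagnostic framework for neurodegenerative dementias based on reinforcement-learning-optimized LLM reasoning | reinforcement learning, distillation, large language model
8 | DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning | DISCOVER: an automated curriculum learning method for sparse-reward reinforcement learning | reinforcement learning
9 | Equivariant Representation Learning for Symmetry-Aware Inference with Guarantees | Proposes an equivariant representation learning framework for symmetry-aware inference with guarantees | representation learning
10 | Refining Few-Step Text-to-Multiview Diffusion via Reinforcement Learning | Proposes a reinforcement-learning-based optimization framework for text-to-multiview diffusion models that improves image quality and cross-view consistency | reinforcement learning
11 | The challenge of hidden gifts in multi-agent reinforcement learning | Addresses the "hidden gift" problem in multi-agent reinforcement learning by proposing a policy gradient method with a self-learning-aware correction | reinforcement learning
12 | Beyond Markovian: Reflective Exploration via Bayes-Adaptive RL for LLM Reasoning | Proposes BARL, an algorithm based on Bayes-adaptive reinforcement learning, to improve reflective exploration in LLM reasoning | reinforcement learning, large language model
13 | Characterizing Pattern Matching and Its Limits on Compositional Task Structures | Proposes a formal framework for pattern matching to address generalization in compositional tasks | Mamba, chain-of-thought
14 | ESLM: Risk-Averse Selective Language Modeling for Efficient Pretraining | Proposes ESLM, a risk-averse selective language modeling method for pretraining that improves efficiency and robustness | distillation, large language model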

🔬 Pillar 9: Embodied Foundation Models (8 papers)

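# | Title | One-line summary | Tags | 🔗
15 | Multimodal Federated Learning With Missing Modalities through Feature Imputation Network | Proposes a multimodal federated learning method based on a feature imputation network to handle missing modalities | multimodal
16 | BASE-Q: Bias and Asymmetric Scaling Enhanced Rotational Quantization for Large Language Models | BASE-Q: enhances rotational quantization of LLMs through bias correction and asymmetric scaling | large language model
17 | HoPE: Hybrid of Position Embedding for Long Context Vision-Language Models | Proposes HoPE, a hybrid position embedding that improves the performance of long-context vision-language models | large language model, multimodal
18 | LLM Web Dynamics: Tracing Model Collapse in a Network of LLMs | Proposes the LLM Web Dynamics framework for tracing model collapse in a network of LLMs | large language model
19 | Towards Fully FP8 GEMM LLM Training at Scale | Proposes a fully FP8 GEMM LLM training architecture that raises large-scale training throughput while maintaining accuracy | large language model
20 | The Shape of Adversarial Influence: Characterizing LLM Latent Spaces with Persistent Homology | Uses persistent homology to analyze LLM latent spaces, revealing patterns of adversarial influence | large language model
21 | Embracing Imperfection: Simulating Students with Diverse Cognitive Levels Using LLM-based Agents | Proposes a training-free framework for simulating the learning behavior of students at different cognitive levels | large language model
22 | MESS+: Dynamically Learned Inference-Time LLM Routing in Model Zoos with Service Level Guarantees | Proposes MESS+ to optimize LLM request routing while guaranteeing service quality | large language model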

🔬 Pillar 3: Spatial Perception and Semantics (1 paper)

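# | Title | One-line summary | Tags | 🔗
23 | ETS: Open Vocabulary Electroencephalography-To-Text Decoding and Sentiment Classification | ETS: an open-vocabulary EEG-to-text decoding and sentiment classification framework that combines EEG and eye-tracking data | open-vocabulary, open vocabulary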
