cs.LG (2025-05-26)

📊 23 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (14 🔗1) · Pillar 9: Embodied Foundation Models (8 🔗2) · Pillar 3: Spatial Perception & Semantics (1)

🔬 Pillar 2: RL Algorithms & Architecture (14 papers)

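| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Understanding the Performance Gap in Preference Learning: A Dichotomy of RLHF and DPO | Presents a fine-grained theoretical analysis of the performance gap between RLHF and DPO | reinforcement learning, preference learning, RLHF | |
| 2 | Alignment of large language models with constrained learning | Proposes an iterative method based on Lagrangian duality for constrained alignment | RLHF, large language model | |
| 3 | Learning a Pessimistic Reward Model in RLHF | Proposes PET to address reward hacking in offline RLHF | reinforcement learning, offline reinforcement learning, RLHF | |
| 4 | Rotary Masked Autoencoders are Versatile Learners | Proposes the Rotary Masked Autoencoder for time-series learning | representation learning, masked autoencoder, MAE | |
| 5 | JEDI: Latent End-to-end Diffusion Mitigates Agent-Human Performance Asymmetry in Model-Based Reinforcement Learning | Proposes JEDI to address the agent-human performance asymmetry in model-based reinforcement learning | reinforcement learning, world model | |
| 6 | The Limits of Preference Data for Post-Training | Studies the limits that preference data places on post-training optimization and their implications | reinforcement learning, RLHF, large language model | |
| 7 | An Explainable Diagnostic Framework for Neurodegenerative Dementias via Reinforcement-Optimized LLM Reasoning | Proposes an explainable diagnostic framework for neurodegenerative dementias to improve diagnostic transparency | reinforcement learning, distillation, large language model | |
| 8 | DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning | Proposes DISCOVER to address exploration in sparse-reward reinforcement learning | reinforcement learning | |
| 9 | Equivariant Representation Learning for Symmetry-Aware Inference with Guarantees | Proposes an equivariant representation learning framework for symmetry-aware inference | representation learning | |
| 10 | Refining Few-Step Text-to-Multiview Diffusion via Reinforcement Learning | Proposes a reinforcement learning framework to refine few-step text-to-multiview diffusion models | reinforcement learning | |
| 11 | The challenge of hidden gifts in multi-agent reinforcement learning | Proposes a new approach to the hidden-gift problem in multi-agent reinforcement learning | reinforcement learning | |
| 12 | Beyond Markovian: Reflective Exploration via Bayes-Adaptive RL for LLM Reasoning | Proposes a Bayes-adaptive reinforcement learning method to strengthen reflective exploration in LLMs | reinforcement learning, large language model | |
| 13 | Characterizing Pattern Matching and Its Limits on Compositional Task Structures | Proposes a formal framework for pattern matching to address generalization on compositional tasks | Mamba, chain-of-thought | |
| 14 | ESLM: Risk-Averse Selective Language Modeling for Efficient Pretraining | Proposes ESLM to improve the efficiency and robustness of LLM pretraining | distillation, large language model | |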

🔬 Pillar 9: Embodied Foundation Models (8 papers)

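| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 15 | Multimodal Federated Learning With Missing Modalities through Feature Imputation Network | Proposes a lightweight feature translation network to handle missing modalities in multimodal federated learning | multimodal | |
| 16 | BASE-Q: Bias and Asymmetric Scaling Enhanced Rotational Quantization for Large Language Models | Proposes BASE-Q to address bias and clipping errors in LLM quantization | large language model | |
| 17 | HoPE: Hybrid of Position Embedding for Long Context Vision-Language Models | Proposes HoPE to address position encoding in long-video understanding | large language model, multimodal | |
| 18 | LLM Web Dynamics: Tracing Model Collapse in a Network of LLMs | Proposes the LLM Web Dynamics framework to address model collapse | large language model | |
| 19 | Towards Fully FP8 GEMM LLM Training at Scale | Proposes a fully FP8 GEMM LLM training architecture to improve large-scale training efficiency | large language model | |
| 20 | The Shape of Adversarial Influence: Characterizing LLM Latent Spaces with Persistent Homology | Uses persistent homology to characterize adversarial influence in LLMs | large language model | |
| 21 | Embracing Imperfection: Simulating Students with Diverse Cognitive Levels Using LLM-based Agents | Proposes an LLM-based framework to simulate the learning behavior of students at different cognitive levels | large language model | |
| 22 | MESS+: Dynamically Learned Inference-Time LLM Routing in Model Zoos with Service Level Guarantees | Proposes MESS+ to optimize LLM request routing while guaranteeing service quality | large language model | |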

🔬 Pillar 3: Spatial Perception & Semantics (1 paper)

| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 23 | ETS: Open Vocabulary Electroencephalography-To-Text Decoding and Sentiment Classification | Proposes the ETS framework for open-vocabulary EEG-to-text decoding | open-vocabulary | |
