cs.LG (2025-12-16)

📊 13 papers in total | 🔗 4 with code

🎯 Navigation by interest area

Pillar 2: RL Algorithms and Architecture (9 🔗3) · Pillar 9: Embodied Foundation Models (4 🔗1)

🔬 Pillar 2: RL Algorithms and Architecture (9 papers)

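| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Understanding the Gain from Data Filtering in Multimodal Contrastive Learning | Proposes teacher-model-based filtering to improve multimodal contrastive learning | representation learning, contrastive learning, multimodal | |
| 2 | Joint Multimodal Contrastive Learning for Robust Spoken Term Detection and Keyword Spotting | Proposes a joint multimodal contrastive learning framework that improves the robustness and efficiency of speech retrieval tasks | contrastive learning, multimodal | |
| 3 | A First-Order Logic-Based Alternative to Reward Models in RLHF | Proposes S-GRPO, based on logical similarity, as a replacement for the reward model in RLHF to improve alignment | reinforcement learning, PPO, preference learning | |
| 4 | EXAONE Path 2.5: Pathology Foundation Model with Multi-Omics Alignment | EXAONE Path 2.5: a pathology foundation model with multi-omics alignment for a more comprehensive understanding of tumor biology | contrastive learning, foundation model, multimodal | |
| 5 | Understanding and Improving Hyperbolic Deep Reinforcement Learning | Proposes Hyper++ to address gradient instability and training difficulty in hyperbolic deep reinforcement learning | reinforcement learning, deep reinforcement learning, PPO | |
| 6 | Model-Based Reinforcement Learning in Discrete-Action Non-Markovian Reward Decision Processes | Proposes the QR-MAX algorithm for model learning and policy optimization in discrete-action non-Markovian reward decision processes | reinforcement learning, model-based RL | |
| 7 | ParaFormer: A Generalized PageRank Graph Transformer for Graph Representation Learning | Proposes ParaFormer, a PageRank-enhanced graph Transformer that mitigates over-smoothing in graph representation learning | representation learning | |
| 8 | Kinetic-Mamba: Mamba-Assisted Predictions of Stiff Chemical Kinetics | Kinetic-Mamba: uses the Mamba architecture to predict stiff chemical kinetics, improving combustion simulation accuracy | Mamba | |
| 9 | Explainable Preference Learning: a Decision Tree-based Surrogate Model for Preferential Bayesian Optimization | Proposes a decision-tree-based interpretable preference learning model for preferential Bayesian optimization | preference learning | |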

🔬 Pillar 9: Embodied Foundation Models (4 papers)

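| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 10 | Cornserve: Efficiently Serving Any-to-Any Multimodal Models | Cornserve: an online serving system for efficiently serving any-to-any multimodal models | large language model, multimodal | |
| 11 | Estimating problem difficulty without ground truth using Large Language Model comparisons | Proposes LLM compare to estimate problem difficulty when no ground truth is available | large language model | |
| 12 | RePo: Language Models with Context Re-Positioning | Proposes RePo, which uses context re-positioning to strengthen language models' handling of noisy context, structured data, and long text | large language model | |
| 13 | FLAME: Flow Enhanced Legendre Memory Models for General Time Series Forecasting | FLAME: flow-enhanced Legendre memory models for general time series forecasting | foundation model | |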
