cs.LG (2025-04-03)

📊 20 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (15 🔗1) · Pillar 9: Embodied Foundation Models (5 🔗1)

🔬 Pillar 2: RL & Architecture (15 papers)

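| # | Title | Key point | Tags | 🔗 |
|---|-------|-----------|------|----|
| 1 | SCMPPI: Supervised Contrastive Multimodal Framework for Predicting Protein-Protein Interactions | SCMPPI: a supervised contrastive multimodal framework for predicting protein-protein interactions | contrastive learning, multimodal | |
| 2 | GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning | Proposes GPG, a simple yet strong reinforcement learning baseline for model reasoning | reinforcement learning, large language model, multimodal | |
| 3 | Reward Generation via Large Vision-Language Model in Offline Reinforcement Learning | Proposes RG-VLM, which uses a large vision-language model to generate rewards automatically for offline reinforcement learning | reinforcement learning, offline reinforcement learning, RLHF | |
| 4 | Deep Reinforcement Learning via Object-Centric Attention | Proposes OCCAM, a mask-based object-centric attention mechanism that improves the generalization of deep reinforcement learning | reinforcement learning, deep reinforcement learning | |
| 5 | Anomaly Detection in Time Series Data Using Reinforcement Learning, Variational Autoencoder, and Active Learning | Proposes a time-series anomaly detection method based on reinforcement learning, a VAE, and active learning, addressing the difficult parameter tuning and weak generalization of traditional methods | reinforcement learning, deep reinforcement learning, DRL | |
| 6 | Handover and SINR-Aware Path Optimization in 5G-UAV mmWave Communication using DRL | Proposes an AC-DRL-based path optimization method for 5G-UAV mmWave communication that improves SINR and reduces handovers | reinforcement learning, deep reinforcement learning, DRL | |
| 7 | Adapting World Models with Latent-State Dynamics Residuals | ReDRAW: adapts world models using latent-state dynamics residuals to address sim-to-real transfer | reinforcement learning, world model | |
| 8 | Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme | Proposes a transparent reinforcement learning framework for VLMs and builds a comprehensive evaluation scheme | reinforcement learning, large language model | |
| 9 | Safety Modulation: Enhancing Safety in Reinforcement Learning through Cost-Modulated Rewards | Proposes a safe policy optimization algorithm based on cost-modulated rewards that improves safety in reinforcement learning | reinforcement learning | |
| 10 | Improving log-based anomaly detection through learned adaptive filter | Proposes an adaptive filter based on deep reinforcement learning that improves log-based anomaly detection | reinforcement learning, deep reinforcement learning, DRL | |
| 11 | Integrating Human Knowledge Through Action Masking in Reinforcement Learning for Operations Research | Proposes a reinforcement learning method based on action masking that integrates human knowledge to solve operations research problems | reinforcement learning | |
| 12 | Solving the Paint Shop Problem with Flexible Management of Multi-Lane Buffers Using Reinforcement Learning and Action Masking | Proposes a method for flexible management of multi-lane buffers, based on reinforcement learning and action masking, to solve the paint shop problem | reinforcement learning | |
| 13 | Hierarchical Policy-Gradient Reinforcement Learning for Multi-Agent Shepherding Control of Non-Cohesive Targets | Proposes a hierarchical policy-gradient reinforcement learning method for multi-agent shepherding control of non-cohesive targets | reinforcement learning | |
| 14 | Reinforcement Learning for Solving the Pricing Problem in Column Generation: Applications to Vehicle Routing | Proposes a reinforcement learning method for solving the pricing problem in column generation, applied to vehicle routing | reinforcement learning | |
| 15 | Low-cost Embedded Breathing Rate Determination Using 802.15.4z IR-UWB Hardware for Remote Healthcare | Proposes a low-cost embedded breathing-rate detection scheme based on IR-UWB and a CNN for remote healthcare | MAE, PULSE | |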

🔬 Pillar 9: Embodied Foundation Models (5 papers)

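| # | Title | Key point | Tags | 🔗 |
|---|-------|-----------|------|----|
| 16 | FAST: Federated Active Learning with Foundation Models for Communication-efficient Sampling and Training | FAST: a communication-efficient active learning framework based on federated learning and foundation models | foundation model | |
| 17 | Reinforcement Fine-Tuning for Materials Design | Proposes CrystalFormer-RL, which uses reinforcement fine-tuning to improve the design capability of materials generation models | large language model, instruction following | |
| 18 | Efficient Model Editing with Task-Localized Sparse Fine-tuning | Proposes TaLoS, which achieves efficient model editing through task-localized sparse fine-tuning and improves task composition performance | foundation model | |
| 19 | Prompt Optimization with Logged Bandit Data | Proposes a kernel-based off-policy gradient method that uses user feedback to optimize LLM prompts, improving personalized sentence generation | large language model | |
| 20 | ZClip: Adaptive Spike Mitigation for LLM Pre-Training | Proposes ZClip, an adaptive gradient clipping algorithm that addresses gradient explosion in LLM pre-training | large language model | |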
