cs.LG(2025-03-26)

📊 20 papers in total

🎯 Interest Area Navigation

Pillar 2: RL Algorithms and Architecture (11) · Pillar 9: Embodied Foundation Models (6) · Pillar 1: Robot Control (2) · Pillar 4: Generative Motion (1)

🔬 Pillar 2: RL Algorithms and Architecture (11 papers)

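| # | Title | Key point | Tags |
|---|---|---|---|
| 1 | Reasoning Beyond Limits: Advances and Open Problems for LLMs | Surveys advances and challenges in LLM reasoning, focusing on multilingual, long-context, and unsupervised reasoning. | reinforcement learning Mamba SSM |
| 2 | Model-Based Offline Reinforcement Learning with Adversarial Data Augmentation | MORAL: model-based offline reinforcement learning with adversarial data augmentation to improve policy robustness. | reinforcement learning policy learning offline RL |
| 3 | Robust Deep Reinforcement Learning in Robotics via Adaptive Gradient-Masked Adversarial Attacks | Proposes adaptive gradient-masked adversarial attacks to strengthen the robustness of deep reinforcement learning in robotics. | reinforcement learning deep reinforcement learning DRL |
| 4 | State-Aware Perturbation Optimization for Robust Deep Reinforcement Learning | Proposes STAR, a state-aware perturbation optimization method that improves DRL robustness in adversarial settings. | reinforcement learning deep reinforcement learning DRL |
| 5 | Zero-Shot LLMs in Human-in-the-Loop RL: Replacing Human Feedback for Reward Shaping | Proposes the LLM-HFBF framework, which uses zero-shot LLMs for RL reward shaping and corrects biases in human feedback. | reinforcement learning reward shaping large language model |
| 6 | World Model Agents with Change-Based Intrinsic Motivation | World model agents driven by exploratory rewards, improving learning in sparse-reward environments. | reinforcement learning world model dreamer |
| 7 | Innovative LSGTime Model for Crime Spatiotemporal Prediction Based on MindSpore Framework | Proposes the LSGTime model, built on the MindSpore framework, for spatiotemporal crime prediction. | MAE spatiotemporal |
| 8 | Reinforcement Learning for Efficient Toxicity Detection in Competitive Online Video Games | Proposes a reinforcement-learning-based contextual bandit algorithm for efficiently detecting toxic behavior in online games. | reinforcement learning |
| 9 | Cyborg Data: Merging Human with AI Generated Training Data | Proposes Cyborg Data, which merges human and AI-generated data to improve the efficiency of automated scoring systems. | distillation large language model |
| 10 | Harmonia: A Multi-Agent Reinforcement Learning Approach to Data Placement and Migration in Hybrid Storage Systems | Harmonia: a multi-agent reinforcement learning approach to data placement and migration in hybrid storage systems. | reinforcement learning |
| 11 | Offline Action-Free Learning of Ex-BMDPs by Comparing Diverse Datasets | CRAFT: learns effective representations for Ex-BMDPs offline by comparing diverse datasets. | policy learning representation learning |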

🔬 Pillar 9: Embodied Foundation Models (6 papers)

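| # | Title | Key point | Tags |
|---|---|---|---|
| 12 | MoRE-LLM: Mixture of Rule Experts Guided by a Large Language Model | Proposes MoRE-LLM, which combines rule experts with an LLM to improve the trustworthiness and interpretability of AI systems. | large language model |
| 13 | Assessing Generative Models for Structured Data | Proposes an evaluation framework that reveals shortcomings of LLM-generated tabular data in capturing inter-column dependencies. | large language model |
| 14 | A Theoretical Framework for Prompt Engineering: Approximating Smooth Functions with Transformer Prompts | Proposes a theoretical framework for Transformer prompts, proving that they can approximate smooth functions and explaining prompt engineering techniques. | large language model |
| 15 | Injecting Adrenaline into LLM Serving: Boosting Resource Utilization and Throughput via Attention Disaggregation | Adrenaline: improves resource utilization and throughput in LLM serving through attention disaggregation. | large language model |
| 16 | TeleLoRA: Teleporting Model-Specific Alignment Across LLMs | TeleLoRA: achieves zero-shot Trojan mitigation across LLMs by transferring model-specific alignment data. | large language model |
| 17 | Maya: Optimizing Deep Learning Training Workloads using GPU Runtime Emulation | Maya: optimizes deep learning training workloads using GPU runtime emulation. | foundation model |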

🔬 Pillar 1: Robot Control (2 papers)

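| # | Title | Key point | Tags |
|---|---|---|---|
| 18 | Offline Reinforcement Learning with Discrete Diffusion Skills | Proposes an offline reinforcement learning method based on discrete diffusion skills, improving performance on long-horizon tasks. | locomotion reinforcement learning offline RL |
| 19 | Look Before Leap: Look-Ahead Planning with Uncertainty in Reinforcement Learning | Proposes uncertainty-aware model predictive control to improve the sample efficiency of reinforcement learning on complex tasks. | manipulation reinforcement learning policy learning |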

🔬 Pillar 4: Generative Motion (1 paper)

| # | Title | Key point | Tags |
|---|---|---|---|
| 20 | PowerGNN: A Topology-Aware Graph Neural Network for Electricity Grids | PowerGNN: a topology-aware graph neural network for power system state prediction. | penetration |
