cs.AI (2026-04-28)
📊 31 papers in total | 🔗 2 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (20 🔗2)
Pillar 2: RL & Architecture (9)
Pillar 1: Robot Control (2)
🔬 Pillar 9: Embodied Foundation Models (20 papers)
🔬 Pillar 2: RL & Architecture (9 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
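| 21 | Three Models of RLHF Annotation: Extension, Evidence, and Authority | Proposes three models of RLHF annotation to improve the reinforcement learning from human feedback pipeline | reinforcement learning, RLHF, large language model | ||
| 22 | How Can Reinforcement Learning Achieve Expert-level Placement? | Proposes a reinforcement learning method based on learning from expert layouts to improve chip placement quality | reinforcement learning, reward design | ||
| 23 | Semi-Markov Reinforcement Learning for City-Scale EV Ride-Hailing with Feasibility-Guaranteed Actions | Proposes a semi-Markov reinforcement learning method for city-scale EV ride-hailing control that guarantees action feasibility | reinforcement learning, SAC | ||
| 24 | Sample-efficient Neuro-symbolic Proximal Policy Optimization | Proposes neuro-symbolic proximal policy optimization to improve DRL sample efficiency on sparse-reward, long-horizon planning tasks | reinforcement learning, deep reinforcement learning, DRL | ||
| 25 | Improving Zero-Shot Offline RL via Behavioral Task Sampling | Proposes a zero-shot offline reinforcement learning method based on behavioral task sampling to improve generalization | reinforcement learning, offline RL | ||
| 26 | RADD: Retrieval-Augmented Discrete Diffusion for Multi-Modal Knowledge Graph Completion | Proposes the RADD framework, which decouples retrieval from re-ranking to improve multi-modal knowledge graph completion | distillation, multimodal | ||
| 27 | JURY-RL: Votes Propose, Proofs Dispose for Label-Free RLVR | JURY-RL: label-free reinforcement learning in which votes propose answers and formal verification decides | reinforcement learning, large language model | ||
| 28 | Multi-action Tangled Program Graphs for Multi-task Reinforcement Learning with Continuous Control | Proposes the MATPG algorithm, based on multi-action tangled program graphs, for multi-task reinforcement learning with continuous control | reinforcement learning | ||
| 29 | Evaluating Risks in Weak-to-Strong Alignment: A Bias-Variance Perspective | Evaluates risks in weak-to-strong alignment from a bias-variance perspective, showing that strong-model variance is an early warning signal for deceptive errors | reinforcement learning, RLHF | ||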
🔬 Pillar 1: Robot Control (2 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 30 | Large language models eroding science understanding: an experimental study | Large language models are susceptible to pseudoscience, which undermines science understanding | manipulation, large language model | ||
| 31 | PHISHREV: A Hybrid Machine Learning and Post-Hoc Non-monotonic Reasoning Framework for Context-Aware Phishing Website Classification | Proposes the PHISHREV framework to address contextual reasoning in phishing website classification | manipulation | ||