cs.AI (2026-02-07)
📊 28 papers in total | 🔗 6 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (17 🔗4)
Pillar 2: RL & Architecture (10 🔗2)
Pillar 1: Robot Control (1)
🔬 Pillar 9: Embodied Foundation Models (17 papers)
🔬 Pillar 2: RL & Architecture (10 papers)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
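| 18 | SleepMaMi: A Universal Sleep Foundation Model for Integrating Macro- and Micro-structures | Proposes SleepMaMi, a sleep foundation model that integrates macro-level sleep structure with micro-level signal features to improve the generality of sleep analysis. | masked autoencoder MAE contrastive learning | | |
| 19 | Joint Reward Modeling: Internalizing Chain-of-Thought for Efficient Visual Reward Models | Proposes Joint Reward Modeling (JRM) to improve the efficiency and semantic understanding of visual reward models on complex tasks such as image editing. | reinforcement learning preference learning chain-of-thought | | |
| 20 | Debugging code world models | Investigates the root causes of errors in code world models and offers recommendations for improving supervision and state representation. | world model chain-of-thought | | |
| 21 | High Fidelity Textual User Representation over Heterogeneous Sources via Reinforcement Learning | Proposes a reinforcement-learning-based textual user representation method that addresses heterogeneous data source fusion and LLM compatibility. | reinforcement learning large language model | | |
| 22 | Secure Code Generation via Online Reinforcement Learning with Vulnerability Reward Model | SecCoderX: a secure code generation framework based on online reinforcement learning and a vulnerability reward model. | reinforcement learning large language model | ✅ | |
| 23 | RAPiD: Real-time Deterministic Trajectory Planning via Diffusion Behavior Priors for Safe and Efficient Autonomous Driving | RAPiD: real-time deterministic trajectory planning based on diffusion behavior priors for safe and efficient autonomous driving. | policy learning imitation learning multimodal | ✅ | |
| 24 | VERIFY-RL: Verifiable Recursive Decomposition for Reinforcement Learning in Mathematical Reasoning | VERIFY-RL: a reinforcement learning method based on verifiable recursive decomposition that improves mathematical reasoning. | reinforcement learning curriculum learning | | |
| 25 | Semantic Search At LinkedIn | LinkedIn proposes an LLM-based semantic search framework that significantly improves the efficiency of AI job and talent search. | distillation large language model | | |
| 26 | EventCast: Hybrid Demand Forecasting in E-Commerce with LLM-Based Event Knowledge | EventCast: uses LLM-based event knowledge to enhance hybrid demand forecasting in e-commerce. | MAE large language model | | |
| 27 | Adaptive Scaffolding for Cognitive Engagement in an Intelligent Tutoring System | Proposes adaptive scaffolding that dynamically selects instructional examples to increase students' cognitive engagement in an intelligent tutoring system. | reinforcement learning deep reinforcement learning DRL | | |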
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 28 | Agent-Fence: Mapping Security Vulnerabilities Across Deep Research Agents | Proposes AgentFence to evaluate security vulnerabilities in deep research agents. | manipulation large language model | | |