cs.AI (2026-03-18)

📊 17 papers in total

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (9) · Pillar 9: Embodied Foundation Models (7) · Pillar 1: Robot Control (1)

🔬 Pillar 2: RL Algorithms & Architecture (9 papers)

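| # | Title | Key point | Tags |
|---|-------|-----------|------|
| 1 | VLM2Rec: Resolving Modality Collapse in Vision-Language Model Embedders for Multimodal Sequential Recommendation | VLM2Rec: resolves modality collapse in vision-language model embedders for multimodal sequential recommendation | contrastive learning, geometric consistency, large language model |
| 2 | MALLES: A Multi-agent LLMs-based Economic Sandbox with Consumer Preference Alignment | Proposes MALLES, a multi-agent LLM-based economic sandbox aligned with consumer preferences | preference learning, large language model, multimodal |
| 3 | Contrastive Reasoning Alignment: Reinforcement Learning from Hidden Representations | CRAFT: uses contrastive reasoning alignment over hidden representations to make large language models more robust to jailbreak attacks | reinforcement learning, representation learning |
| 4 | CodeScout: An Effective Recipe for Reinforcement Learning of Code Search Agents | CodeScout: efficient code search using reinforcement learning and a standard Unix terminal | reinforcement learning, reward design |
| 5 | CRE-T1 Preview Technical Report: Beyond Contrastive Learning for Reasoning-Intensive Retrieval | Proposes Thought 1 (T1), which improves reasoning-intensive retrieval performance through dynamic reasoning generation | reinforcement learning, contrastive learning |
| 6 | Sensi: Learn One Thing at a Time -- Curriculum-Based Test-Time Learning for LLM Game Agents | Sensi: a curriculum-based test-time learning method for LLM game agents | curriculum learning, large language model |
| 7 | A Progressive Visual-Logic-Aligned Framework for Ride-Hailing Adjudication | Proposes the RideJudge framework to address transparency and accuracy in ride-hailing accident liability adjudication | reinforcement learning, multimodal |
| 8 | From Digital Twins to World Models: Opportunities, Challenges, and Applications for Mobile Edge General Intelligence | Explores the evolution from digital twins to world models to empower mobile edge general intelligence | world model |
| 9 | InfoDensity: Rewarding Information-Dense Traces for Efficient Reasoning | InfoDensity: improves LLM efficiency by rewarding information-dense reasoning traces | reinforcement learning, large language model |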

🔬 Pillar 9: Embodied Foundation Models (7 papers)

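| # | Title | Key point | Tags |
|---|-------|-----------|------|
| 10 | FailureMem: A Failure-Aware Multimodal Framework for Autonomous Software Repair | FailureMem: a failure-aware multimodal framework for autonomous software repair | multimodal, visual grounding |
| 11 | Towards Safer Large Reasoning Models by Promoting Safety Decision-Making before Chain-of-Thought Generation | Proposes a safety alignment method that improves the safety of chain-of-thought large language models at inference time | chain-of-thought |
| 12 | Deployment and Evaluation of an EHR-integrated, Large Language Model-Powered Tool to Triage Surgical Patients | Uses an EHR-integrated LLM tool to assist surgical patient triage and improve surgical co-management efficiency | large language model |
| 13 | Differential Privacy in Generative AI Agents: Analysis and Optimal Tradeoffs | Proposes a differential-privacy-based analysis of privacy leakage and optimal tradeoffs for generative AI agents | large language model |
| 14 | Facts as First Class Objects: Knowledge Objects for Persistent LLM Memory | Proposes Knowledge Objects (KOs) as persistent LLM memory, addressing the capacity, compression, and goal-drift problems of in-context memory | large language model |
| 15 | VeriGrey: Greybox Agent Validation | VeriGrey: a greybox approach for validating LLM agents and uncovering security risks | large language model |
| 16 | A Contextual Help Browser Extension to Assist Digital Illiterate Internet Users | Proposes a contextual help browser extension that helps users with low digital literacy understand technical terms | large language model |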

🔬 Pillar 1: Robot Control (1 paper)

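| # | Title | Key point | Tags |
|---|-------|-----------|------|
| 17 | Physics-informed offline reinforcement learning eliminates catastrophic fuel waste in maritime routing | PIER: physics-informed offline reinforcement learning eliminates catastrophic fuel waste in maritime shipping | trajectory optimization, reinforcement learning, offline reinforcement learning |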
