cs.AI (2026-02-27)

📊 27 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (16 🔗1) | Pillar 2: RL Algorithms & Architecture (11 🔗1)

🔬 Pillar 9: Embodied Foundation Models (16 papers)

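# | Title | One-sentence summary | Tags | 🔗
1 | Uncertainty Quantification for Multimodal Large Language Models with Incoherence-adjusted Semantic Volume | Proposes the UMPIRE framework, which quantifies uncertainty in multimodal large language models via incoherence-adjusted semantic volume. | large language model, multimodal |
2 | Unlocking Cognitive Capabilities and Analyzing the Perception-Logic Trade-off | Proposes MERaLiON2-Omni to address the challenges of multimodal perception and reasoning for Southeast Asia. | large language model, multimodal, instruction following |
3 | Reasoning-Driven Multimodal LLM for Domain Generalization | Proposes the RD-MLDG framework, which uses the reasoning ability of multimodal LLMs to improve domain generalization. | large language model, multimodal |
4 | Ask don't tell: Reducing sycophancy in large language models | Proposes a simple, effective input-level intervention that significantly reduces sycophancy in large language models. | large language model |
5 | PseudoAct: Leveraging Pseudocode Synthesis for Flexible Planning and Action Control in Large Language Model Agents | PseudoAct: uses pseudocode synthesis to improve planning and action control in LLM agents. | large language model |
6 | MMKG-RDS: Reasoning Data Synthesis via Deep Mining of Multimodal Knowledge Graphs | Proposes the MMKG-RDS framework, which synthesizes reasoning data by deep mining of multimodal knowledge graphs to improve the reasoning ability of domain models. | multimodal |
7 | SleepLM: Natural-Language Intelligence for Human Sleep | Proposes SleepLM, which enables alignment, interpretation, and interaction for human sleep through natural-language intelligence. | foundation model, multimodal |
8 | Artificial Agency Program: Curiosity, compression, and communication in agents | Proposes the Artificial Agency Program (AAP), which builds reality-embedded, resource-bounded AI systems through curiosity-driven agent learning. | multimodal |
9 | Data Driven Optimization of GPU efficiency for Distributed LLM Adapter Serving | Proposes a data-driven GPU optimization method for distributed LLM adapter serving that improves resource efficiency. | large language model |
10 | Enhancing Continual Learning for Software Vulnerability Prediction: Addressing Catastrophic Forgetting via Hybrid-Confidence-Aware Selective Replay for Temporal LLM Fine-Tuning | Proposes Hybrid-CASR to address catastrophic forgetting in LLMs for software vulnerability prediction. | large language model |
11 | LemmaBench: A Live, Research-Level Benchmark to Evaluate LLM Capabilities in Mathematics | Proposes LemmaBench, a live, research-level benchmark for evaluating the mathematical capabilities of LLMs. | large language model |
12 | HotelQuEST: Balancing Quality and Efficiency in Agentic Search | HotelQuEST: a benchmark for agentic search that balances quality and efficiency. | large language model |
13 | SLA-Aware Distributed LLM Inference Across Device-RAN-Cloud | Proposes an SLA-aware distributed LLM inference scheme for heterogeneous Device-RAN-Cloud environments. | embodied AI |
14 | ODAR: Principled Adaptive Routing for LLM Reasoning via Active Inference | Proposes ODAR-Expert, which performs adaptive routing for LLM reasoning via active inference to optimize compute efficiency. | large language model |
15 | AudioCapBench: Quick Evaluation on Audio Captioning across Sound, Music, and Speech | AudioCapBench: a benchmark for quick evaluation of audio captioning across sound, music, and speech. | multimodal |
16 | ReDON: Recurrent Diffractive Optical Neural Processor with Reconfigurable Self-Modulated Nonlinearity | Proposes ReDON, a recurrent diffractive optical neural processor with reconfigurable self-modulated nonlinearity. | large language model |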

🔬 Pillar 2: RL Algorithms & Architecture (11 papers)

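# | Title | One-sentence summary | Tags | 🔗
17 | EMO-R3: Reflective Reinforcement Learning for Emotional Reasoning in Multimodal Large Language Models | Proposes the EMO-R3 framework, which improves the reasoning of multimodal large language models in visual emotion understanding. | reinforcement learning, large language model, multimodal |
18 | Pessimistic Auxiliary Policy for Offline Reinforcement Learning | Proposes a pessimistic auxiliary policy to address overestimation in offline reinforcement learning. | reinforcement learning, offline RL, offline reinforcement learning |
19 | DARE-bench: Evaluating Modeling and Instruction Fidelity of LLMs in Data Science | DARE-bench: a benchmark evaluating LLM modeling and instruction fidelity in data science. | reinforcement learning, large language model, instruction following |
20 | ProductResearch: Training E-Commerce Deep Research Agents via Multi-Agent Synthetic Trajectory Distillation | Proposes the ProductResearch framework, which trains e-commerce deep research agents via multi-agent synthetic trajectory distillation. | distillation, large language model |
21 | RF-Agent: Automated Reward Function Design via Language Agent Tree Search | Proposes RF-Agent, which uses language agent tree search to automatically design reinforcement learning reward functions. | reward design, large language model |
22 | Construct, Merge, Solve & Adapt with Reinforcement Learning for the min-max Multiple Traveling Salesman Problem | Proposes RL-CMSA, a reinforcement-learning-based Construct, Merge, Solve & Adapt algorithm, for the min-max multiple traveling salesman problem. | reinforcement learning |
23 | Learning Flexible Job Shop Scheduling under Limited Buffers and Material Kitting Constraints | Proposes a DRL method based on heterogeneous graph networks for flexible job shop scheduling under limited buffers and material kitting constraints. | reinforcement learning, deep reinforcement learning, DRL |
24 | Portfolio Reinforcement Learning with Scenario-Context Rollout | Proposes a reinforcement learning method with macro-conditioned scenario-context rollout to improve portfolio robustness under abrupt market shifts. | reinforcement learning |
25 | The Auton Agentic AI Framework | Auton: a general AI framework for building, executing, and governing autonomous agent systems. | reinforcement learning, large language model |
26 | Green or Fast? Learning to Balance Cold Starts and Idle Carbon in Serverless Computing | LACE-RL: a reinforcement-learning-based framework for balancing cold starts and carbon emissions in serverless computing. | reinforcement learning, deep reinforcement learning |
27 | RUMAD: Reinforcement-Unifying Multi-Agent Debate | RUMAD: a reinforcement-learning-based multi-agent debate framework that improves efficiency and generalization. | reinforcement learning, PPO |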
