cs.AI (2025-11-28)

📊 28 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (21 🔗2) · Pillar 2: RL & Architecture (6) · Pillar 5: Interaction & Reaction (1)

🔬 Pillar 9: Embodied Foundation Models (21 papers)

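| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | OctoMed: Data Recipes for State-of-the-Art Multimodal Medical Reasoning | OctoMed: achieving state-of-the-art multimodal medical reasoning through data recipes. | large language model, multimodal | |
| 2 | TIM-PRM: Verifying multimodal reasoning with Tool-Integrated PRM | Proposes TIM-PRM, which actively verifies multimodal reasoning through tool integration to address hallucination and logical inconsistency. | large language model, multimodal | |
| 3 | Chunking Strategies for Multimodal AI Systems | Surveys data chunking strategies in multimodal AI systems, providing a technical foundation for efficient multimodal system design. | multimodal | |
| 4 | Finetuning Large Language Models for Automated Depression Screening in Nigerian Pidgin English: GENSCORE Pilot Study | Fine-tunes large language models for automated depression screening in Nigerian Pidgin: the GENSCORE pilot study. | large language model | |
| 5 | Reasoning Under Pressure: How do Training Incentives Influence Chain-of-Thought Monitorability? | Studies how training incentives affect chain-of-thought monitorability and proposes a new method for evaluating monitorability. | chain-of-thought | |
| 6 | Reasoning in Action: MCTS-Driven Knowledge Retrieval for Large Language Models | Proposes an MCTS-based knowledge retrieval method to improve LLM reasoning in dialogue. | large language model | |
| 7 | AgriCoT: A Chain-of-Thought Benchmark for Evaluating Reasoning in Vision-Language Models for Agriculture | Proposes AgriCoT, a benchmark for evaluating the reasoning of vision-language models in agriculture. | chain-of-thought | |
| 8 | Asm2SrcEval: Evaluating Large Language Models for Assembly-to-Source Code Translation | Asm2SrcEval: the first large-scale benchmark for evaluating LLMs on assembly-to-source code translation. | large language model | |
| 9 | Generating Verifiable Chain of Thoughts from Execution-Traces | Proposes a method for generating verifiable chains of thought from execution traces to improve code reasoning. | chain-of-thought | |
| 10 | SimClinician: A Multimodal Simulation Testbed for Reliable Psychologist AI Collaboration in Mental Health Diagnosis | SimClinician: a multimodal simulation testbed for reliable psychologist-AI collaboration in mental health diagnosis. | multimodal | |
| 11 | Efficient Asynchronous Federated Evaluation with Strategy Similarity Awareness for Intent-Based Networking in Industrial Internet of Things | Proposes the FEIBN framework, which uses strategy-similarity-aware federated learning to improve the efficiency of intent-based networking in the Industrial IoT. | large language model, multimodal | |
| 12 | LegalWebAgent: Empowering Access to Justice via LLM-Based Web Agents | LegalWebAgent: empowering access to judicial services with LLM-driven web agents. | large language model, multimodal | |
| 13 | Serving Heterogeneous LoRA Adapters in Distributed LLM Inference Systems | LoRAServe: a workload-aware framework for dynamic placement and routing of LoRA adapters that addresses performance skew in heterogeneous LoRA serving. | large language model | |
| 14 | CodeFlowLM: Incremental Just-In-Time Defect Prediction with Pretrained Language Models and Exploratory Insights into Defect Localization | CodeFlowLM: incremental just-in-time defect prediction with pretrained language models. | large language model | |
| 15 | Writing in Symbiosis: Mapping Human Creative Agency in the AI Era | Reveals creative patterns in the era of human-AI symbiosis by analyzing how human writing styles evolve. | large language model | |
| 16 | Evaluating LLMs for One-Shot Patching of Real and Artificial Vulnerabilities | Evaluates large language models on one-shot patching of real and artificial vulnerabilities. | large language model | |
| 17 | Retrieval-Augmented Few-Shot Prompting Versus Fine-Tuning for Code Vulnerability Detection | Proposes retrieval-augmented few-shot prompting for code vulnerability detection, which outperforms fine-tuned models. | large language model | |
| 18 | Autonomous QA Agent: A Retrieval-Augmented Framework for Reliable Selenium Script Generation | Proposes Autonomous QA Agent, which uses RAG to improve the reliability of Selenium script generation. | large language model | |
| 19 | MindPower: Enabling Theory-of-Mind Reasoning in VLM-based Embodied Agents | MindPower: enabling theory-of-mind reasoning in VLM-based embodied agents. | multimodal | |
| 20 | AgentShield: Make MAS more secure and efficient | AgentShield: an efficient and secure distributed framework for protecting LLM-based multi-agent systems. | large language model | |
| 21 | InsightEval: An Expert-Curated Benchmark for Assessing Insight Discovery in LLM-Driven Data Agents | InsightEval: an expert-curated benchmark for assessing insight discovery in LLM-driven data agents. | large language model | |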

🔬 Pillar 2: RL & Architecture (6 papers)

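| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 22 | Optimizing Information Asset Investment Strategies in the Exploratory Phase of the Oil and Gas Industry: A Reinforcement Learning Approach | Proposes a multi-agent deep reinforcement learning strategy for optimizing information asset investment in oil and gas exploration. | reinforcement learning, deep reinforcement learning, DRL | |
| 23 | Thinking by Doing: Building Efficient World Model Reasoning in LLMs via Multi-turn Interaction | Proposes WMAct, which improves LLM world-model reasoning in complex environments through efficient interaction. | world model, large language model | |
| 24 | Evolutionary Discovery of Heuristic Policies for Traffic Signal Control | Proposes Temporal Policy Evolution for Traffic (TPET), which uses LLMs to evolve traffic signal control policies. | reinforcement learning, deep reinforcement learning, DRL | |
| 25 | Towards Continuous Intelligence Growth: Self-Training, Continual Learning, and Dual-Scale Memory in SuperIntelliAgent | SuperIntelliAgent: continuous intelligence growth through self-training, continual learning, and dual-scale memory. | DPO, direct preference optimization, large language model | |
| 26 | Peer-to-Peer Energy Trading in Dairy Farms using Multi-Agent Reinforcement Learning | Proposes a multi-agent reinforcement learning method for P2P energy trading that optimizes energy management on dairy farms. | reinforcement learning, PPO | |
| 27 | Distillation-based Scenario-Adaptive Mixture-of-Experts for the Matching Stage of Multi-scenario Recommendation | Proposes DSMOE, a distillation-based scenario-adaptive mixture-of-experts model that improves matching in multi-scenario recommendation. | distillation | |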

🔬 Pillar 5: Interaction & Reaction (1 paper)

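| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 28 | One-Shot Secure Aggregation: A Hybrid Cryptographic Protocol for Private Federated Learning in IoT | Proposes the Hyb-Agg protocol, which uses hybrid cryptography to achieve efficient and secure federated learning in IoT. | OMOMO | |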
