cs.AI (2025-11-28)

📊 28 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (21 🔗2) · Pillar 2: RL & Architecture (6) · Pillar 5: Interaction & Reaction (1)

🔬 Pillar 9: Embodied Foundation Models (21 papers)

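#	Title	One-Sentence Summary	Tags	🔗
1	OctoMed: Data Recipes for State-of-the-Art Multimodal Medical Reasoning	OctoMed: achieves state-of-the-art multimodal medical reasoning through data recipes.	large language model, multimodal
2	TIM-PRM: Verifying multimodal reasoning with Tool-Integrated PRM	Proposes TIM-PRM, which actively verifies multimodal reasoning through tool integration to address hallucination and logical inconsistency.	large language model, multimodal
3	Chunking Strategies for Multimodal AI Systems	Surveys data chunking strategies in multimodal AI systems, providing a technical foundation for efficient multimodal system design.	multimodal
4	Finetuning Large Language Models for Automated Depression Screening in Nigerian Pidgin English: GENSCORE Pilot Study	Fine-tunes large language models for automated depression screening in Nigerian Pidgin English: the GENSCORE pilot study.	large language model
5	Reasoning Under Pressure: How do Training Incentives Influence Chain-of-Thought Monitorability?	Studies how training incentives affect chain-of-thought monitorability and proposes a new method for evaluating monitorability.	chain-of-thought	✅
6	Reasoning in Action: MCTS-Driven Knowledge Retrieval for Large Language Models	Proposes an MCTS-based knowledge retrieval method to improve LLM reasoning in dialogue.	large language model
7	AgriCoT: A Chain-of-Thought Benchmark for Evaluating Reasoning in Vision-Language Models for Agriculture	Proposes AgriCoT, a benchmark for evaluating the reasoning of vision-language models in agriculture.	chain-of-thought	✅
8	Asm2SrcEval: Evaluating Large Language Models for Assembly-to-Source Code Translation	Asm2SrcEval: the first large-scale benchmark for evaluating LLMs on assembly-to-source code translation.	large language model
9	Generating Verifiable Chain of Thoughts from Execution-Traces	Proposes a method for generating verifiable chains of thought from execution traces to improve code reasoning.	chain-of-thought
10	SimClinician: A Multimodal Simulation Testbed for Reliable Psychologist AI Collaboration in Mental Health Diagnosis	SimClinician: a multimodal simulation testbed for reliable psychologist-AI collaboration in mental health diagnosis.	multimodal
11	Efficient Asynchronous Federated Evaluation with Strategy Similarity Awareness for Intent-Based Networking in Industrial Internet of Things	Proposes the FEIBN framework, which uses strategy-similarity-aware federated learning to improve the efficiency of intent-based networking in the Industrial Internet of Things.	large language model, multimodal
12	LegalWebAgent: Empowering Access to Justice via LLM-Based Web Agents	LegalWebAgent: empowers access to justice with LLM-driven web agents.	large language model, multimodal
13	Serving Heterogeneous LoRA Adapters in Distributed LLM Inference Systems	LoRAServe: a workload-aware framework for dynamic placement and routing of LoRA adapters that addresses performance skew in heterogeneous LoRA serving.	large language model
14	CodeFlowLM: Incremental Just-In-Time Defect Prediction with Pretrained Language Models and Exploratory Insights into Defect Localization	CodeFlowLM: incremental just-in-time defect prediction with pretrained language models.	large language model
15	Writing in Symbiosis: Mapping Human Creative Agency in the AI Era	Analyzes the evolution of human writing styles to reveal creative patterns in the era of human-AI symbiosis.	large language model
16	Evaluating LLMs for One-Shot Patching of Real and Artificial Vulnerabilities	Evaluates large language models on one-shot patching of real and artificial vulnerabilities.	large language model
17	Retrieval-Augmented Few-Shot Prompting Versus Fine-Tuning for Code Vulnerability Detection	Proposes a retrieval-augmented few-shot prompting method for code vulnerability detection that outperforms fine-tuned models.	large language model
18	Autonomous QA Agent: A Retrieval-Augmented Framework for Reliable Selenium Script Generation	Proposes Autonomous QA Agent, which uses RAG to improve the reliability of Selenium script generation.	large language model
19	MindPower: Enabling Theory-of-Mind Reasoning in VLM-based Embodied Agents	MindPower: enables theory-of-mind reasoning in VLM-based embodied agents.	multimodal
20	AgentShield: Make MAS more secure and efficient	AgentShield: an efficient and secure distributed framework for protecting LLM-based multi-agent systems.	large language model
21	InsightEval: An Expert-Curated Benchmark for Assessing Insight Discovery in LLM-Driven Data Agents	InsightEval: an expert-curated benchmark for assessing insight discovery in LLM-driven data agents.	large language model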

🔬 Pillar 2: RL & Architecture (6 papers)

#	Title	One-Sentence Summary	Tags
22	Optimizing Information Asset Investment Strategies in the Exploratory Phase of the Oil and Gas Industry: A Reinforcement Learning Approach	Proposes a multi-agent deep reinforcement learning strategy for optimizing information asset investment in oil and gas exploration.	reinforcement learning, deep reinforcement learning, DRL
23	Thinking by Doing: Building Efficient World Model Reasoning in LLMs via Multi-turn Interaction	Proposes WMAct, which improves LLM world-model reasoning in complex environments through efficient interaction.	world model, large language model
24	Evolutionary Discovery of Heuristic Policies for Traffic Signal Control	Proposes Temporal Policy Evolution for Traffic (TPET), which uses LLMs to evolve traffic signal control policies.	reinforcement learning, deep reinforcement learning, DRL
25	Towards Continuous Intelligence Growth: Self-Training, Continual Learning, and Dual-Scale Memory in SuperIntelliAgent	SuperIntelliAgent: continuous intelligence growth through self-training, continual learning, and dual-scale memory.	DPO, direct preference optimization, large language model
26	Peer-to-Peer Energy Trading in Dairy Farms using Multi-Agent Reinforcement Learning	Proposes a multi-agent reinforcement learning method for P2P energy trading that optimizes energy management on dairy farms.	reinforcement learning, PPO
27	Distillation-based Scenario-Adaptive Mixture-of-Experts for the Matching Stage of Multi-scenario Recommendation	Proposes DSMOE, a distillation-based scenario-adaptive mixture-of-experts model, to improve matching in multi-scenario recommendation.	distillation

🔬 Pillar 5: Interaction & Reaction (1 paper)

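#	Title	One-Sentence Summary	Tags	🔗
28	One-Shot Secure Aggregation: A Hybrid Cryptographic Protocol for Private Federated Learning in IoT	Proposes the Hyb-Agg protocol, which uses hybrid cryptography to achieve efficient and secure federated learning in IoT.	OMOMO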

