cs.AI (2025-10-14)

📊 44 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (30 🔗2) · Pillar 2: RL & Architecture (10 🔗2) · Pillar 1: Robot Control (2) · Pillar 5: Interaction & Reaction (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (30 papers)

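# | Title | One-line summary | Tags | 🔗
1 | From Literal to Liberal: A Meta-Prompting Framework for Eliciting Human-Aligned Exception Handling in Large Language Models | Proposes the RID framework, which uses meta-prompting to better align LLM exception handling with human intent | large language model, instruction following, chain-of-thought |
2 | Evolution of Meta's LLaMA Models and Parameter-Efficient Fine-Tuning of Large Language Models: A Survey | Surveys the evolution of Meta's LLaMA models and parameter-efficient fine-tuning methods, offering a one-stop resource for LLM researchers | large language model, foundation model, multimodal |
3 | MatSciBench: Benchmarking the Reasoning Ability of Large Language Models in Materials Science | MatSciBench: a benchmark for evaluating LLM reasoning ability in materials science | large language model, multimodal, chain-of-thought |
4 | GenCellAgent: Generalizable, Training-Free Cellular Image Segmentation via Large Language Model Agents | GenCellAgent: generalizable, training-free cellular image segmentation based on LLM agents | large language model |
5 | From Narratives to Probabilistic Reasoning: Predicting and Interpreting Drivers' Hazardous Actions in Crashes Using Large Language Model | Uses large language models to predict and interpret drivers' hazardous actions from crash narratives | large language model |
6 | Developing and Validating the Arabic Version of the Attitudes Toward Large Language Models Scale | Develops and validates an Arabic version of the Attitudes Toward Large Language Models scale, filling a gap in research on LLM perceptions in non-Western cultural contexts | large language model |
7 | Beyond Postconditions: Can Large Language Models infer Formal Contracts for Automatic Software Verification? | Proposes NL2Contract, which uses LLMs to infer formal contracts and improve automatic software verification | large language model |
8 | A Survey of Vibe Coding with Large Language Models | A comprehensive survey of the LLM-based "Vibe Coding" paradigm, covering its challenges and opportunities | large language model |
9 | Evaluating the Quality of Randomness and Entropy in Tasks Supported by Large Language Models | Evaluates the quality of randomness and the entropy that LLMs achieve in randomness-dependent tasks | large language model |
10 | HiCoTraj: Zero-Shot Demographic Reasoning via Hierarchical Chain-of-Thought Prompting from Trajectory | HiCoTraj: zero-shot demographic reasoning via hierarchical chain-of-thought prompting over trajectories | chain-of-thought |
11 | Benefits and Limitations of Communication in Multi-Agent Reasoning | Proposes a theoretical framework for multi-agent reasoning and analyzes the benefits and limitations of communication for solving complex tasks | large language model, chain-of-thought |
12 | Toward Reasoning-Centric Time-Series Analysis | Proposes a reasoning-centric approach to time-series analysis that uses LLMs to improve interpretability in complex scenarios | large language model, multimodal |
13 | Artificial Intelligence Virtual Cells: From Measurements to Decisions across Modality, Scale, Dynamics, and Evaluation | Proposes an AI virtual cell framework based on Cell-State Latent, improving cell-state modeling across modalities, scales, and interventions | foundation model, multimodal |
14 | RAG-Anything: All-in-One RAG Framework | Proposes the unified RAG-Anything framework for comprehensive retrieval and augmented generation over cross-modal knowledge | large language model, multimodal |
15 | EmboMatrix: A Scalable Training-Ground for Embodied Decision-Making | Proposes EmboMatrix, a scalable training platform for embodied decision-making that improves LLMs' understanding of the physical world | large language model |
16 | AI Agents as Universal Task Solvers | Treats AI agents as universal task solvers, focusing on the key role of time in learning to reason | chain-of-thought |
17 | Deliberate Lab: A Platform for Real-Time Human-AI Social Experiments | Deliberate Lab: an open-source platform for real-time human-AI social experiments, with support for large-scale LLM agents | large language model |
18 | Development and Benchmarking of a Blended Human-AI Qualitative Research Assistant | Develops and benchmarks Muse, a blended human-AI qualitative research assistant that improves the efficiency and consistency of qualitative research | large language model |
19 | SENTINEL: A Multi-Level Formal Framework for Safety Evaluation of LLM-based Embodied Agents | SENTINEL: a multi-level formal framework for evaluating the safety of LLM-based embodied agents | large language model |
20 | InferA: A Smart Assistant for Cosmological Ensemble Data | Proposes InferA, which uses a multi-agent system to assist the analysis of large-scale cosmological simulation data | large language model |
21 | KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems | KVCOMM: efficient online cross-context KV-cache communication for LLM-based multi-agent systems | large language model |
22 | Ax-Prover: A Deep Reasoning Agentic Framework for Theorem Proving in Mathematics and Quantum Physics | Ax-Prover: a deep-reasoning agentic framework for theorem proving in mathematics and quantum physics | large language model |
23 | Multi-Agent Debate for LLM Judges with Adaptive Stability Detection | Proposes an LLM judging framework based on multi-agent debate, improving judgment accuracy and efficiency | large language model |
24 | Adaptive Generation of Bias-Eliciting Questions for LLMs | Proposes CAB, an adaptive framework for generating bias-eliciting questions to evaluate bias in large language models | large language model |
25 | MTOS: A LLM-Driven Multi-topic Opinion Simulation Framework for Exploring Echo Chamber Dynamics | Proposes the MTOS framework, which uses LLMs to simulate multi-topic opinion evolution and explore echo chamber effects | large language model |
26 | (R)evolution of Programming: Vibe Coding as a Post-Coding Paradigm | Explores Vibe Coding, an emotion-driven post-coding paradigm that reshapes how developers interact with AI | large language model |
27 | PromptLocate: Localizing Prompt Injection Attacks | PromptLocate: the first method for localizing prompt injection attacks | large language model |
28 | GOAT: A Training Framework for Goal-Oriented Agent with Tools | GOAT: a framework for training goal-oriented agents with tool-use capabilities | large language model |
29 | ThinkPilot: Steering Reasoning Models via Automated Think-prefixes Optimization | ThinkPilot: steers reasoning models by automatically optimizing think-prefixes | instruction following |
30 | Empowering LLM Agents with Geospatial Awareness: Toward Grounded Reasoning for Wildfire Response | Proposes the GAL framework, which gives LLMs geospatial awareness for situational reasoning in wildfire response | large language model |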

🔬 Pillar 2: RL & Architecture (10 papers)

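# | Title | One-line summary | Tags | 🔗
31 | Precise Attribute Intensity Control in Large Language Models via Targeted Representation Editing | Proposes a representation-editing method for precise attribute intensity control, enabling fine-grained control over the attributes of LLM-generated content | distillation, large language model |
32 | $\mathbf{T^3}$: Reducing Belief Deviation in Reinforcement Learning for Active Reasoning | Proposes the T^3 method, which improves LLM performance in reinforcement learning for active reasoning by reducing belief deviation | reinforcement learning, large language model |
33 | Biased-Attention Guided Risk Prediction for Safe Decision-Making at Unsignalized Intersections | Proposes a biased-attention-based risk prediction method that improves the safe decision-making of autonomous vehicles at unsignalized intersections | reinforcement learning, deep reinforcement learning, DRL |
34 | One Life to Learn: Inferring Symbolic World Models for Stochastic Environments from Unguided Exploration | OneLife framework: infers symbolic world models of stochastic environments from unguided exploration | world model |
35 | DeepPlanner: Scaling Planning Capability for Deep Research Agents via Advantage Shaping | DeepPlanner: improves the planning capability of deep research agents via advantage shaping | reinforcement learning, large language model |
36 | From Delegates to Trustees: How Optimizing for Long-Term Interests Shapes Bias and Alignment in LLM | Explores the delegate-versus-trustee trade-off in LLMs and how optimizing for long-term interests shapes bias and alignment | behavior cloning, large language model |
37 | Memory as Action: Autonomous Context Curation for Long-Horizon Agentic Tasks | Proposes MemAct, which uses reinforcement learning to manage LLM context autonomously and improve performance on long-horizon tasks | reinforcement learning, large language model |
38 | Human-in-the-Loop Bandwidth Estimation for Quality of Experience Optimization in Real-Time Video Communication | Proposes a human-in-the-loop bandwidth estimation method to optimize user experience in real-time video communication | reinforcement learning, offline RL, offline reinforcement learning |
39 | PromptFlow: Training Prompts Like Neural Networks | PromptFlow: a modular prompt training framework inspired by TensorFlow that improves LLM adaptation to specific domains | reinforcement learning, large language model |
40 | Repairing Reward Functions with Feedback to Mitigate Reward Hacking | Proposes a preference-based reward repair method to mitigate reward function loopholes in reinforcement learning | reinforcement learning, preference learning |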

🔬 Pillar 1: Robot Control (2 papers)

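# | Title | One-line summary | Tags | 🔗
41 | ERA: Transforming VLMs into Embodied Agents via Embodied Prior Learning and Online Reinforcement Learning | ERA: turns vision-language models into embodied agents via embodied prior learning and online reinforcement learning | manipulation, reinforcement learning, reward shaping |
42 | MCP Security Bench (MSB): Benchmarking Attacks Against Model Context Protocol in LLM Agents | Proposes the MCP Security Bench (MSB), which systematically evaluates the security risks facing the Model Context Protocol (MCP) in LLM agents | manipulation, large language model, instruction following |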

🔬 Pillar 5: Interaction & Reaction (1 paper)

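# | Title | One-line summary | Tags | 🔗
43 | O-Forge: An LLM + Computer Algebra Framework for Asymptotic Analysis | O-Forge: combines LLMs with a computer algebra system to tackle hard problems in asymptotic analysis | IMoS, large language model |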

🔬 Pillar 8: Physics-based Animation (1 paper)

# | Title | One-line summary | Tags | 🔗
44 | BeSTAD: Behavior-Aware Spatio-Temporal Anomaly Detection for Human Mobility Data | BeSTAD: behavior-aware spatio-temporal anomaly detection for analyzing human mobility data | spatiotemporal |
