cs.AI (2025-10-14)

📊 44 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (30 🔗2) · Pillar 2: RL & Architecture (10 🔗2) · Pillar 1: Robot Control (2) · Pillar 5: Interaction & Reaction (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (30 papers)

#	Title	One-Sentence Summary	Tags	🔗
1	From Literal to Liberal: A Meta-Prompting Framework for Eliciting Human-Aligned Exception Handling in Large Language Models	Proposes the RID framework, which uses meta-prompting to better align LLM exception handling with human intent	large language model instruction following chain-of-thought
2	Evolution of meta's llama models and parameter-efficient fine-tuning of large language models: a survey	Surveys the evolution of Meta's LLaMA models and parameter-efficient fine-tuning methods, offering a one-stop resource for LLM researchers	large language model foundation model multimodal
3	MatSciBench: Benchmarking the Reasoning Ability of Large Language Models in Materials Science	MatSciBench: a benchmark for evaluating the reasoning ability of LLMs in materials science	large language model multimodal chain-of-thought
4	GenCellAgent: Generalizable, Training-Free Cellular Image Segmentation via Large Language Model Agents	GenCellAgent: generalizable, training-free cellular image segmentation with LLM agents	large language model
5	From Narratives to Probabilistic Reasoning: Predicting and Interpreting Drivers' Hazardous Actions in Crashes Using Large Language Model	Uses large language models to predict and interpret drivers' hazardous actions from crash narratives	large language model
6	Developing and Validating the Arabic Version of the Attitudes Toward Large Language Models Scale	Develops and validates an Arabic version of the Attitudes Toward Large Language Models scale, filling a gap in research on LLM perceptions in non-Western cultural contexts	large language model
7	Beyond Postconditions: Can Large Language Models infer Formal Contracts for Automatic Software Verification?	Proposes NL2Contract, which uses LLMs to infer formal contracts and improve automatic software verification	large language model
8	A Survey of Vibe Coding with Large Language Models	A comprehensive survey of the LLM-based "Vibe Coding" paradigm, covering its challenges and opportunities	large language model
9	Evaluating the Quality of Randomness and Entropy in Tasks Supported by Large Language Models	Evaluates the quality of randomness and the entropy of LLM outputs on randomness-dependent tasks	large language model
10	HiCoTraj: Zero-Shot Demographic Reasoning via Hierarchical Chain-of-Thought Prompting from Trajectory	HiCoTraj: zero-shot demographic reasoning from trajectories via hierarchical chain-of-thought prompting	chain-of-thought
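11	Benefits and Limitations of Communication in Multi-Agent Reasoning	Proposes a theoretical framework for multi-agent reasoning and analyzes the benefits and limitations of communication for solving complex tasks	large language model chain-of-thought
12	Toward Reasoning-Centric Time-Series Analysis	Proposes a reasoning-centric approach to time-series analysis, using LLMs to improve interpretability in complex scenarios	large language model multimodal
13	Artificial Intelligence Virtual Cells: From Measurements to Decisions across Modality, Scale, Dynamics, and Evaluation	Proposes an AI virtual cell framework based on Cell-State Latent to improve cell-state modeling across modalities, scales, and interventions	foundation model multimodal
14	RAG-Anything: All-in-One RAG Framework	Proposes RAG-Anything, a unified framework for comprehensive retrieval and augmented generation over cross-modal knowledge	large language model multimodal	✅
15	EmboMatrix: A Scalable Training-Ground for Embodied Decision-Making	Proposes EmboMatrix, a scalable training platform for embodied decision-making that improves LLMs' understanding of the physical world	large language model
16	AI Agents as Universal Task Solvers	Treats AI agents as universal task solvers, focusing on the key role of time in learning to reason	chain-of-thought
17	Deliberate Lab: A Platform for Real-Time Human-AI Social Experiments	Deliberate Lab: an open-source platform for real-time human-AI social experiments, with support for large-scale LLM agents	large language model
18	Development and Benchmarking of a Blended Human-AI Qualitative Research Assistant	Develops and benchmarks Muse, a blended human-AI qualitative research assistant that improves the efficiency and consistency of qualitative research	large language model
19	SENTINEL: A Multi-Level Formal Framework for Safety Evaluation of LLM-based Embodied Agents	SENTINEL: a multi-level formal framework for evaluating the safety of LLM-based embodied agents	large language model
20	InferA: A Smart Assistant for Cosmological Ensemble Data	Proposes InferA, which uses a multi-agent system to assist in analyzing large-scale cosmological simulation data	large language model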
21	KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems	KVCOMM: efficient online cross-context KV-cache communication for LLM-based multi-agent systems	large language model
22	Ax-Prover: A Deep Reasoning Agentic Framework for Theorem Proving in Mathematics and Quantum Physics	Ax-Prover: a deep-reasoning agentic framework for theorem proving in mathematics and quantum physics	large language model
23	Multi-Agent Debate for LLM Judges with Adaptive Stability Detection	Proposes a multi-agent debate framework for LLM judges that improves judgment accuracy and efficiency	large language model
24	Adaptive Generation of Bias-Eliciting Questions for LLMs	Proposes CAB, an adaptive framework for generating bias-eliciting questions to evaluate bias in large language models	large language model
25	MTOS: A LLM-Driven Multi-topic Opinion Simulation Framework for Exploring Echo Chamber Dynamics	Proposes the MTOS framework, which uses LLMs to simulate multi-topic opinion evolution and explore echo chamber effects	large language model
26	(R)evolution of Programming: Vibe Coding as a Post-Coding Paradigm	Explores Vibe Coding, an emotion-driven post-coding paradigm that reshapes how developers interact with AI	large language model
27	PromptLocate: Localizing Prompt Injection Attacks	PromptLocate: the first method for localizing prompt injection attacks	large language model
28	GOAT: A Training Framework for Goal-Oriented Agent with Tools	GOAT: a framework for training goal-oriented agents with tool-use capabilities	large language model
29	ThinkPilot: Steering Reasoning Models via Automated Think-prefixes Optimization	ThinkPilot: steers reasoning models by automatically optimizing think-prefixes	instruction following	✅
30	Empowering LLM Agents with Geospatial Awareness: Toward Grounded Reasoning for Wildfire Response	Proposes the GAL framework, which gives LLMs geospatial awareness for situational reasoning in wildfire response	large language model

🔬 Pillar 2: RL & Architecture (10 papers)

#	Title	One-Sentence Summary	Tags	🔗
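31	Precise Attribute Intensity Control in Large Language Models via Targeted Representation Editing	Proposes a representation-editing method for precise attribute intensity control, enabling fine-grained regulation of attributes in LLM-generated content	distillation large language model	✅
32	$\mathbf{T^3}$: Reducing Belief Deviation in Reinforcement Learning for Active Reasoning	Proposes T^3, which reduces belief deviation to improve LLM performance in reinforcement learning for active reasoning	reinforcement learning large language model
33	Biased-Attention Guided Risk Prediction for Safe Decision-Making at Unsignalized Intersections	Proposes a biased-attention-based risk prediction method to improve safe decision-making for autonomous vehicles at unsignalized intersections	reinforcement learning deep reinforcement learning DRL	✅
34	One Life to Learn: Inferring Symbolic World Models for Stochastic Environments from Unguided Exploration	OneLife framework: infers symbolic world models of stochastic environments from unguided exploration	world model
35	DeepPlanner: Scaling Planning Capability for Deep Research Agents via Advantage Shaping	DeepPlanner: improves the planning capability of deep research agents via advantage shaping	reinforcement learning large language model
36	From Delegates to Trustees: How Optimizing for Long-Term Interests Shapes Bias and Alignment in LLM	Explores the trade-off between delegate and trustee models in LLMs, and how optimizing for long-term interests shapes bias and alignment	behavior cloning large language model
37	Memory as Action: Autonomous Context Curation for Long-Horizon Agentic Tasks	Proposes MemAct, which uses reinforcement learning to autonomously manage LLM context and improve performance on long-horizon tasks	reinforcement learning large language model
38	Human-in-the-Loop Bandwidth Estimation for Quality of Experience Optimization in Real-Time Video Communication	Proposes a human-in-the-loop bandwidth estimation method to optimize user experience in real-time video communication	reinforcement learning offline RL offline reinforcement learning
39	PromptFlow: Training Prompts Like Neural Networks	PromptFlow: a TensorFlow-based modular prompt-training framework that improves LLM adaptation to specific domains	reinforcement learning large language model
40	Repairing Reward Functions with Feedback to Mitigate Reward Hacking	Proposes a preference-based reward repair method to mitigate exploitable flaws in reward functions in reinforcement learning	reinforcement learning preference learning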

🔬 Pillar 1: Robot Control (2 papers)

#	Title	One-Sentence Summary	Tags	🔗	⭐
41	ERA: Transforming VLMs into Embodied Agents via Embodied Prior Learning and Online Reinforcement Learning	ERA: turns vision-language models into embodied agents via embodied prior learning and online reinforcement learning	manipulation reinforcement learning reward shaping
42	MCP Security Bench (MSB): Benchmarking Attacks Against Model Context Protocol in LLM Agents	Proposes MCP Security Bench (MSB), which systematically evaluates the security risks facing the Model Context Protocol (MCP) in LLM agents	manipulation large language model instruction following

🔬 Pillar 5: Interaction & Reaction (1 paper)

#	Title	One-Sentence Summary	Tags	🔗	⭐
43	O-Forge: An LLM + Computer Algebra Framework for Asymptotic Analysis	O-Forge: combines LLMs with a computer algebra system to tackle hard asymptotic analysis problems	IMoS large language model

🔬 Pillar 8: Physics-based Animation (1 paper)

#	Title	One-Sentence Summary	Tags	🔗	⭐
44	BeSTAD: Behavior-Aware Spatio-Temporal Anomaly Detection for Human Mobility Data	BeSTAD: behavior-aware spatio-temporal anomaly detection for analyzing human mobility data	spatiotemporal
