cs.AI (2025-10-31)

📊 34 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (23 🔗3) | Pillar 2: RL & Architecture (8 🔗1) | Pillar 3: Perception & Semantics (1) | Pillar 5: Interaction & Reaction (1) | Pillar 1: Robot Control (1)

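🔬 Pillar 9: Embodied Foundation Models (23 papers)

#	Title	One-line summary	Tags	🔗
1	FMint-SDE: A Multimodal Foundation Model for Accelerating Numerical Simulation of SDEs via Error Correction	FMint-SDE: a multimodal foundation model that accelerates numerical simulation of stochastic differential equations via error correction.	foundation model multimodal
2	What a diff makes: automating code migration with large language models	Automates code migration using large language models and code diffs, improving software compatibility.	large language model
3	Best Practices for Biorisk Evaluations on Open-Weight Bio-Foundation Models	Proposes the BioRiskEval framework to evaluate the potential biorisk of open-weight bio-foundation models and the effectiveness of data filtering.	foundation model
4	Adapting Large Language Models to Emerging Cybersecurity using Retrieval Augmented Generation	Proposes a RAG-based framework to improve the adaptability and reliability of LLMs in cybersecurity.	large language model
5	ConnectomeBench: Can LLMs Proofread the Connectome?	ConnectomeBench: evaluates how well LLMs can proofread neural connectomes, exploring new avenues for AI-assisted neuroscience.	large language model multimodal	✅
6	Validity Is What You Need	Argues that the key to deploying agentic AI applications is validity verification rather than over-reliance on large language models.	large language model foundation model
7	CodeAlignBench: Assessing Code Generation Models on Developer-Preferred Code Adjustments	CodeAlignBench: evaluates code generation models on developer-preferred code adjustments.	large language model instruction following
8	ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use	Proposes the ToolScope framework to address tool use by multimodal LLMs in long-horizon visual question answering.	large language model multimodal
9	LongCat-Flash-Omni Technical Report	Meituan presents LongCat-Flash-Omni, an open-source omni-modal model with 560 billion parameters for real-time audio-visual interaction.	multimodal
10	Scalable Processing-Near-Memory for 1M-Token LLM Inference: CXL-Enabled KV-Cache Management Beyond GPU Limits	Proposes a CXL-based PNM architecture that accelerates 1M-token LLM inference beyond GPU memory limits.	large language model
11	Advancing Cognitive Science with LLMs	Uses large language models (LLMs) to advance knowledge integration and theory formalization in cognitive science.	large language model
12	Understanding Code Agent Behaviour: An Empirical Study of Success and Failure Trajectories	Analyzes code agent trajectories to understand agent behavior and reveal patterns of success and failure.	large language model
13	Simulating Misinformation Vulnerabilities With Agent Personas	Uses agent personas to simulate vulnerability to misinformation and assess how different groups respond to false information.	large language model
14	VeriMoA: A Mixture-of-Agents Framework for Spec-to-HDL Generation	Proposes the VeriMoA framework to address noise propagation and limited reasoning space in HDL generation.	large language model
15	Mechanics of Learned Reasoning 1: TempoBench, A Benchmark for Interpretable Deconstruction of Reasoning System Performance	TempoBench: a benchmark for interpretable deconstruction of reasoning system performance.	large language model	✅
16	GeoFM: Enhancing Geometric Reasoning of MLLMs via Synthetic Data Generation through Formal Language	GeoFM: generates synthetic data through formal language to improve the geometric reasoning of multimodal large language models.	large language model
17	Thinking Like a Student: AI-Supported Reflective Planning in a Theory-Intensive Computer Science Course	Uses LLMs to simulate the student perspective and improve instructional design for a theory-intensive computer science course.	large language model
18	An In-depth Study of LLM Contributions to the Bin Packing Problem	An in-depth study of LLM contributions to the bin packing problem, analyzing their effectiveness and limitations.	large language model
19	Inferring multiple helper Dafny assertions with LLMs	Proposes DAISY, which uses LLMs to automatically infer multiple missing helper assertions in Dafny programs, improving formal verification efficiency.	large language model
20	Vintage Code, Modern Judges: Meta-Validation in Low Data Regimes	Proposes the SparseAlign framework for meta-validation of LLM-as-a-Judge (LaaJ) in low-data regimes.	large language model
21	Fints: Efficient Inference-Time Personalization for LLMs with Fine-Grained Instance-Tailored Steering	Fints: efficient inference-time personalization for LLMs through fine-grained, instance-tailored steering.	large language model	✅
22	Glia: A Human-Inspired AI for Automated Systems Design and Optimization	Glia: a human-inspired AI for automated systems design and optimization.	large language model
23	Expressive Range Characterization of Open Text-to-Audio Models	Proposes an ERA-based framework for evaluating the expressive range of open text-to-audio models.	multimodal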

🔬 Pillar 2: RL & Architecture (8 papers)

#	Title	One-line summary	Tags	🔗
24	GUI-Rise: Structured Reasoning and History Summarization for GUI Navigation	GUI-Rise: a GUI navigation framework that combines structured reasoning with history summarization, improving cross-domain generalization.	reinforcement learning large language model multimodal	✅
25	Visual Backdoor Attacks on MLLM Embodied Decision Making via Contrastive Trigger Learning	Proposes the BEAT framework, which uses contrastive trigger learning to mount visual backdoor attacks on MLLM embodied agents.	preference learning large language model multimodal
26	LLM2IR: simple unsupervised contrastive learning makes long-context LLM great retriever	LLM2IR: simple unsupervised contrastive learning turns long-context LLMs into strong retrievers.	contrastive learning large language model
27	InnovatorBench: Evaluating Agents' Ability to Conduct Innovative LLM Research	Proposes the InnovatorBench benchmark to evaluate AI agents' ability to innovate in LLM research.	reward design large language model
28	DeepCompress: A Dual Reward Strategy for Dynamically Exploring and Compressing Reasoning Chains	Proposes DeepCompress to address the efficiency and accuracy of large reasoning models.	reinforcement learning chain-of-thought
29	Closing the Expression Gap in LLM Instructions via Socratic Questioning	Proposes Nous, which closes the expression gap in LLM instructions through Socratic questioning.	reward design instruction following
30	Reinforcement Learning for Long-Horizon Unordered Tasks: From Boolean to Coupled Reward Machines	Proposes coupled reward machines (CoRM) for reinforcement learning on long-horizon unordered tasks.	reinforcement learning
31	Machine learning-based cloud resource allocation algorithms: a comprehensive comparative review	A review of machine learning-based cloud resource allocation algorithms aimed at improving performance and efficiency.	reinforcement learning deep reinforcement learning

🔬 Pillar 3: Perception & Semantics (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
32	Reconstructing Unseen Sentences from Speech-related Biosignals for Open-vocabulary Neural Communication	Proposes a method that synthesizes speech from EEG and EMG signals, enabling open-vocabulary neural communication.	open-vocabulary open vocabulary

🔬 Pillar 5: Interaction & Reaction (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
33	CombiGraph-Vis: A Curated Multimodal Olympiad Benchmark for Discrete Mathematical Reasoning	Proposes an agentic-workflow framework for grading math olympiad proofs, improving grading consistency.	IMoS multimodal

🔬 Pillar 1: Robot Control (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
34	Interaction as Intelligence Part II: Asynchronous Human-Agent Rollout for Long-Horizon Task Training	Proposes the Apollo framework, which improves LLM agent training on long-horizon tasks through asynchronous human-agent interaction.	manipulation behavior cloning large language model
