cs.AI (2025-10-31)

📊 34 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (23 🔗3) · Pillar 2: RL & Architecture (8 🔗1) · Pillar 3: Perception & Semantics (1) · Pillar 5: Interaction & Reaction (1) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (23 papers)

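| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | FMint-SDE: A Multimodal Foundation Model for Accelerating Numerical Simulation of SDEs via Error Correction | FMint-SDE: a multimodal foundation model that accelerates numerical simulation of stochastic differential equations via error correction. | foundation model, multimodal | |
| 2 | What a diff makes: automating code migration with large language models | Automates code migration using large language models and code diffs, improving software compatibility. | large language model | |
| 3 | Best Practices for Biorisk Evaluations on Open-Weight Bio-Foundation Models | Proposes the BioRiskEval framework to assess the potential biorisk of open-weight bio-foundation models and the effectiveness of data filtering. | foundation model | |
| 4 | Adapting Large Language Models to Emerging Cybersecurity using Retrieval Augmented Generation | Proposes a RAG-based framework that improves the adaptability and reliability of LLMs in cybersecurity. | large language model | |
| 5 | ConnectomeBench: Can LLMs Proofread the Connectome? | ConnectomeBench: evaluates LLMs' ability to proofread the connectome, exploring new avenues for AI-assisted neuroscience. | large language model, multimodal | |
| 6 | Validity Is What You Need | Argues that the key to deploying agentic AI applications is validity verification rather than over-reliance on large language models. | large language model, foundation model | |
| 7 | CodeAlignBench: Assessing Code Generation Models on Developer-Preferred Code Adjustments | CodeAlignBench: evaluates how well code generation models perform developer-preferred code adjustments. | large language model, instruction following | |
| 8 | ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use | Proposes the ToolScope framework to address the difficulty multimodal LLMs have with tool use in long-horizon visual question answering. | large language model, multimodal | |
| 9 | LongCat-Flash-Omni Technical Report | Meituan presents LongCat-Flash-Omni, an open-source omni-modal model with 560 billion parameters for real-time audio-visual interaction. | multimodal | |
| 10 | Scalable Processing-Near-Memory for 1M-Token LLM Inference: CXL-Enabled KV-Cache Management Beyond GPU Limits | Proposes a CXL-based processing-near-memory (PNM) architecture that accelerates 1M-token LLM inference beyond GPU memory limits. | large language model | |
| 11 | Advancing Cognitive Science with LLMs | Uses large language models (LLMs) to advance knowledge integration and theory formalization in cognitive science. | large language model | |
| 12 | Understanding Code Agent Behaviour: An Empirical Study of Success and Failure Trajectories | Analyzes code agent trajectories to understand agent behavior, revealing patterns of success and failure. | large language model | |
| 13 | Simulating Misinformation Vulnerabilities With Agent Personas | Uses agent personas to simulate vulnerability to misinformation and assess how different groups respond to false information. | large language model | |
| 14 | VeriMoA: A Mixture-of-Agents Framework for Spec-to-HDL Generation | Proposes the VeriMoA framework to address noise propagation and limited reasoning space in HDL generation. | large language model | |
| 15 | Mechanics of Learned Reasoning 1: TempoBench, A Benchmark for Interpretable Deconstruction of Reasoning System Performance | TempoBench: a benchmark for interpretably deconstructing the performance of reasoning systems. | large language model | |
| 16 | GeoFM: Enhancing Geometric Reasoning of MLLMs via Synthetic Data Generation through Formal Language | GeoFM: improves the geometric reasoning of multimodal large language models by generating synthetic data through a formal language. | large language model | |
| 17 | Thinking Like a Student: AI-Supported Reflective Planning in a Theory-Intensive Computer Science Course | Uses LLMs to simulate the student perspective, improving instructional design for a theory-intensive computer science course. | large language model | |
| 18 | An In-depth Study of LLM Contributions to the Bin Packing Problem | An in-depth study of LLM contributions to the bin packing problem, analyzing their effectiveness and limitations. | large language model | |
| 19 | Inferring multiple helper Dafny assertions with LLMs | Proposes DAISY, which uses LLMs to automatically infer multiple missing helper assertions in Dafny programs, improving the efficiency of formal verification. | large language model | |
| 20 | Vintage Code, Modern Judges: Meta-Validation in Low Data Regimes | Proposes the SparseAlign framework for meta-validation of LaaJ (LLM-as-a-Judge) in low-data regimes. | large language model | |
| 21 | Fints: Efficient Inference-Time Personalization for LLMs with Fine-Grained Instance-Tailored Steering | Fints: efficient inference-time personalization of LLMs through fine-grained, instance-tailored steering. | large language model | |
| 22 | Glia: A Human-Inspired AI for Automated Systems Design and Optimization | Glia: a human-inspired AI for automated systems design and optimization. | large language model | |
| 23 | Expressive Range Characterization of Open Text-to-Audio Models | Proposes an ERA-based framework for evaluating the expressive range of open text-to-audio models. | multimodal | |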

🔬 Pillar 2: RL & Architecture (8 papers)

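| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 24 | GUI-Rise: Structured Reasoning and History Summarization for GUI Navigation | GUI-Rise: a GUI navigation framework that combines structured reasoning with history summarization to improve cross-domain generalization. | reinforcement learning, large language model, multimodal | |
| 25 | Visual Backdoor Attacks on MLLM Embodied Decision Making via Contrastive Trigger Learning | Proposes the BEAT framework, which uses contrastive trigger learning to mount visual backdoor attacks on MLLM-based embodied agents. | preference learning, large language model, multimodal | |
| 26 | LLM2IR: simple unsupervised contrastive learning makes long-context LLM great retriever | LLM2IR: simple unsupervised contrastive learning turns long-context LLMs into strong retrievers. | contrastive learning, large language model | |
| 27 | InnovatorBench: Evaluating Agents' Ability to Conduct Innovative LLM Research | Proposes the InnovatorBench benchmark for evaluating AI agents' ability to conduct innovative LLM research. | reward design, large language model | |
| 28 | DeepCompress: A Dual Reward Strategy for Dynamically Exploring and Compressing Reasoning Chains | Proposes DeepCompress to address the efficiency and accuracy of large reasoning models. | reinforcement learning, chain-of-thought | |
| 29 | Closing the Expression Gap in LLM Instructions via Socratic Questioning | Proposes Nous, which closes the expression gap in LLM instructions through Socratic questioning. | reward design, instruction following | |
| 30 | Reinforcement Learning for Long-Horizon Unordered Tasks: From Boolean to Coupled Reward Machines | Proposes coupled reward machines (CoRM) for reinforcement learning on long-horizon unordered tasks. | reinforcement learning | |
| 31 | Machine learning-based cloud resource allocation algorithms: a comprehensive comparative review | A review of machine-learning-based cloud resource allocation algorithms aimed at improving performance and efficiency. | reinforcement learning, deep reinforcement learning | |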

🔬 Pillar 3: Perception & Semantics (1 paper)

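| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 32 | Reconstructing Unseen Sentences from Speech-related Biosignals for Open-vocabulary Neural Communication | Proposes a speech synthesis method based on EEG and EMG signals, enabling open-vocabulary neural communication. | open-vocabulary, open vocabulary | |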

🔬 Pillar 5: Interaction & Reaction (1 paper)

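| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 33 | CombiGraph-Vis: A Curated Multimodal Olympiad Benchmark for Discrete Mathematical Reasoning | Proposes an agentic-workflow framework for grading mathematical olympiad proofs, improving grading consistency. | IMoS, multimodal | |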

🔬 Pillar 1: Robot Control (1 paper)

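| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 34 | Interaction as Intelligence Part II: Asynchronous Human-Agent Rollout for Long-Horizon Task Training | Proposes the Apollo framework, which improves the training of LLM agents on long-horizon tasks through asynchronous human-agent interaction. | manipulation, behavior cloning, large language model | |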
