cs.AI (2025-10-13)

📊 26 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (16 🔗3) · Pillar 2: RL & Architecture (6 🔗2) · Pillar 1: Robot Control (2) · Pillar 4: Generative Motion (2)

🔬 Pillar 9: Embodied Foundation Models (16 papers)

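| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Countermind: A Multi-Layered Security Architecture for Large Language Models | Countermind: a multi-layered security architecture for large language models, designed to defend against attacks such as prompt injection | large language model, multimodal | |
| 2 | Asking Clarifying Questions for Preference Elicitation With Large Language Models | Proposes a diffusion-model-based method for generating clarifying questions, improving LLM preference elicitation | large language model | |
| 3 | Beyond touch-based HMI: Control your machines in natural language by utilizing large language models and OPC UA | Proposes a natural-language human-machine interaction method based on LLMs and OPC UA, making industrial control more convenient | large language model | |
| 4 | Automating Structural Engineering Workflows with Large Language Model Agents | MASSE: an LLM-agent-based system for automating structural engineering workflows | large language model | |
| 5 | Diffusion-Link: Diffusion Probabilistic Model for Bridging the Audio-Text Modality Gap | Proposes Diffusion-Link, which bridges the audio-text modality gap with a diffusion model and improves automatic audio captioning | large language model, multimodal | |
| 6 | Analyzing and Internalizing Complex Policy Documents for LLM Agents | Proposes CAP-CPT, which uses category-aware continued pretraining to improve LLM agents' reasoning over complex policy documents | large language model, chain-of-thought | |
| 7 | Improving AI Efficiency in Data Centres by Power Dynamic Response | Proposes a dynamic power response method to improve the energy efficiency and sustainability of AI data centres | large language model, foundation model | |
| 8 | CTIArena: Benchmarking LLM Knowledge and Reasoning Across Heterogeneous Cyber Threat Intelligence | CTIArena: a knowledge-augmented benchmark for evaluating LLMs on cyber threat intelligence | large language model | |
| 9 | Beyond Consensus: Mitigating the Agreeableness Bias in LLM Judge Evaluations | Proposes minority-veto and regression-based methods to mitigate agreeableness bias in LLM judges, improving code evaluation accuracy | large language model | |
| 10 | ParaCook: On Time-Efficient Planning for Multi-Agent Systems | ParaCook: a benchmark for time-efficient planning in multi-agent systems | large language model | |
| 11 | Zero Data Retention in LLM-based Enterprise AI Assistants: A Comparative Study of Market Leading Agentic AI Products | A comparative study of the zero-data-retention policies of Salesforce's and Microsoft's enterprise AI assistants | large language model | |
| 12 | Audio-Maestro: Enhancing Large Audio-Language Models with Tool-Augmented Reasoning | Audio-Maestro: tool-augmented reasoning improves the performance of large audio-language models | multimodal | |
| 13 | Automated Skill Decomposition Meets Expert Ontologies: Bridging the Granularity Gap with LLMs | Proposes an LLM-based framework for automated skill decomposition that bridges the gap between skill granularity and expert ontologies | large language model | |
| 14 | PADME: Procedure Aware DynaMic Execution | PADME: a procedure-aware dynamic execution framework that improves LLM reliability on long-procedure tasks | large language model | |
| 15 | ProofFlow: A Dependency Graph Approach to Faithful Proof Autoformalization | Proposes ProofFlow, which uses dependency graphs to improve the semantic faithfulness of proof autoformalization | large language model | |
| 16 | Gelina: Unified Speech and Gesture Synthesis via Interleaved Token Prediction | Gelina: a unified speech and gesture synthesis framework based on interleaved token prediction | multimodal | |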

🔬 Pillar 2: RL & Architecture (6 papers)

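| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 17 | A Flexible Multi-Agent Deep Reinforcement Learning Framework for Dynamic Routing and Scheduling of Latency-Critical Services | Proposes a flexible multi-agent deep reinforcement learning framework for dynamic routing and scheduling of latency-critical services | reinforcement learning, deep reinforcement learning, DRL | |
| 18 | Empirical Study on Robustness and Resilience in Cooperative Multi-Agent Reinforcement Learning | A large-scale empirical study that identifies the key factors affecting robustness and resilience in cooperative multi-agent reinforcement learning | reinforcement learning | |
| 19 | Automatic Music Sample Identification with Multi-Track Contrastive Learning | Proposes an automatic music sample identification method based on multi-track contrastive learning, significantly improving sample detection performance | contrastive learning | |
| 20 | SR-Scientist: Scientific Equation Discovery With Agentic AI | SR-Scientist: uses agentic AI for scientific equation discovery, improving the efficiency and accuracy of equation search | reinforcement learning, large language model | |
| 21 | Evolution in Simulation: AI-Agent School with Dual Memory for High-Fidelity Educational Dynamics | Proposes the AI-Agent School system, which uses a dual-memory self-evolution mechanism to simulate high-fidelity educational dynamics | teacher-student, large language model | |
| 22 | Aligning Deep Implicit Preferences by Learning to Reason Defensively | Proposes the CDRA framework, which aligns deep implicit preferences through defensive reasoning and improves LLM-user interaction | reinforcement learning, large language model | |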

🔬 Pillar 1: Robot Control (2 papers)

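| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 23 | TabVLA: Targeted Backdoor Attacks on Vision-Language-Action Models | TabVLA: targeted backdoor attacks on vision-language-action models | manipulation, embodied AI, vision-language-action | |
| 24 | Large Language Models Are Effective Code Watermarkers | Proposes CodeMark-LLM, which uses large language models for efficient code watermark embedding and extraction | manipulation, large language model | |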

🔬 Pillar 4: Generative Motion (2 papers)

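| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 25 | BlackIce: A Containerized Red Teaming Toolkit for AI Security Testing | BlackIce: a containerized red-teaming toolkit for AI security testing that lowers the barrier to AI red teaming | penetration, large language model | |
| 26 | PACEbench: A Framework for Evaluating Practical AI Cyber-Exploitation Capabilities | PACEbench: a practical benchmark framework and agent for evaluating AI cyber-exploitation capabilities | penetration, large language model | |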
