cs.AI (2026-02-28)

📊 70 papers in total

🎯 Interest area navigation

Pillar 9: Embodied Foundation Models (53) · Pillar 2: RL Algorithms & Architecture (14) · Pillar 1: Robot Control (3)

🔬 Pillar 9: Embodied Foundation Models (53 papers)

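# | Title | One-sentence summary | Tags
1 | NoRD: A Data-Efficient Vision-Language-Action Model that Drives without Reasoning | Proposes NoRD, a data-efficient, reasoning-free end-to-end VLA model for autonomous driving | vision-language-action, VLA
2 | Decoding the Hook: A Multimodal LLM Framework for Analyzing the Hooking Period of Video Ads | Proposes a multimodal-LLM framework for analyzing the hooking period of video ads, improving ad effectiveness evaluation and optimization | large language model, multimodal
3 | SPM-Bench: Benchmarking Large Language Models for Scanning Probe Microscopy | SPM-Bench: an authoritative automated benchmark for evaluating LLMs on scanning probe microscopy | large language model, multimodal
4 | RAGdb: A Zero-Dependency, Embeddable Architecture for Multimodal Retrieval-Augmented Generation on the Edge | RAGdb: a zero-dependency, embeddable multimodal RAG architecture for the edge | large language model, multimodal
5 | G-reasoner: Foundation Models for Unified Reasoning over Graph-structured Knowledge | Proposes G-reasoner, a foundation-model framework for unified reasoning over graph-structured knowledge | large language model, foundation model
6 | FM-RME: Foundation Model Empowered Radio Map Estimation | Proposes FM-RME, which enables multi-dimensional radio map estimation with zero-shot generalization | foundation model
7 | Mapping the Landscape of Artificial Intelligence in Life Cycle Assessment Using Large Language Models | Uses large language models to map AI applications in life cycle assessment | large language model
8 | Mirroring the Mind: Distilling Human-Like Metacognitive Strategies into Large Language Models | Proposes metacognitive behavior tuning (MBT) to improve the stability and accuracy of LLMs' complex reasoning | large language model
9 | Multi-Agent Large Language Model Based Emotional Detoxification Through Personalized Intensity Control for Consumer Protection | Proposes MALLET, a multi-agent LLM emotional detoxification system that protects consumers through personalized intensity control | large language model
10 | Enriching Taxonomies Using Large Language Models | Taxoria: uses large language models to enrich existing taxonomies and improve knowledge retrieval | large language model
11 | Automating the Detection of Requirement Dependencies Using Large Language Models | Proposes LEREDD, which uses large language models to automatically detect requirement dependencies | large language model
12 | LLM4AD: A Platform for Algorithm Design with Large Language Model | LLM4AD: a unified LLM-based platform for algorithm design | large language model
13 | FHIR-RAG-MEDS: Integrating HL7 FHIR with Retrieval-Augmented Large Language Models for Enhanced Medical Decision Support | Proposes FHIR-RAG-MEDS, a system integrating HL7 FHIR with RAG to enhance medical decision support | large language model
14 | Modality-Guided Mixture of Graph Experts with Entropy-Triggered Routing for Multimodal Recommendation | Proposes MAGNET, which addresses fusion challenges in multimodal recommendation with a modality-guided mixture of graph experts and entropy-triggered routing | multimodal
15 | SC-Arena: A Natural Language Benchmark for Single-Cell Reasoning with Knowledge-Augmented Evaluation | SC-Arena: a natural language benchmark for single-cell reasoning based on knowledge-augmented evaluation | large language model, foundation model
16 | EyeLayer: Integrating Human Attention Patterns into LLM-Based Code Summarization | EyeLayer: integrates human attention patterns into LLM-based code summarization to improve code understanding | large language model, multimodal
17 | ProactiveMobile: A Comprehensive Benchmark for Boosting Proactive Intelligence on Mobile Devices | ProactiveMobile: a comprehensive benchmark aimed at improving proactive intelligence on mobile devices | large language model, multimodal
18 | Evaluating Zero-Shot and One-Shot Adaptation of Small Language Models in Leader-Follower Interaction | Evaluates zero-shot and one-shot adaptation of small language models in leader-follower interaction | large language model
19 | CourtGuard: A Model-Agnostic Framework for Zero-Shot Policy Adaptation in LLM Safety | CourtGuard: a model-agnostic zero-shot policy adaptation framework for improving LLM safety | large language model
20 | AMA-Bench: Evaluating Long-Horizon Memory for Agentic Applications | Proposes AMA-Bench to evaluate long-horizon memory in agentic applications, along with AMA-Agent to improve performance | large language model
21 | Intelligence per Watt: Measuring Intelligence Efficiency of Local AI | Proposes the intelligence per watt (IPW) metric to evaluate the energy efficiency of local AI and encourage shifting cloud workloads to local devices | large language model
22 | LeanCat: A Benchmark Suite for Formal Category Theory in Lean (Part I: 1-Categories) | LeanCat: a benchmark suite for formalized category theory in Lean that reveals shortcomings of existing models in abstract reasoning | large language model
23 | GPT-4o Lacks Core Features of Theory of Mind | Finds that GPT-4o lacks core theory-of-mind capabilities and fails to build consistent models of mental states | large language model
24 | Large-scale online deanonymization with LLMs | Uses large language models to perform large-scale online deanonymization | large language model
25 | A Framework for Assessing AI Agent Decisions and Outcomes in AutoML Pipelines | Proposes an Evaluation Agent (EA) framework for decision-centric assessment of AutoML agents' decision quality | large language model
26 | ConstraintBench: Benchmarking LLM Constraint Reasoning on Direct Optimization | ConstraintBench: a benchmark evaluating LLM constraint reasoning in direct optimization | large language model
27 | Cognitive Models and AI Algorithms Provide Templates for Designing Language Agents | Uses cognitive models and AI algorithms to design modular language agents | large language model
28 | Requesting Expert Reasoning: Augmenting LLM Agents with Learned Collaborative Intervention | Proposes the AHCE framework, which learns a policy for requesting expert knowledge to improve LLM agents on complex tasks | large language model
29 | MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios | MobilityBench: a benchmark for evaluating route-planning agents in real-world mobility scenarios | large language model
30 | Toward Personalized LLM-Powered Agents: Foundations, Evaluation, and Future Directions | Surveys personalized LLM-powered agents, focusing on user adaptation and continuity in long-term interaction | large language model
31 | MiroFlow: Towards High-Performance and Robust Open-Source Agent Framework for General Deep Research Tasks | MiroFlow: a high-performance, robust open-source agent framework for general deep research tasks | large language model
32 | Obscure but Effective: Classical Chinese Jailbreak Prompt Optimization via Bio-Inspired Search | Proposes the CC-BOS framework, which uses classical Chinese and the fruit fly optimization algorithm to mount black-box jailbreak attacks on LLMs | large language model
33 | Enhancing CVRP Solver through LLM-driven Automatic Heuristic Design | Proposes AILS-AHD, an algorithm with LLM-driven automatic heuristic design that improves CVRP solving performance | large language model
34 | A Decision-Theoretic Formalisation of Steganography With Applications to LLM Monitoring | Proposes a decision-theoretic view of steganography for monitoring large language models | large language model
35 | Mitigating Legibility Tax with Decoupled Prover-Verifier Games | Proposes decoupled prover-verifier games to mitigate the legibility tax in large language models | large language model
36 | Comparative Analysis of Neural Retriever-Reranker Pipelines for Retrieval-Augmented Generation over Knowledge Graphs in E-commerce Applications | Proposes and compares neural retriever-reranker RAG pipelines over e-commerce knowledge graphs, significantly improving question-answering performance | large language model
37 | Misinformation Exposure in the Chinese Web: A Cross-System Evaluation of Search Engines, LLMs, and AI Overviews | Evaluates misinformation exposure risk across search engines, LLMs, and AI Overviews on the Chinese web | large language model
38 | From Prompts to Performance: Evaluating LLMs for Task-based Parallel Code Generation | Evaluates LLM performance on task-based parallel code generation and explores how prompt engineering affects code quality | large language model
39 | Analysis of LLMs Against Prompt Injection and Jailbreak Attacks | Analyzes the vulnerabilities and defense mechanisms of several open-source LLMs against prompt injection and jailbreak attacks | large language model
40 | Contextual Memory Virtualisation: DAG-Based State Management and Structurally Lossless Trimming for LLM Agents | Proposes Contextual Memory Virtualisation (CMV) for DAG-based state management and structurally lossless trimming in LLM agents | large language model
41 | HubScan: Detecting Hubness Poisoning in Retrieval-Augmented Generation Systems | HubScan: detects hubness poisoning attacks in retrieval-augmented generation systems | large language model
42 | Silent Egress: When Implicit Prompt Injection Makes LLM Agents Leak Without a Trace | Proposes the Silent Egress attack, revealing the risk of sensitive data leakage caused by implicit prompt injection in LLM agents | large language model
43 | Generative Agents Navigating Digital Libraries | Agent4DL: uses generative agents to simulate the search behavior of digital library users | large language model
44 | Addressing Climate Action Misperceptions with Generative AI | Uses generative AI to address misperceptions about climate action and increase willingness to act pro-environmentally | large language model
45 | IMMACULATE: A Practical LLM Auditing Framework via Verifiable Computation | Proposes the IMMACULATE framework for auditing LLMs | large language model
46 | AgentSentry: Mitigating Indirect Prompt Injection in LLM Agents via Temporal Causal Diagnostics and Context Purification | AgentSentry: mitigates indirect prompt injection in LLM agents through temporal causal diagnostics and context purification | large language model
47 | Distributed LLM Pretraining During Renewable Curtailment Windows: A Feasibility Study | Proposes a distributed LLM pretraining approach that runs during renewable curtailment windows | large language model
48 | Discovery of Interpretable Physical Laws in Materials via Language-Model-Guided Symbolic Regression | Proposes language-model-guided symbolic regression for discovering interpretable physical laws in materials science | large language model
49 | LLMServingSim 2.0: A Unified Simulator for Heterogeneous and Disaggregated LLM Serving Infrastructure | LLMServingSim 2.0: a unified simulator for heterogeneous and disaggregated LLM serving infrastructure | large language model
50 | Utilizing LLMs for Industrial Process Automation | Uses large language models to accelerate software development for industrial process automation | large language model
51 | Compositional-ARC: Assessing Systematic Generalization in Abstract Spatial Reasoning | Proposes the Compositional-ARC dataset and uses meta-learning to improve systematic generalization in abstract spatial reasoning | large language model
52 | Unmasking Reasoning Processes: A Process-aware Benchmark for Evaluating Structural Mathematical Reasoning in LLMs | Proposes the ReasoningMath-Plus benchmark for evaluating LLMs' process-level ability in structural mathematical reasoning | large language model
53 | Toward Automated Validation of Language Model Synthesized Test Cases using Semantic Entropy | VALTEST: uses semantic entropy to automatically validate LLM-generated test cases, improving code generation quality | large language model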

🔬 Pillar 2: RL Algorithms & Architecture (14 papers)

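# | Title | One-sentence summary | Tags
54 | Knowledge Fusion of Large Language Models Via Modular SkillPacks | Proposes GraftLLM, which fuses and transfers knowledge across large language models through modular SkillPacks | distillation, large language model
55 | Graph Your Way to Inspiration: Integrating Co-Author Graphs with Retrieval-Augmented Generation for Large Language Model Based Scientific Idea Generation | GYWI: combines co-author graphs with RAG to support LLM-based scientific idea generation | reinforcement learning, large language model
56 | FactGuard: Agentic Video Misinformation Detection via Reinforcement Learning | Proposes FactGuard to address insufficient reasoning in video misinformation detection | reinforcement learning, large language model, multimodal
57 | CWM: Contrastive World Models for Action Feasibility Learning in Embodied Agent Pipelines | Proposes Contrastive World Models (CWM) for learning action feasibility in embodied agents | world model, affordance, large language model
58 | RLHFless: Serverless Computing for Efficient RLHF | Proposes RLHFless, which uses serverless computing to run RLHF training efficiently, improving resource utilization and reducing cost | reinforcement learning, RLHF, large language model
59 | Agentic AI for Intent-driven Optimization in Cell-free O-RAN | Proposes an agentic AI framework for intent-driven optimization in cell-free O-RAN, achieving energy savings and efficient resource use | reinforcement learning, deep reinforcement learning, DRL
60 | The Trinity of Consistency as a Defining Principle for General World Models | Proposes the "Trinity of Consistency" principle for general world models and builds CoW-Bench, a benchmark for multi-frame reasoning and generation | world model, multimodal
61 | K-Search: LLM Kernel Generation via Co-Evolving Intrinsic World Model | K-Search: generates LLM kernels via a co-evolving intrinsic world model, significantly improving GPU kernel optimization efficiency | world model, large language model
62 | Agency and Architectural Limits: Why Optimization-Based Systems Cannot Be Norm-Responsive | Reveals architectural limits on norm-responsiveness in optimization-based AI systems, stressing their incompatibility with genuine agency | reinforcement learning, RLHF, large language model
63 | QSIM: Mitigating Overestimation in Multi-Agent Reinforcement Learning via Action Similarity Weighted Q-Learning | QSIM: mitigates overestimation in multi-agent reinforcement learning through action-similarity-weighted Q-learning | reinforcement learning
64 | Towards LLM-Empowered Knowledge Tracing via LLM-Student Hierarchical Behavior Alignment in Hyperbolic Space | Proposes L-HAKT, which uses LLMs and hyperbolic space to align student behavior and improve knowledge tracing | contrastive learning, large language model
65 | Automated Vulnerability Detection in Source Code Using Deep Representation Learning | Proposes a deep-learning-based method for automatically detecting vulnerabilities in C code, improving vulnerability recall | representation learning
66 | On Discovering Algorithms for Adversarial Imitation Learning | Proposes data-driven reward assignment functions to address instability in adversarial imitation learning | imitation learning
67 | Towards Small Language Models for Security Query Generation in SOC Workflows | Proposes small language models for security query generation in security operations center (SOC) workflows | distillation, chain-of-thought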

🔬 Pillar 1: Robot Control (3 papers)

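# | Title | One-sentence summary | Tags
68 | DropVLA: An Action-Level Backdoor Attack on Vision-Language-Action Models | Proposes DropVLA, a fine-grained action-level backdoor attack on Vision-Language-Action models | manipulation, vision-language-action, VLA
69 | Beyond the Monitor: Mixed Reality Visualization and Multimodal AI for Enhanced Digital Pathology Workflow | PathVis: a mixed reality pathology diagnosis platform that combines multimodal AI to improve workflow efficiency | Apple Vision Pro, multimodal
70 | The AI Research Assistant: Promise, Peril, and a Proof of Concept | Discovers new error representations and bounds for Hermite quadrature through human-AI collaboration | manipulation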
