cs.AI (2026-01-08)

📊 51 papers in total | 🔗 6 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (34 🔗4) · Pillar 2: RL & Architecture (15 🔗2) · Pillar 1: Robot Control (1) · Pillar 3: Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (34 papers)

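# | Title | Key point | Tags
1 | SciIF: Benchmarking Scientific Instruction Following Towards Rigorous Scientific Intelligence | SciIF: a scientific instruction-following benchmark that evaluates the rigor of LLMs in scientific reasoning | large language model, instruction following
2 | Bridging Temporal and Textual Modalities: A Multimodal Framework for Automated Cloud Failure Root Cause Analysis | Proposes a multimodal framework for automated cloud failure root cause analysis that bridges the gap between the time-series and text modalities | large language model, multimodal
3 | Know Thy Enemy: Securing LLMs Against Prompt Injection via Diverse Data Synthesis and Instruction-Level Chain-of-Thought Learning | InstruCoT: strengthens LLM defenses against prompt injection through diverse data synthesis and instruction-level CoT learning | large language model, chain-of-thought
4 | Challenges and Research Directions for Large Language Model Inference Hardware | Addresses the challenges of LLM inference hardware and proposes architectural directions such as high-bandwidth flash and near-memory computing | large language model
5 | Observations and Remedies for Large Language Model Bias in Self-Consuming Performative Loop | Studies LLM bias in self-consuming performative loops and proposes mitigation strategies | large language model
6 | Large language models can effectively convince people to believe conspiracies | Shows that LLMs can effectively persuade people to believe conspiracy theories, though corrective measures can mitigate the effect | large language model
7 | An Empirical Investigation of Robustness in Large Language Models under Tabular Distortions | Shows that LLMs lack robustness when tabular data is distorted and need explicit prompting to partially correct for it | large language model
8 | DVD: A Robust Method for Detecting Variant Contamination in Large Language Model Evaluation | Proposes DVD, a method for addressing variant contamination in LLM evaluation | large language model
9 | AECV-Bench: Benchmarking Multimodal Models on Architectural and Engineering Drawings Understanding | AECV-Bench: a benchmark for multimodal models on understanding architectural and engineering drawings | multimodal
10 | Enhancing Multimodal Retrieval via Complementary Information Extraction and Alignment | Proposes CIEA, which improves multimodal retrieval through complementary information extraction and alignment | multimodal
11 | AdaptEval: A Benchmark for Evaluating Large Language Models on Code Snippet Adaptation | AdaptEval: a benchmark for evaluating LLMs on code snippet adaptation | large language model
12 | Token-Level LLM Collaboration via FusionRoute | FusionRoute: a routing and fusion framework based on token-level LLM collaboration | large language model, instruction following
13 | BackdoorAgent: A Unified Framework for Backdoor Attacks on LLM-based Agents | BackdoorAgent: a unified framework for backdoor attacks on LLM agents | large language model, multimodal
14 | CircuitLM: A Multi-Agent LLM-Aided Design Framework for Generating Circuit Schematics from Natural Language Prompts | CircuitLM: a multi-agent LLM-aided circuit design framework that generates circuit schematics from natural language | large language model, chain-of-thought
15 | CAOS: Conformal Aggregation of One-Shot Predictors | Proposes the CAOS framework, which aggregates one-shot predictors and combines them with conformal prediction for fast adaptation and reliable uncertainty quantification | foundation model
16 | Higher-Order Knowledge Representations for Agentic Scientific Reasoning | Proposes a hypergraph-based knowledge representation for agentic scientific reasoning, accelerating the discovery of new materials | large language model
17 | Neurosymbolic Retrievers for Retrieval-augmented Generation | Proposes neurosymbolic retrievers that improve the interpretability and performance of retrieval-augmented generation | large language model
18 | Internal Representations as Indicators of Hallucinations in Agent Tool Selection | Uses LLM internal representations to detect hallucinations in agent tool selection in real time | large language model
19 | Distilling the Thought, Watermarking the Answer: A Principle Semantic Guided Watermark for Large Reasoning Models | ReasonMark: a semantically guided watermarking method aimed at the reasoning process of large language models | large language model
20 | Arabic Prompts with English Tools: A Benchmark | Proposes the Arabic Prompts with English Tools benchmark, which evaluates LLM tool-calling ability under Arabic prompts | large language model
21 | Chain-of-Sanitized-Thoughts: Plugging PII Leakage in CoT of Large Reasoning Models | Proposes PII-CoT-Bench and uses prompting and fine-tuning to make the CoT reasoning of large models more private, reducing PII leakage | chain-of-thought
22 | Publishing FAIR and Machine-actionable Reviews in Materials Science: The Case for Symbolic Knowledge in Neuro-symbolic Artificial Intelligence | Publishing FAIR and machine-actionable reviews in materials science: the case for symbolic knowledge in neuro-symbolic AI | large language model
23 | T-Retriever: Tree-based Hierarchical Retrieval Augmented Generation for Textual Graphs | T-Retriever: a tree-based hierarchical retrieval-augmented generation framework for reasoning over textual graphs | large language model
24 | CurricuLLM: Designing Personalized and Workforce-Aligned Cybersecurity Curricula Using Fine-Tuned LLMs | CurricuLLM: uses fine-tuned LLMs to automatically design personalized, workforce-aligned cybersecurity curricula | large language model
25 | Orchestrating Intelligence: Confidence-Aware Routing for Efficient Multi-Agent Collaboration across Multi-Scale Models | Proposes the OI-MAS framework, which uses confidence-aware routing for efficient multi-agent collaboration across multi-scale models | large language model
26 | DR-LoRA: Dynamic Rank LoRA for Mixture-of-Experts Adaptation | Proposes DR-LoRA to address resource mismatch in Mixture-of-Experts adaptation | large language model
27 | Memory Matters More: Event-Centric Memory as a Logic Map for Agent Searching and Reasoning | Proposes CompassMem, which uses event-centric memory as a logic map to improve agent searching and reasoning | large language model
28 | Beyond Monolithic Architectures: A Multi-Agent Search and Knowledge Optimization Framework for Agentic Search | Proposes the M-ASK framework, which decouples search behavior from knowledge management in agentic search and improves multi-hop question answering | large language model
29 | LLM-Guided Quantified SMT Solving over Uninterpreted Functions | AquaForte: LLM-guided quantified SMT solving for problems over uninterpreted functions | large language model
30 | LAMB: LLM-based Audio Captioning with Modality Gap Bridging via Cauchy-Schwarz Divergence | LAMB: an LLM-based audio captioning framework that bridges the modality gap via Cauchy-Schwarz divergence | large language model
31 | Vibe Coding an LLM-powered Theorem Prover | Isabellm: an LLM-based theorem prover for Isabelle/HOL that performs fully automatic proof synthesis | large language model
32 | Beyond the "Truth": Investigating Election Rumors on Truth Social During the 2024 Election | Proposes an LLM-based multi-stage rumor detection agent for analyzing 2024 election rumors on Truth Social | large language model
33 | Constitutional Classifiers++: Efficient Production-Grade Defenses against Universal Jailbreaks | Proposes Constitutional Classifiers++, an efficient defense against universal jailbreaks that lowers compute cost and refusal rate | large language model
34 | GUITester: Enabling GUI Agents for Exploratory Defect Discovery | Proposes GUITester to address the autonomous discovery of GUI defects | large language model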

🔬 Pillar 2: RL & Architecture (15 papers)

# | Title | Key point | Tags
35 | ThinkDrive: Chain-of-Thought Guided Progressive Reinforcement Learning Fine-Tuning for Autonomous Driving | ThinkDrive: chain-of-thought guided progressive reinforcement learning fine-tuning for autonomous driving | reinforcement learning, large language model, chain-of-thought
36 | ConMax: Confidence-Maximizing Compression for Efficient Chain-of-Thought Reasoning | Proposes ConMax, which compresses CoT reasoning chains by maximizing confidence to improve efficiency | reinforcement learning, chain-of-thought
37 | TourPlanner: A Competitive Consensus Framework with Constraint-Gated Reinforcement Learning for Travel Planning | TourPlanner: a competitive consensus framework based on constraint-gated reinforcement learning for travel planning | reinforcement learning, chain-of-thought
38 | SimuAgent: An LLM-Based Simulink Modeling Assistant Enhanced with Reinforcement Learning | SimuAgent: a Simulink modeling assistant based on LLMs and reinforcement learning | reinforcement learning, large language model
39 | Thinking-Based Non-Thinking: Solving the Reward Hacking Problem in Training Hybrid Reasoning Models via Reinforcement Learning | Proposes the Thinking-Based Non-Thinking method, which addresses reward hacking in the reinforcement learning training of hybrid reasoning models | reinforcement learning, chain-of-thought
40 | LLMs-Integrated Automatic Hate Speech Recognition Using Controllable Text Generation Models | Proposes an LLM-integrated automatic hate speech recognition model that improves moderation performance through controllable text generation | curriculum learning, large language model, chain-of-thought
41 | Scaling Behavior Cloning Improves Causal Reasoning: An Open Model for Real-Time Video Game Playing | Proposes a real-time video game AI model based on large-scale behavior cloning that improves causal reasoning | behavior cloning, foundation model
42 | Reasoning Over Space: Enabling Geographic Reasoning for LLM-Based Generative Next POI Recommendation | Proposes the ROS framework, which uses geographic information to enhance LLM-based generative next-POI recommendation | reinforcement learning, large language model, chain-of-thought
43 | Tape: A Cellular Automata Benchmark for Evaluating Rule-Shift Generalization in Reinforcement Learning | Proposes Tape, a cellular automata benchmark for evaluating rule-shift generalization in reinforcement learning | reinforcement learning, world model
44 | Integrating Distribution Matching into Semi-Supervised Contrastive Learning for Labeled and Unlabeled Data | Proposes semi-supervised contrastive learning combined with distribution matching to make better use of labeled and unlabeled data | contrastive learning
45 | Reinforced Efficient Reasoning via Semantically Diverse Exploration | ROSE: reinforced efficient reasoning for LLMs through semantically diverse exploration | reinforcement learning, large language model
46 | SmartSearch: Process Reward-Guided Query Refinement for Search Agents | SmartSearch: a process reward-guided query refinement framework that improves the knowledge retrieval ability of search agents | curriculum learning, large language model
47 | SCALER: Synthetic Scalable Adaptive Learning Environment for Reasoning | Proposes the SCALER framework, which improves LLM reasoning through adaptive environment design | reinforcement learning, large language model
48 | A Method for Constructing a Digital Transformation Driving Mechanism Based on Semantic Understanding of Large Models | Proposes a digital transformation driving mechanism based on the semantic understanding of large models to improve decision-making efficiency | reinforcement learning, large language model
49 | ResMAS: Resilience Optimization in LLM-based Multi-agent Systems | ResMAS: improves the robustness of LLM-based multi-agent systems under perturbations | reinforcement learning, large language model

🔬 Pillar 1: Robot Control (1 paper)

# | Title | Key point | Tags
50 | Learning Latent Action World Models In The Wild | Proposes a method for learning latent action world models from in-the-wild videos | manipulation, world model

🔬 Pillar 3: Perception & Semantics (1 paper)

# | Title | Key point | Tags
51 | Conversational AI for Rapid Scientific Prototyping: A Case Study on ESA's ELOPE Competition | Rapid scientific prototyping with conversational AI: a case study on ESA's ELOPE competition | optical flow, large language model
