cs.AI (2025-10-30)

📊 34 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (29 🔗3) · Pillar 2: RL & Architecture (4) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (29 papers)

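# | Title | One-line summary | Tags | 🔗
1 | Unveiling Intrinsic Text Bias in Multimodal Large Language Models through Attention Key-Space Analysis | Reveals intrinsic text bias in multimodal large language models through attention key-space analysis | large language model, multimodal
2 | Normative Reasoning in Large Language Models: A Comparative Benchmark from Logical and Modal Perspectives | Proposes the NeuBAROCO benchmark to comparatively evaluate LLM normative reasoning from logical and modal perspectives | large language model
3 | Agentic AI Home Energy Management System: A Large Language Model Framework for Residential Load Scheduling | Proposes an LLM-based smart home energy management system for optimized residential load scheduling | large language model
4 | SecureReviewer: Enhancing Large Language Models for Secure Code Review through Secure-aware Fine-tuning | SecureReviewer: enhances large language models for secure code review through security-aware fine-tuning | large language model
5 | CausalGuard: A Smart System for Detecting and Preventing False Information in Large Language Models | CausalGuard: uses causal reasoning and symbolic logic to detect and prevent false information in large language models | large language model
6 | Chain-of-Thought Hijacking | Proposes the CoT Hijacking attack, exposing safety vulnerabilities of large language models in chain-of-thought reasoning | chain-of-thought
7 | Unvalidated Trust: Cross-Stage Vulnerabilities in Large Language Model Architectures | Exposes trust vulnerabilities in multi-stage LLM pipelines and proposes the zero-trust architecture Countermind | large language model
8 | Urban-MAS: Human-Centered Urban Prediction with LLM-Based Multi-Agent System | Proposes the Urban-MAS framework for human-centered urban prediction | large language model, multimodal
9 | LLMs are Overconfident: Evaluating Confidence Interval Calibration with FermiEval | FermiEval evaluates LLM confidence-interval calibration, reveals overconfidence, and proposes correction methods | large language model
10 | Autograder+: A Multi-Faceted AI Framework for Rich Pedagogical Feedback in Programming Education | Autograder+: a multi-faceted AI framework for rich pedagogical feedback in programming education | large language model
11 | Scales++: Compute Efficient Evaluation Subset Selection with Cognitive Scales Embeddings | Scales++: compute-efficient evaluation subset selection using cognitive scales embeddings | large language model
12 | QuantumBench: A Benchmark for Quantum Problem Solving | Proposes QuantumBench to evaluate large language models in the quantum domain | large language model
13 | Beyond Synthetic Benchmarks: Evaluating LLM Performance on Real-World Class-Level Code Generation | Proposes a real-world class-level code generation benchmark to evaluate LLM performance bottlenecks and improvement strategies in practical settings | large language model
14 | LLM-based Multi-class Attack Analysis and Mitigation Framework in IoT/IIoT Networks | Proposes an LLM-based multi-class attack analysis and mitigation framework for IoT/IIoT networks | large language model
15 | Artificial Intelligence in Elementary STEM Education: A Systematic Review of Current Applications and Future Challenges | Systematically reviews AI applications in elementary STEM education, identifying challenges and outlining future directions | multimodal
16 | Cognition Envelopes for Bounded AI Reasoning in Autonomous UAS Operations | Proposes cognition envelopes that constrain the decision boundaries of AI reasoning in autonomous unmanned aircraft systems | large language model
17 | How Similar Are Grokipedia and Wikipedia? A Multi-Dimensional Textual and Structural Comparison | Compares Grokipedia and Wikipedia: a multi-dimensional textual and structural analysis reveals potential biases in the AI-generated encyclopedia | large language model
18 | ExpertFlow: Adaptive Expert Scheduling and Memory Coordination for Efficient MoE Inference | ExpertFlow: adaptive expert scheduling and memory coordination to improve MoE inference efficiency | large language model
19 | Delegated Authorization for Agents Constrained to Semantic Task-to-Scope Matching | Proposes an agent authorization model based on semantic task-to-scope matching to address the security of LLM tool calls | large language model
20 | CATArena: Evaluating Evolutionary Capabilities of Code Agents via Iterative Tournaments | CATArena: evaluates the evolutionary capabilities of code agents through iterative tournaments | large language model
21 | Who Has The Final Say? Conformity Dynamics in ChatGPT's Selections | Shows that ChatGPT is susceptible to social influence in hiring decisions, highlighting risks to the independence of AI decision-making | large language model
22 | Broken-Token: Filtering Obfuscated Prompts by Counting Characters-Per-Token | Proposes CPT-Filtering, which filters obfuscated prompts by counting characters per token to defend against LLM jailbreak attacks | large language model
23 | BOTS: A Unified Framework for Bayesian Online Task Selection in LLM Reinforcement Finetuning | Proposes the BOTS framework for task selection in LLM reinforcement fine-tuning | large language model
24 | GraphCompliance: Aligning Policy and Context Graphs for LLM-Based Regulatory Compliance | GraphCompliance: aligns policy graphs and context graphs for LLM-based regulatory compliance | large language model
25 | SynBullying: A Multi LLM Synthetic Conversational Dataset for Cyberbullying Detection | Proposes SynBullying, a multi-LLM synthetic conversational dataset for cyberbullying detection | large language model
26 | Linking Heterogeneous Data with Coordinated Agent Flows for Social Media Analysis | SIA: links heterogeneous data with coordinated agent flows for social media analysis | large language model
27 | ToolRM: Towards Agentic Tool-Use Reward Modeling | Proposes ToolRM to improve reward modeling for agentic tool use | large language model
28 | The FM Agent | Proposes FM Agent for solving complex scientific and engineering problems | large language model
29 | Beyond Benchmarks: The Economics of AI Inference | Builds an economic framework for LLM inference, revealing the relationships among cost, scale, and quality | large language model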

🔬 Pillar 2: RL & Architecture (4 papers)

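# | Title | One-line summary | Tags | 🔗
30 | Cross-Platform Evaluation of Reasoning Capabilities in Foundation Models | Evaluates foundation-model reasoning capabilities across platforms, revealing the importance of training data quality | Mamba, foundation model
31 | e1: Learning Adaptive Control of Reasoning Effort | Proposes an adaptive effort control method that allocates reasoning resources on demand and improves the cost-accuracy trade-off | reinforcement learning, chain-of-thought
32 | The Era of Agentic Organization: Learning to Organize with Language Models | Proposes AsyncThink, which uses language models for efficient collaborative problem solving | reinforcement learning, large language model
33 | Reasoning Curriculum: Bootstrapping Broad LLM Reasoning from Math | Proposes a reasoning curriculum that bootstraps from math to improve LLM reasoning across multiple domains | reinforcement learning, large language model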

🔬 Pillar 1: Robot Control (1 paper)

# | Title | One-line summary | Tags | 🔗
34 | Using Salient Object Detection to Identify Manipulative Cookie Banners that Circumvent GDPR | Uses salient object detection to identify manipulative cookie banners that circumvent GDPR | manipulation
