cs.AI (2026-01-30)

📊 41 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (29 🔗2) · Pillar 2: RL & Architecture (10 🔗3) · Pillar 4: Generative Motion (1) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (29 papers)

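# | Title | One-line summary | Tags | 🔗
1 | BEAR: Towards Beam-Search-Aware Optimization for Recommendation with Large Language Models | BEAR: proposes a beam-search-aware optimization method for LLM-based recommendation | large language model
2 | Evaluating Large Language Models for Security Bug Report Prediction | Evaluates large language models for security bug report prediction | large language model
3 | RAudit: A Blind Auditing Protocol for Large Language Model Reasoning | Proposes RAudit to address blind auditing of large language model reasoning | large language model
4 | Chain-of-thought obfuscation learned from output supervision can generalise to unseen tasks | Chain-of-thought obfuscation learned from output supervision can generalize to unseen tasks | chain-of-thought
5 | Eroding the Truth-Default: A Causal Analysis of Human Susceptibility to Foundation Model Hallucinations and Disinformation in the Wild | Proposes the dual-axis JudgeGPT and RogueGPT framework to analyze human susceptibility to foundation model hallucinations and disinformation | foundation model
6 | Hide and Seek in Embedding Space: Geometry-based Steganography and Detection in Large Language Models | Proposes steganography and detection methods based on embedding-space geometry, improving the security of covert communication in large language models | large language model
7 | Statistical Estimation of Adversarial Risk in Large Language Models under Best-of-N Sampling | Proposes SABER, which predicts the adversarial risk of large language models under Best-of-N sampling from a small number of samples | large language model
8 | EntroCut: Entropy-Guided Adaptive Truncation for Efficient Chain-of-Thought Reasoning in Small-scale Large Reasoning Models | Proposes EntroCut, which uses entropy-guided adaptive truncation to make CoT reasoning in small-scale LRMs more efficient | chain-of-thought
9 | Make Anything Match Your Target: Universal Adversarial Perturbations against Closed-Source MLLMs via Multi-Crop Routed Meta Optimization | Proposes MCRMO-Attack, raising the success rate of universal, targeted, transferable adversarial attacks against closed-source multimodal large language models | large language model, multimodal
10 | Quantifying Model Uniqueness in Heterogeneous AI Ecosystems | Proposes a statistical framework for auditing model uniqueness in heterogeneous AI ecosystems | large language model, foundation model
11 | Alignment among Language, Vision and Action Representations | Reveals alignment among language, vision, and action representations, which facilitates cross-modal knowledge transfer | embodied AI, large language model
12 | Darwinian Memory: A Training-Free Self-Regulating Memory System for GUI Agent Evolution | Proposes the Darwinian Memory system to address insufficient context for GUI agents in long-horizon tasks | large language model, multimodal
13 | On the Impact of Code Comments for Automated Bug-Fixing: An Empirical Study | Examines the impact of code comments on automated bug fixing | large language model
14 | UCPO: Uncertainty-Aware Policy Optimization | Proposes the UCPO framework to address bias in uncertainty-based reinforcement learning policy optimization for LLMs | large language model
15 | High-quality generation of dynamic game content via small language models: A proof of concept | Proposes a small-language-model-based method for generating high-quality dynamic game content, addressing narrative coherence and high operating costs | large language model
16 | OrLog: Resolving Complex Queries with LLMs and Probabilistic Reasoning | OrLog: combines LLMs with probabilistic reasoning to resolve complex queries and improve retrieval precision | large language model
17 | From Abstract to Contextual: What LLMs Still Cannot Do in Mathematics | The ContextMATH benchmark reveals LLMs' weaknesses in problem formulation for contextual mathematical reasoning | large language model
18 | Protecting Private Code in IDE Autocomplete using Differential Privacy | Uses differential privacy to protect private code in IDE code autocompletion | large language model
19 | Game-Theoretic Co-Evolution for LLM-Based Heuristic Discovery | Proposes the ASRO framework to address overfitting in LLM-based heuristic discovery | large language model
20 | MEnvAgent: Scalable Polyglot Environment Construction for Verifiable Software Engineering | MEnvAgent: a scalable polyglot environment construction framework for verifiable software engineering | large language model
21 | Conditional Performance Guarantee for Large Reasoning Models | Proposes the G-PAC reasoning framework, which provides group-conditional performance guarantees for large reasoning models and improves efficiency | chain-of-thought
22 | How Far Can Pretrained LLMs Go in Symbolic Music? Controlled Comparisons of Supervised and Preference-based Adaptation | Compares supervised and preference-based learning to assess the potential of pretrained LLMs in symbolic music | large language model
23 | Qualitative Evaluation of LLM-Designed GUI | Evaluates LLM-designed GUIs: an analysis of usability, customizability, and fit with user needs | large language model
24 | AutoRefine: From Trajectories to Reusable Expertise for Continual LLM Agent Refinement | AutoRefine: distills reusable experience from trajectories to continually refine LLM agents | large language model
25 | Task-Aware LLM Council with Adaptive Decision Pathways for Decision Support | Proposes the Task-Aware LLM Council (TALC) for adaptive decision support | large language model
26 | MCP-Diag: A Deterministic, Protocol-Driven Architecture for AI-Native Network Diagnostics | MCP-Diag: a deterministic, protocol-driven architecture for AI-native network diagnostics | large language model
27 | SYMPHONY: Synergistic Multi-agent Planning with Heterogeneous Language Model Assembly | SYMPHONY: a multi-agent planning framework in which heterogeneous language models cooperate, improving complex task solving | large language model
28 | PerfGuard: A Performance-Aware Agent for Visual Content Generation | PerfGuard: a performance-aware agent framework for visual content generation | large language model
29 | Decoding in Geometry: Alleviating Embedding-Space Crowding for Complex Reasoning | Proposes CraEG, which uses geometry-guided reweighting to alleviate embedding-space crowding in LLM reasoning | large language model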

🔬 Pillar 2: RL & Architecture (10 papers)

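# | Title | One-line summary | Tags | 🔗
30 | THINKSAFE: Self-Generated Safety Alignment for Reasoning Models | ThinkSafe: improves the safety of reasoning models through self-generated safety alignment while preserving reasoning ability | reinforcement learning, distillation, chain-of-thought
31 | CVeDRL: An Efficient Code Verifier via Difficulty-aware Reinforcement Learning | Proposes CVeDRL, an efficient code verifier based on difficulty-aware reinforcement learning | reinforcement learning, reward shaping
32 | Real-Time Aligned Reward Model beyond Semantics | Proposes R2M, a real-time aligned reward model that uses policy feedback to mitigate reward over-optimization | reinforcement learning, RLHF, large language model
33 | A Step Back: Prefix Importance Ratio Stabilizes Policy Optimization | Proposes MinPRO, a minimum prefix ratio, to stabilize policy optimization | reinforcement learning, large language model
34 | Guided by Trajectories: Repairing and Rewarding Tool-Use Trajectories for Tool-Integrated Reasoning | AutoTraj: improves tool-integrated reasoning by repairing and rewarding tool-use trajectories | reinforcement learning, large language model
35 | MulFeRL: Enhancing Reinforcement Learning with Verbal Feedback in a Multi-turn Loop | MulFeRL: enhances reinforcement learning with verbal feedback in a multi-turn loop | reinforcement learning
36 | TSPO: Breaking the Double Homogenization Dilemma in Multi-turn Search Policy Optimization | Proposes TSPO to resolve the double homogenization dilemma in multi-turn search policy optimization | reinforcement learning, large language model
37 | Test-Time Mixture of World Models for Embodied Agents in Dynamic Environments | Proposes the Test-Time Mixture of World Models (TMoW) framework, improving the adaptability of embodied agents in dynamic environments | world model
38 | Learn More with Less: Uncertainty Consistency Guided Query Selection for RLVR | Proposes an uncertainty-consistency-guided query selection method that lowers the annotation cost of RLVR on mathematical reasoning tasks | reinforcement learning, large language model
39 | RulePlanner: All-in-One Reinforcement Learner for Unifying Design Rules in 3D Floorplanning | RulePlanner: an all-in-one reinforcement learner for unifying design rules in 3D floorplanning | reinforcement learning, deep reinforcement learning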

🔬 Pillar 4: Generative Motion (1 paper)

# | Title | One-line summary | Tags | 🔗
40 | WiFiPenTester: Advancing Wireless Ethical Hacking with Governed GenAI | WiFiPenTester: proposes a governed, GenAI-driven wireless penetration testing system | penetration, large language model

🔬 Pillar 1: Robot Control (1 paper)

# | Title | One-line summary | Tags | 🔗
41 | FraudShield: Knowledge Graph Empowered Defense for LLMs against Fraud Attacks | FraudShield: uses knowledge graphs to strengthen LLM defenses against fraud attacks | manipulation, large language model
