cs.AI (2026-01-30)

📊 41 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (29 🔗2) | Pillar 2: RL & Architecture (10 🔗3) | Pillar 4: Generative Motion (1) | Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (29 papers)

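#	Title	One-line summary	Tags	🔗
1	BEAR: Towards Beam-Search-Aware Optimization for Recommendation with Large Language Models	BEAR: proposes a beam-search-aware optimization method for LLM-based recommendation	large language model
2	Evaluating Large Language Models for Security Bug Report Prediction	Evaluates large language models for security bug report prediction	large language model
3	RAudit: A Blind Auditing Protocol for Large Language Model Reasoning	Proposes RAudit to address the blind auditing problem in large language model reasoning	large language model
4	Chain-of-thought obfuscation learned from output supervision can generalise to unseen tasks	Chain-of-thought obfuscation learned from output supervision can generalize to unseen tasks	chain-of-thought
5	Eroding the Truth-Default: A Causal Analysis of Human Susceptibility to Foundation Model Hallucinations and Disinformation in the Wild	Proposes a dual-axis framework, JudgeGPT and RogueGPT, to analyze human susceptibility to foundation model hallucinations and disinformation	foundation model
6	Hide and Seek in Embedding Space: Geometry-based Steganography and Detection in Large Language Models	Proposes steganography and detection methods based on embedding-space geometry to improve the security of covert communication in large language models	large language model
7	Statistical Estimation of Adversarial Risk in Large Language Models under Best-of-N Sampling	Proposes SABER, which predicts the adversarial risk of large language models under Best-of-N sampling from a small number of samples	large language model
8	EntroCut: Entropy-Guided Adaptive Truncation for Efficient Chain-of-Thought Reasoning in Small-scale Large Reasoning Models	Proposes EntroCut, which improves the CoT reasoning efficiency of small-scale LRMs via entropy-guided adaptive truncation	chain-of-thought
9	Make Anything Match Your Target: Universal Adversarial Perturbations against Closed-Source MLLMs via Multi-Crop Routed Meta Optimization	Proposes MCRMO-Attack, which raises the success rate of universal targeted transferable adversarial attacks on closed-source multimodal large language models	large language model multimodal
10	Quantifying Model Uniqueness in Heterogeneous AI Ecosystems	Proposes a statistical framework for auditing model uniqueness in heterogeneous AI ecosystems	large language model foundation model
11	Alignment among Language, Vision and Action Representations	Reveals alignment among language, vision, and action representations, facilitating cross-modal knowledge transfer	embodied AI large language model
12	Darwinian Memory: A Training-Free Self-Regulating Memory System for GUI Agent Evolution	Proposes the Darwinian Memory system to address insufficient context for GUI agents in long-horizon tasks	large language model multimodal
13	On the Impact of Code Comments for Automated Bug-Fixing: An Empirical Study	Examines the impact of code comments on automated bug fixing	large language model
14	UCPO: Uncertainty-Aware Policy Optimization	Proposes the UCPO framework to address bias in uncertainty-based reinforcement learning policy optimization for LLMs	large language model
15	High-quality generation of dynamic game content via small language models: A proof of concept	Proposes a small-language-model-based method for high-quality dynamic game content generation, addressing narrative coherence and high operating costs	large language model
16	OrLog: Resolving Complex Queries with LLMs and Probabilistic Reasoning	OrLog: combines LLMs with probabilistic reasoning to resolve complex queries and improve retrieval precision	large language model
17	From Abstract to Contextual: What LLMs Still Cannot Do in Mathematics	The ContextMATH benchmark reveals LLMs' weakness in problem formulation for contextual mathematical reasoning	large language model
18	Protecting Private Code in IDE Autocomplete using Differential Privacy	Uses differential privacy to protect private code in IDE code autocomplete	large language model
19	Game-Theoretic Co-Evolution for LLM-Based Heuristic Discovery	Proposes the ASRO framework to address overfitting in LLM-based heuristic discovery	large language model
20	MEnvAgent: Scalable Polyglot Environment Construction for Verifiable Software Engineering	MEnvAgent: a scalable polyglot environment construction framework for verifiable software engineering	large language model	✅
21	Conditional Performance Guarantee for Large Reasoning Models	Proposes the G-PAC reasoning framework, which provides group-conditional performance guarantees for large reasoning models and improves efficiency	chain-of-thought
22	How Far Can Pretrained LLMs Go in Symbolic Music? Controlled Comparisons of Supervised and Preference-based Adaptation	Compares supervised and preference-based learning to assess the potential of pretrained LLMs in symbolic music	large language model
23	Qualitative Evaluation of LLM-Designed GUI	Evaluates LLM-designed GUIs: an analysis of usability, customizability, and fit to user needs	large language model
24	AutoRefine: From Trajectories to Reusable Expertise for Continual LLM Agent Refinement	AutoRefine: distills reusable experience from trajectories to continually refine LLM agents	large language model
25	Task-Aware LLM Council with Adaptive Decision Pathways for Decision Support	Proposes the Task-Aware LLM Council (TALC) for adaptive decision support	large language model
26	MCP-Diag: A Deterministic, Protocol-Driven Architecture for AI-Native Network Diagnostics	MCP-Diag: a deterministic, protocol-driven architecture for AI-native network diagnostics	large language model
27	SYMPHONY: Synergistic Multi-agent Planning with Heterogeneous Language Model Assembly	SYMPHONY: a multi-agent planning framework built on synergistic heterogeneous language models that improves complex task solving	large language model
28	PerfGuard: A Performance-Aware Agent for Visual Content Generation	PerfGuard: a performance-aware agent framework for visual content generation	large language model	✅
29	Decoding in Geometry: Alleviating Embedding-Space Crowding for Complex Reasoning	Proposes CraEG, which alleviates embedding-space crowding in LLM reasoning via geometry-guided reweighting	large language model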

🔬 Pillar 2: RL & Architecture (10 papers)

#	Title	One-line summary	Tags	🔗
30	THINKSAFE: Self-Generated Safety Alignment for Reasoning Models	ThinkSafe: improves the safety of reasoning models through self-generated safety alignment while preserving reasoning ability	reinforcement learning distillation chain-of-thought	✅
31	CVeDRL: An Efficient Code Verifier via Difficulty-aware Reinforcement Learning	Proposes CVeDRL, an efficient code verifier based on difficulty-aware reinforcement learning	reinforcement learning reward shaping	✅
32	Real-Time Aligned Reward Model beyond Semantics	Proposes R2M, a real-time aligned reward model that uses policy feedback to mitigate reward over-optimization	reinforcement learning RLHF large language model
33	A Step Back: Prefix Importance Ratio Stabilizes Policy Optimization	Proposes the minimum prefix ratio (MinPRO) to stabilize policy optimization	reinforcement learning large language model
34	Guided by Trajectories: Repairing and Rewarding Tool-Use Trajectories for Tool-Integrated Reasoning	AutoTraj: improves tool-integrated reasoning by repairing and rewarding tool-use trajectories	reinforcement learning large language model
35	MulFeRL: Enhancing Reinforcement Learning with Verbal Feedback in a Multi-turn Loop	MulFeRL: enhances reinforcement learning with verbal feedback in a multi-turn loop	reinforcement learning
36	TSPO: Breaking the Double Homogenization Dilemma in Multi-turn Search Policy Optimization	Proposes TSPO to resolve the double homogenization dilemma in multi-turn search policy optimization	reinforcement learning large language model
37	Test-Time Mixture of World Models for Embodied Agents in Dynamic Environments	Proposes the Test-time Mixture of World Models (TMoW) framework to improve the adaptability of embodied agents in dynamic environments	world model
38	Learn More with Less: Uncertainty Consistency Guided Query Selection for RLVR	Proposes an uncertainty-consistency-guided query selection method that reduces the annotation cost of RLVR on mathematical reasoning tasks	reinforcement learning large language model
39	RulePlanner: All-in-One Reinforcement Learner for Unifying Design Rules in 3D Floorplanning	RulePlanner: an all-in-one reinforcement learner for unifying design rules in 3D floorplanning	reinforcement learning deep reinforcement learning	✅

🔬 Pillar 4: Generative Motion (1 paper)

#	Title	One-line summary	Tags	🔗
40	WiFiPenTester: Advancing Wireless Ethical Hacking with Governed GenAI	WiFiPenTester: proposes a GenAI-driven, governed wireless penetration testing system	penetration large language model

🔬 Pillar 1: Robot Control (1 paper)

#	Title	One-line summary	Tags	🔗
41	FraudShield: Knowledge Graph Empowered Defense for LLMs against Fraud Attacks	FraudShield: uses knowledge graphs to strengthen LLM defenses against fraud attacks	manipulation large language model
