cs.AI (2026-01-13)

📊 31 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (19 🔗3) | Pillar 2: RL Algorithms & Architecture (10 🔗1) | Pillar 3: Spatial Perception & Semantics (1) | Pillar 6: Video Extraction & Matching (1)

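🔬 Pillar 9: Embodied Foundation Models (19 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
1	VeriTaS: The First Dynamic Benchmark for Multimodal Automated Fact-Checking	Proposes VeriTaS, the first dynamic benchmark for multimodal automated fact-checking, designed to address data leakage caused by LLM pretraining.	foundation model multimodal
2	Resisting Manipulative Bots in Memecoin Copy Trading: A Multi-Agent Approach with Chain-of-Thought Reasoning	Proposes a memecoin copy-trading method built on a chain-of-thought multi-agent system that resists manipulative bots.	large language model chain-of-thought
3	What If TSF: A Benchmark for Reframing Forecasting as Scenario-Guided Multimodal Forecasting	Proposes the What If TSF benchmark for evaluating scenario-guided multimodal time series forecasting models.	large language model multimodal	✅
4	Enriching Semantic Profiles into Knowledge Graph for Recommender Systems Using Large Language Models	Proposes the SPiKE model, which uses large language models to enrich semantic representations in knowledge-graph recommender systems.	large language model
5	Uncovering Political Bias in Large Language Models using Parliamentary Voting Records	Proposes the PoliBias benchmark, which uses parliamentary voting records to reveal political bias in large language models.	large language model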
6	An Under-Explored Application for Explainable Multimodal Misogyny Detection in code-mixed Hindi-English	Proposes an explainable multimodal hate-speech detection web application for code-mixed Hindi-English contexts.	multimodal
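7	MPCI-Bench: A Benchmark for Multimodal Pairwise Contextual Integrity Evaluation of Language Model Agents	MPCI-Bench: a benchmark for evaluating the multimodal contextual integrity of language model agents.	multimodal
8	DNF: Dual-Layer Nested Fingerprinting for Large Language Model Intellectual Property Protection	Proposes a dual-layer nested fingerprinting technique for protecting the intellectual property of large language models.	large language model
9	ViDoRe V3: A Comprehensive Evaluation of Retrieval Augmented Generation in Complex Real-World Scenarios	ViDoRe V3: proposes a comprehensive multimodal RAG benchmark for evaluating retrieval-augmented generation in complex real-world scenarios.	multimodal visual grounding
10	MEMEWEAVER: Inter-Meme Graph Reasoning for Sexism and Misogyny Detection	MemeWeaver: proposes an inter-meme graph reasoning framework for detecting sexism and misogyny.	multimodal
11	MirrorBench: An Extensible Framework to Evaluate User-Proxy Agents for Human-Likeness	Proposes the MirrorBench framework for evaluating how well user-proxy agents generate human-like dialogue.	large language model	✅
12	Why AI Alignment Failure Is Structural: Learned Human Interaction Structures and AGI as an Endogenous Evolutionary Shock	The structural roots of AI alignment failure: learned human interaction structures and AGI as an endogenous evolutionary shock.	large language model
13	Prism: Towards Lowering User Cognitive Load in LLMs via Complex Intent Understanding	Prism: lowers user cognitive load in LLM interactions through complex intent understanding.	large language model
14	Learner-Tailored Program Repair: A Solution Generator with Iterative Edit-Driven Retrieval Enhancement	Proposes a learner-tailored program repair method for fixing code errors made by programming learners.	large language model
15	SUMMPILOT: Bridging Efficiency and Customization for Interactive Summarization System	SUMMPILOT: an interactive summarization system that balances efficiency with user customization.	large language model
16	M3-BENCH: Process-Aware Evaluation of LLM Agents Social Behaviors in Mixed-Motive Games	Proposes M3-Bench for evaluating the social behaviors of LLM agents in mixed-motive games.	large language model
17	Regulatory gray areas of LLM Terms	Analyzes regulatory gray areas in LLM terms of service and reveals the uncertainty they create for research use.	large language model
18	Improving LLM Reasoning with Homophily-aware Structural and Semantic Text-Attributed Graph Compression	Proposes the HS2C framework, which uses homophily to compress text-attributed graphs and improve LLM reasoning performance.	large language model
19	The Agent's First Day: Benchmarking Learning, Exploration, and Scheduling in the Workplace Scenarios	Proposes EvoEnv, a dynamic evaluation environment targeting learning, exploration, and scheduling by multimodal large models in workplace scenarios.	large language model	✅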

🔬 Pillar 2: RL Algorithms & Architecture (10 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
20	Large Artificial Intelligence Model Guided Deep Reinforcement Learning for Resource Allocation in Non Terrestrial Networks	Proposes a large-model-guided deep reinforcement learning method for resource allocation in non-terrestrial networks.	reinforcement learning deep reinforcement learning DRL
21	Hybrid Distillation with CoT Guidance for Edge-Drone Control Code Generation	Proposes an edge-drone control code generation method based on hybrid distillation with CoT guidance.	distillation large language model chain-of-thought
22	YaPO: Learnable Sparse Activation Steering Vectors for Domain Adaptation	YaPO: achieves domain adaptation through learnable sparse activation steering vectors.	DPO direct preference optimization large language model	✅
23	Owen-Shapley Policy Optimization (OSPO): A Principled RL Algorithm for Generative Search LLMs	Proposes Owen-Shapley Policy Optimization (OSPO) to address sparse rewards and credit assignment in generative search LLMs.	reinforcement learning reward shaping large language model
24	The End of Reward Engineering: How LLMs Are Redefining Multi-Agent Coordination	Uses LLMs to redefine multi-agent coordination and end the reward engineering problem.	reinforcement learning Eureka large language model
25	TerraFormer: Automated Infrastructure-as-Code with LLMs Fine-Tuned via Policy-Guided Verifier Feedback	TerraFormer: fine-tunes LLMs with policy-guided verifier feedback to automate Infrastructure-as-Code generation.	reinforcement learning large language model
26	From Classical to Quantum Reinforcement Learning and Its Applications in Quantum Control: A Beginner's Tutorial	A beginner's tutorial on reinforcement learning, from classical to quantum, with applications in quantum control.	reinforcement learning
27	Creativity in AI as Emergence from Domain-Limited Generative Models	Proposes a framework in which AI creativity emerges from domain-limited generative models, focusing on the generative mechanism rather than post-hoc evaluation.	world model multimodal
28	Learning from Demonstrations via Capability-Aware Goal Sampling	Proposes capability-aware goal sampling (Cago) to improve imitation learning on long-horizon, sparse-reward tasks.	imitation learning reward shaping
29	Sparsity Is Necessary: Polynomial-Time Stability for Agentic LLMs in Large Action Spaces	For agentic LLMs in large action spaces, proposes SAC, a sparse policy learning framework that guarantees polynomial-time stability.	policy learning SAC

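🔬 Pillar 3: Spatial Perception & Semantics (1 paper)

#	Title	One-sentence summary	Tags	🔗	⭐
30	Auditing Student-AI Collaboration: A Case Study of Online Graduate CS Students	Audits student-AI collaboration patterns and preferences on academic tasks through a survey of online CS graduate students.	affordance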

🔬 Pillar 6: Video Extraction & Matching (1 paper)

#	Title	One-sentence summary	Tags	🔗	⭐
31	OpenMic: A Multi-Agent-Based Stand-Up Comedy Generation System	OpenMic: a multi-agent-based Chinese stand-up comedy generation system.	HuMoR
