cs.CV (2026-01-02)

📊 8 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (4) · Pillar 4: Generative Motion (1 🔗1) · Pillar 3: Perception & Semantics (1 🔗1) · Pillar 2: RL & Architecture (1) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (4 papers)

#	Title	Key Point	Tags	🔗	⭐
1	Grading Handwritten Engineering Exams with Multimodal Large Language Models	Proposes using multimodal large language models to grade handwritten engineering exams	large language model, multimodal
2	Investigating the Viability of Employing Multi-modal Large Language Models in the Context of Audio Deepfake Detection	Explores the viability of multimodal large language models for audio deepfake detection	large language model, multimodal
3	Modality Dominance-Aware Optimization for Embodied RGB-Infrared Perception	Proposes a modality dominance-aware optimization framework to address modality asymmetry in embodied RGB-IR perception	multimodal
4	AEGIS: Exploring the Limit of World Knowledge Capabilities for Unified Multimodal Models	AEGIS: explores the limits of world knowledge capabilities in unified multimodal models	multimodal

🔬 Pillar 4: Generative Motion (1 paper)

#	Title	Key Point	Tags	🔗	⭐
5	SafeMo: Linguistically Grounded Unlearning for Trustworthy Text-to-Motion Generation	Proposes SafeMo to address safety issues in text-to-motion generation	text-to-motion, motion generation, VQ-VAE	✅

🔬 Pillar 3: Perception & Semantics (1 paper)

#	Title	Key Point	Tags	🔗	⭐
6	AdaGaR: Adaptive Gabor Representation for Dynamic Scene Reconstruction	AdaGaR: an adaptive Gabor representation for dynamic scene reconstruction that improves detail capture and temporal continuity	depth estimation, scene reconstruction	✅

🔬 Pillar 2: RL & Architecture (1 paper)

#	Title	Key Point	Tags	🔗	⭐
7	HyperPriv-EPN: Hypergraph Learning with Privileged Knowledge for Ependymoma Prognosis	HyperPriv-EPN: hypergraph learning with privileged knowledge for ependymoma prognosis	distillation, privileged information, multimodal

🔬 Pillar 1: Robot Control (1 paper)

#	Title	Key Point	Tags	🔗	⭐
8	DynaDrag: Dynamic Drag-Style Image Editing by Motion Prediction	DynaDrag: a dynamic drag-style image editing method based on motion prediction	manipulation
