cs.CV (2026-01-02)

📊 8 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (4) · Pillar 4: Generative Motion (1 🔗1) · Pillar 3: Perception & Semantics (1 🔗1) · Pillar 2: RL & Architecture (1) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (4 papers)

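| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Grading Handwritten Engineering Exams with Multimodal Large Language Models | Proposes using multimodal large language models to grade handwritten engineering exams. | large language model, multimodal | |
| 2 | Investigating the Viability of Employing Multi-modal Large Language Models in the Context of Audio Deepfake Detection | Explores the viability of multimodal large language models for audio deepfake detection. | large language model, multimodal | |
| 3 | Modality Dominance-Aware Optimization for Embodied RGB-Infrared Perception | Proposes a modality dominance-aware optimization framework to address modality asymmetry in embodied RGB-IR perception. | multimodal | |
| 4 | AEGIS: Exploring the Limit of World Knowledge Capabilities for Unified Multimodal Models | Explores the limits of world knowledge capabilities in unified multimodal models. | multimodal | |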

🔬 Pillar 4: Generative Motion (1 paper)

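| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 5 | SafeMo: Linguistically Grounded Unlearning for Trustworthy Text-to-Motion Generation | Proposes SafeMo to address safety issues in text-to-motion generation. | text-to-motion, motion generation, VQ-VAE | |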

🔬 Pillar 3: Perception & Semantics (1 paper)

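| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 6 | AdaGaR: Adaptive Gabor Representation for Dynamic Scene Reconstruction | An adaptive Gabor representation for dynamic scene reconstruction that improves detail capture and temporal continuity. | depth estimation, scene reconstruction | |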

🔬 Pillar 2: RL & Architecture (1 paper)

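| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 7 | HyperPriv-EPN: Hypergraph Learning with Privileged Knowledge for Ependymoma Prognosis | Hypergraph learning with privileged knowledge for ependymoma prognosis. | distillation, privileged information, multimodal | |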

🔬 Pillar 1: Robot Control (1 paper)

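| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 8 | DynaDrag: Dynamic Drag-Style Image Editing by Motion Prediction | A dynamic drag-style image editing method based on motion prediction. | manipulation | |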
