cs.CV (2025-01-04)

📊 8 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (4 🔗1) · Pillar 3: Perception & Semantics (2) · Pillar 2: RL & Architecture (2)

🔬 Pillar 9: Embodied Foundation Models (4 papers)

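| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 1 | What Kind of Visual Tokens Do We Need? Training-free Visual Token Pruning for Multi-modal Large Language Models from the Perspective of Graph | Proposes G-Prune to address visual token redundancy in multimodal large language models. | large language model, multimodal | |
| 2 | Generating Multimodal Images with GAN: Integrating Text, Image, and Style | Proposes a GAN-based multimodal image generation method that integrates text, image, and style information. | multimodal | |
| 3 | A Survey of State of the Art Large Vision Language Models: Alignment, Benchmark, Evaluations and Challenges | Surveys alignment, benchmarks, evaluation, and open challenges for large vision-language models (VLMs). | multimodal | |
| 4 | Benchmarking Large and Small MLLMs | Systematically benchmarks large and small multimodal large language models, revealing their capability boundaries and application potential. | multimodal | |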

🔬 Pillar 3: Perception & Semantics (2 papers)

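| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 5 | Joint Optimization for 4D Human-Scene Reconstruction in the Wild | Proposes JOSH for joint 4D human-scene reconstruction from in-the-wild monocular videos. | scene reconstruction, human-scene interaction, human mesh recovery | |
| 6 | From Images to Detection: Machine Learning for Blood Pattern Classification | Proposes a machine-learning method for bloodstain pattern classification that distinguishes gunshot bloodstains from impact bloodstains, improving the efficiency of crime scene reconstruction. | scene reconstruction | |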

🔬 Pillar 2: RL & Architecture (2 papers)

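| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 7 | Hyperbolic Contrastive Learning for Hierarchical 3D Point Cloud Embedding | Proposes a hierarchical 3D point cloud embedding method based on hyperbolic contrastive learning, improving downstream task performance. | contrastive learning | |
| 8 | Distillation-Enhanced Physical Adversarial Attacks | Proposes a knowledge-distillation-based physical adversarial attack method that improves stealthiness and attack performance. | distillation | |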
