cs.CV (2025-01-04)

📊 8 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (4 🔗1) · Pillar 3: Perception & Semantics (2) · Pillar 2: RL & Architecture (2)

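🔬 Pillar 9: Embodied Foundation Models (4 papers)

#	Title	One-line summary	Tags	🔗	⭐
1	What Kind of Visual Tokens Do We Need? Training-free Visual Token Pruning for Multi-modal Large Language Models from the Perspective of Graph	Proposes G-Prune to address visual token redundancy in multimodal large language models.	large language model, multimodal
2	Generating Multimodal Images with GAN: Integrating Text, Image, and Style	Proposes a GAN-based multimodal image generation method that integrates text, image, and style information.	multimodal
3	A Survey of State of the Art Large Vision Language Models: Alignment, Benchmark, Evaluations and Challenges	Provides a comprehensive survey of alignment, benchmarks, evaluation, and challenges for large vision-language models (VLMs).	multimodal	✅
4	Benchmarking Large and Small MLLMs	Systematically benchmarks large and small multimodal large language models, revealing their capability boundaries and application potential.	multimodal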

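🔬 Pillar 3: Perception & Semantics (2 papers)

#	Title	One-line summary	Tags	🔗	⭐
5	Joint Optimization for 4D Human-Scene Reconstruction in the Wild	Proposes JOSH for joint 4D human-scene reconstruction from monocular in-the-wild videos.	scene reconstruction, human-scene interaction, human mesh recovery
6	From Images to Detection: Machine Learning for Blood Pattern Classification	Proposes a machine-learning method for bloodstain pattern classification that distinguishes gunshot from impact bloodstains, improving the efficiency of crime scene reconstruction.	scene reconstruction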

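🔬 Pillar 2: RL & Architecture (2 papers)

#	Title	One-line summary	Tags	🔗	⭐
7	Hyperbolic Contrastive Learning for Hierarchical 3D Point Cloud Embedding	Proposes a hierarchical 3D point cloud embedding method based on hyperbolic contrastive learning, improving downstream task performance.	contrastive learning
8	Distillation-Enhanced Physical Adversarial Attacks	Proposes a knowledge-distillation-based physical adversarial attack method that improves stealthiness and attack performance.	distillation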
