cs.CV (2024-10-26)

📊 5 papers in total

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (3) · Pillar 2: RL Algorithms & Architecture (1) · Pillar 3: Spatial Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (3 papers)

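#	Title	One-line summary	Tags
1	You Never Know: Quantization Induces Inconsistent Biases in Vision-Language Foundation Models	Quantization induces inconsistent social-fairness biases in vision-language models: a comparative study.	foundation model
2	GiVE: Guiding Visual Encoder to Perceive Overlooked Information	Proposes GiVE to address the problem of visual encoders overlooking information.	large language model, multimodal
3	Adaptive Video Understanding Agent: Enhancing efficiency with dynamic frame sampling and feedback-driven reasoning	Proposes an adaptive video understanding agent that improves efficiency through dynamic frame sampling and feedback-driven reasoning.	large language model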

🔬 Pillar 2: RL Algorithms & Architecture (1 paper)

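#	Title	One-line summary	Tags
4	Human-Object Interaction Detection Collaborated with Large Relation-driven Diffusion Models	Proposes DIFFUSIONHOI, which uses relation-driven diffusion models to improve human-object interaction detection performance.	contrastive learning, human-object interaction, HOI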

🔬 Pillar 3: Spatial Perception & Semantics (1 paper)

#	Title	One-line summary	Tags
5	SCube: Instant Large-Scale Scene Reconstruction using VoxSplats	SCube: fast reconstruction of large-scale scenes using VoxSplats.	scene reconstruction
