cs.CV (2024-11-03)

📊 8 papers in total | 🔗 4 with code

🎯 Browse by interest area

Pillar 2: RL & Architecture (3 🔗2) · Pillar 9: Embodied Foundation Models (2) · Pillar 3: Perception & Semantics (2 🔗1) · Pillar 8: Physics-based Animation (1 🔗1)

🔬 Pillar 2: RL & Architecture (3 papers)

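#	Title	Key point	Tags	🔗	⭐
1	Optical Flow Representation Alignment Mamba Diffusion Model for Medical Video Generation	MedSora: a Mamba diffusion model with optical flow representation alignment for high-quality medical video generation.	Mamba optical flow	✅
2	DreamPolish: Domain Score Distillation With Progressive Geometry Generation	DreamPolish: a text-to-3D generation method that combines domain score distillation with progressive geometry generation.	distillation classifier-free guidance
3	VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization	VQ-Map: uses vector quantization to estimate bird's-eye-view map layouts in a discrete token space, setting several new records.	representation learning semantic map VQ-VAE	✅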

🔬 Pillar 9: Embodied Foundation Models (2 papers)

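#	Title	Key point	Tags	🔗	⭐
4	EEE-Bench: A Comprehensive Multimodal Electrical And Electronics Engineering Benchmark	Proposes EEE-Bench to evaluate LMMs on electrical and electronics engineering problems, revealing their limitations in processing complex visual information.	large language model foundation model multimodal
5	Finding NeMo: Negative-mined Mosaic Augmentation for Referring Image Segmentation	Proposes NeMo, a negative-mined Mosaic data augmentation that improves referring image segmentation in complex scenes.	multimodal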

🔬 Pillar 3: Perception & Semantics (2 papers)

#	Title	Key point	Tags	🔗	⭐
6	Exploring PCA-based feature representations of image pixels via CNN to enhance food image segmentation	Proposes a PCA-based CNN feature representation to improve food image segmentation.	open-vocabulary open vocabulary
7	Object segmentation from common fate: Motion energy processing enables human-like zero-shot generalization to random dot stimuli	Uses motion energy processing to achieve human-like zero-shot object segmentation on random dot stimuli.	optical flow	✅

🔬 Pillar 8: Physics-based Animation (1 paper)

#	Title	Key point	Tags	🔗	⭐
8	FactorizePhys: Matrix Factorization for Multidimensional Attention in Remote Physiological Sensing	Proposes FactorizePhys, which uses matrix factorization to implement multidimensional attention for rPPG, improving signal extraction.	PULSE	✅
