cs.CV (2025-05-10)

📊 11 papers total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 3: Spatial Perception & Semantics (4) · Pillar 9: Embodied Foundation Models (3 🔗2) · Pillar 6: Video Extraction & Matching (1) · Pillar 2: RL Algorithms & Architecture (1) · Pillar 4: Generative Motion (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 3: Spatial Perception & Semantics (4 papers)

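#	Title	One-sentence summary	Tags	🔗	⭐
1	METOR: A Unified Framework for Mutual Enhancement of Objects and Relationships in Open-vocabulary Video Visual Relationship Detection	Proposes the METOR framework for mutual enhancement of objects and relationships in open-vocabulary video visual relationship detection.	open-vocabulary open vocabulary
2	Causal Prompt Calibration Guided Segment Anything Model for Open-Vocabulary Multi-Entity Segmentation	Proposes CPC-SAM, a model guided by causal prompt calibration, to address SAM's generalization problem in open-vocabulary multi-entity segmentation.	open-vocabulary open vocabulary
3	Edge-Enabled VIO with Long-Tracked Features for High-Accuracy Low-Altitude IoT Navigation	Proposes an edge-enabled VIO based on long-tracked features, improving the accuracy and real-time performance of low-altitude IoT navigation.	VIO
4	ElectricSight: 3D Hazard Monitoring for Power Lines Using Low-Cost Sensors	ElectricSight: 3D hazard monitoring for power transmission lines using low-cost sensors.	depth estimation monocular depth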

🔬 Pillar 9: Embodied Foundation Models (3 papers)

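#	Title	One-sentence summary	Tags	🔗	⭐
5	Batch Augmentation with Unimodal Fine-tuning for Multimodal Learning	Proposes a batch augmentation method based on unimodal fine-tuning for multimodal learning, improving fetal organ detection in ultrasound images.	large language model multimodal
6	TACFN: Transformer-based Adaptive Cross-modal Fusion Network for Multimodal Emotion Recognition	Proposes TACFN, which uses a Transformer for adaptive cross-modal fusion in multimodal emotion recognition.	multimodal	✅
7	Improving Generalization of Medical Image Registration Foundation Model	Incorporates SAM to improve the generalization and robustness of a medical image registration foundation model.	foundation model	✅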

🔬 Pillar 6: Video Extraction & Matching (1 paper)

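#	Title	One-sentence summary	Tags	🔗	⭐
8	GRACE: Estimating Geometry-level 3D Human-Scene Contact from 2D Images	Proposes GRACE, which uses geometric reasoning to estimate the 3D contact regions of human-scene interaction from 2D images.	SMPL embodied AI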

🔬 Pillar 2: RL Algorithms & Architecture (1 paper)

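#	Title	One-sentence summary	Tags	🔗	⭐
9	Dataset Distillation with Probabilistic Latent Features	Proposes a dataset distillation method based on probabilistic latent features, improving cross-architecture generalization.	distillation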

🔬 Pillar 4: Generative Motion (1 paper)

#	Title	One-sentence summary	Tags	🔗	⭐
10	HDGlyph: A Hierarchical Disentangled Glyph-Based Framework for Long-Tail Text Rendering in Diffusion Models	HDGlyph: a hierarchical disentangled glyph-based framework for long-tail text rendering in diffusion models.	classifier-free guidance

🔬 Pillar 8: Physics-based Animation (1 paper)

#	Title	One-sentence summary	Tags	🔗	⭐
11	ProFashion: Prototype-guided Fashion Video Generation with Multiple Reference Images	ProFashion: a prototype-guided fashion video generation framework using multiple reference images.	spatiotemporal
