cs.CV (2025-02-28)

📊 8 papers in total | 🔗 2 with code

🎯 Interest area navigation

Pillar 2: RL & Architecture (3) · Pillar 3: Perception & Semantics (2) · Pillar 1: Robot Control (1 🔗1) · Pillar 6: Video Extraction (1) · Pillar 9: Embodied Foundation Models (1 🔗1)

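🔬 Pillar 2: RL & Architecture (3 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
1	WorldModelBench: Judging Video Generation Models As World Models	Proposes WorldModelBench, a benchmark for evaluating video generation models as world models, with a particular focus on adherence to physical laws.	world model, instruction following
2	STPro: Spatial and Temporal Progressive Learning for Weakly Supervised Spatio-Temporal Grounding	STPro: a spatial and temporal progressive learning framework for weakly supervised spatio-temporal video grounding.	curriculum learning, foundation model
3	Dataset Distillation with Neural Characteristic Function: A Minmax Perspective	Proposes a neural characteristic function to address the distribution matching problem in dataset distillation.	distillation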

🔬 Pillar 3: Perception & Semantics (2 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
4	EndoPBR: Material and Lighting Estimation for Photorealistic Surgical Simulations via Physically-based Rendering	EndoPBR: material and lighting estimation for photorealistic surgical simulation via physically-based rendering.	depth estimation, 3D gaussian splatting, gaussian splatting
5	RTGen: Real-Time Generative Detection Transformer	Proposes RTGen, a real-time generative detection Transformer that addresses the speed bottleneck in open-vocabulary object detection.	open-vocabulary, open vocabulary

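🔬 Pillar 1: Robot Control (1 paper)

#	Title	One-sentence summary	Tags	🔗	⭐
6	Towards General Visual-Linguistic Face Forgery Detection(V2)	Proposes FFTG, which improves the accuracy of visual-linguistic face forgery detection through forgery masks and prompting strategies.	manipulation, large language model, multimodal	✅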

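🔬 Pillar 6: Video Extraction (1 paper)

#	Title	One-sentence summary	Tags	🔗	⭐
7	EDM: Equirectangular Projection-Oriented Dense Kernelized Feature Matching	Proposes EDM, a dense kernelized feature matching algorithm for panoramic images that significantly improves matching accuracy.	feature matching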

🔬 Pillar 9: Embodied Foundation Models (1 paper)

#	Title	One-sentence summary	Tags	🔗	⭐
8	SciceVPR: Stable Cross-Image Correlation Enhanced Model for Visual Place Recognition	SciceVPR: a stable cross-image correlation enhanced model for visual place recognition.	foundation model	✅
