cs.CV (2024-05-10)

📊 20 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 3: Perception & Semantics (7 🔗1) · Pillar 9: Embodied Foundation Models (5 🔗1) · Pillar 2: RL & Architecture (4 🔗1) · Pillar 4: Generative Motion (1) · Pillar 1: Robot Control (1) · Pillar 6: Video Extraction (1) · Pillar 7: Motion Retargeting (1)

🔬 Pillar 3: Perception & Semantics (7 papers)

#	Title	One-Sentence Summary	Tags	🔗	⭐
1	I3DGS: Improve 3D Gaussian Splatting from Multiple Dimensions	I3DGS: improves the training efficiency and performance of 3D Gaussian Splatting along multiple dimensions.	3D gaussian splatting gaussian splatting splatting
2	SAM3D: Zero-Shot Semi-Automatic Segmentation in 3D Medical Images with the Segment Anything Model	SAM3D: a SAM-based zero-shot semi-automatic segmentation method for 3D medical images.	sam 3D SAM 3D
3	Light-SLAM: A Robust Deep-Learning Visual SLAM System Based on LightGlue under Challenging Lighting Conditions	Proposes a robust LightGlue-based visual SLAM system that improves localization accuracy in low-light and varying-illumination environments.	visual SLAM feature matching
4	MGS-SLAM: Monocular Sparse Tracking and Gaussian Mapping with Depth Smooth Regularization	MGS-SLAM: monocular sparse tracking and Gaussian mapping with depth smooth regularization, improving geometric accuracy and tracking.	visual odometry 3D gaussian splatting gaussian splatting
5	Aerial-NeRF: Adaptive Spatial Partitioning and Sampling for Large-Scale Aerial Rendering	Aerial-NeRF: an adaptive spatial partitioning and sampling method for large-scale aerial scenes.	NeRF neural radiance field	✅
6	OneTo3D: One Image to Re-editable Dynamic 3D Model and Video Generation	OneTo3D: proposes a method for generating an editable dynamic 3D model and unlimited-length video from a single image.	gaussian splatting splatting neural radiance field
7	Zero-shot Degree of Ill-posedness Estimation for Active Small Object Change Detection	Proposes zero-shot estimation of the degree of ill-posedness for active small-object change detection, improving robot indoor navigation.	open-vocabulary open vocabulary

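🔬 Pillar 9: Embodied Foundation Models (5 papers)

#	Title	One-Sentence Summary	Tags	🔗	⭐
8	Multimodal LLMs Struggle with Basic Visual Network Analysis: a VNA Benchmark	Proposes the VNA benchmark, revealing the shortcomings of multimodal LLMs on visual network analysis tasks.	multimodal
9	Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End Approach	Proposes an end-to-end weakly supervised semantic segmentation method based on multi-modal foundation models, improving segmentation boundary accuracy.	foundation model
10	DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding	Proposes DARA for parameter-efficient tuning in visual grounding.	visual grounding	✅
11	Mesh Denoising Transformer	Proposes SurfaceFormer, a Transformer-based mesh denoising framework that improves mesh feature preservation and global structure understanding.	multimodal
12	Decoding Emotions in Abstract Art: Cognitive Plausibility of CLIP in Recognizing Color-Emotion Associations	Evaluates the cognitive plausibility of CLIP in recognizing emotions in abstract art, revealing differences between machine and human understanding of emotion.	multimodal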

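🔬 Pillar 2: RL & Architecture (4 papers)

#	Title	One-Sentence Summary	Tags	🔗	⭐
13	Deep video representation learning: a survey	A survey of deep video representation learning that analyzes spatiotemporal feature learning methods and challenges.	representation learning spatiotemporal
14	Attend, Distill, Detect: Attention-aware Entropy Distillation for Anomaly Detection	Proposes an attention-based entropy distillation method to improve multi-class anomaly detection performance.	distillation feature matching
15	MaskMatch: Boosting Semi-Supervised Learning Through Mask Autoencoder-Driven Feature Learning	Proposes MaskMatch to address the under-utilization of data in semi-supervised learning.	representation learning masked autoencoder MAE
16	Novel Class Discovery for Ultra-Fine-Grained Visual Categorization	Proposes the RAPL framework to address novel class discovery in ultra-fine-grained visual categorization.	representation learning contrastive learning	✅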
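🔬 Pillar 4: Generative Motion (1 paper)

#	Title	One-Sentence Summary	Tags	🔗	⭐
17	Shape Conditioned Human Motion Generation with Diffusion Model	Proposes a diffusion-based, shape-conditioned human motion generation method that directly generates mesh motion sequences.	motion diffusion model motion diffusion text-to-motion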

🔬 Pillar 1: Robot Control (1 paper)

#	Title	One-Sentence Summary	Tags	🔗	⭐
18	Residual-NeRF: Learning Residual NeRFs for Transparent Object Manipulation	Proposes Residual-NeRF, improving depth perception and training speed in transparent object manipulation scenes.	manipulation MAE NeRF

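🔬 Pillar 6: Video Extraction (1 paper)

#	Title	One-Sentence Summary	Tags	🔗	⭐
19	Comparative Analysis of Advanced Feature Matching Algorithms in Challenging High Spatial Resolution Optical Satellite Stereo Scenarios	Evaluates and optimizes feature matching algorithms for high-spatial-resolution optical satellite imagery to improve registration accuracy.	feature matching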

🔬 Pillar 7: Motion Retargeting (1 paper)

#	Title	One-Sentence Summary	Tags	🔗	⭐
20	Compression-Realized Deep Structural Network for Video Quality Enhancement	Proposes the CRDS network, which uses compression priors to enhance the quality of compressed video.	motion estimation
