cs.CV (2024-05-10)

📊 20 papers in total | 🔗 3 with code

🎯 Navigation by interest area

Pillar 3: Spatial Perception & Semantics (7 🔗1) · Pillar 9: Embodied Foundation Models (5 🔗1) · Pillar 2: RL Algorithms & Architecture (4 🔗1) · Pillar 4: Generative Motion (1) · Pillar 1: Robot Control (1) · Pillar 6: Video Extraction & Matching (1) · Pillar 7: Motion Retargeting (1)

🔬 Pillar 3: Spatial Perception & Semantics (7 papers)

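| # | Title | One-line summary | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 1 | I3DGS: Improve 3D Gaussian Splatting from Multiple Dimensions | I3DGS: improves the training efficiency and performance of 3D Gaussian Splatting along multiple dimensions. | 3D gaussian splatting, gaussian splatting, splatting | |
| 2 | SAM3D: Zero-Shot Semi-Automatic Segmentation in 3D Medical Images with the Segment Anything Model | SAM3D: a SAM-based zero-shot, semi-automatic segmentation method for 3D medical images. | sam 3D SAM 3D | |
| 3 | Light-SLAM: A Robust Deep-Learning Visual SLAM System Based on LightGlue under Challenging Lighting Conditions | Proposes a robust LightGlue-based visual SLAM system that improves localization accuracy in low-light and varying-illumination environments. | visual SLAM, feature matching | |
| 4 | MGS-SLAM: Monocular Sparse Tracking and Gaussian Mapping with Depth Smooth Regularization | MGS-SLAM: monocular sparse tracking and Gaussian mapping with depth smooth regularization, improving geometric accuracy and tracking. | visual odometry, 3D gaussian splatting, gaussian splatting | |
| 5 | Aerial-NeRF: Adaptive Spatial Partitioning and Sampling for Large-Scale Aerial Rendering | Aerial-NeRF: an adaptive spatial partitioning and sampling method for large-scale aerial scenes. | NeRF, neural radiance field | |
| 6 | OneTo3D: One Image to Re-editable Dynamic 3D Model and Video Generation | OneTo3D: proposes a method for generating an editable dynamic 3D model and unlimited-length video from a single image. | gaussian splatting, splatting, neural radiance field | |
| 7 | Zero-shot Degree of Ill-posedness Estimation for Active Small Object Change Detection | Proposes zero-shot degree-of-ill-posedness estimation for active small-object change detection, improving indoor robot navigation. | open-vocabulary, open vocabulary | |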

🔬 Pillar 9: Embodied Foundation Models (5 papers)

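| # | Title | One-line summary | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 8 | Multimodal LLMs Struggle with Basic Visual Network Analysis: a VNA Benchmark | Proposes the VNA benchmark, revealing the shortcomings of multimodal LLMs on visual network analysis tasks. | multimodal | |
| 9 | Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End Approach | Proposes an end-to-end weakly supervised semantic segmentation method based on multi-modal foundation models, improving segmentation boundary accuracy. | foundation model | |
| 10 | DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding | Proposes DARA to address parameter-efficient tuning for visual grounding. | visual grounding | |
| 11 | Mesh Denoising Transformer | Proposes SurfaceFormer, a Transformer-based mesh denoising framework that improves mesh feature preservation and global structure understanding. | multimodal | |
| 12 | Decoding Emotions in Abstract Art: Cognitive Plausibility of CLIP in Recognizing Color-Emotion Associations | Evaluates the cognitive plausibility of CLIP in recognizing emotions in abstract art, revealing differences between machine and human emotion understanding. | multimodal | |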

🔬 Pillar 2: RL Algorithms & Architecture (4 papers)

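| # | Title | One-line summary | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 13 | Deep video representation learning: a survey | A survey of deep video representation learning: analyzes spatiotemporal feature learning methods and challenges. | representation learning, spatiotemporal | |
| 14 | Attend, Distill, Detect: Attention-aware Entropy Distillation for Anomaly Detection | Proposes an attention-based entropy distillation method to improve multi-class anomaly detection performance. | distillation, feature matching | |
| 15 | MaskMatch: Boosting Semi-Supervised Learning Through Mask Autoencoder-Driven Feature Learning | Proposes MaskMatch to address the under-utilization of data in semi-supervised learning. | representation learning, masked autoencoder, MAE | |
| 16 | Novel Class Discovery for Ultra-Fine-Grained Visual Categorization | Proposes the RAPL framework to address novel class discovery in ultra-fine-grained visual categorization. | representation learning, contrastive learning | |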

🔬 Pillar 4: Generative Motion (1 paper)

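| # | Title | One-line summary | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 17 | Shape Conditioned Human Motion Generation with Diffusion Model | Proposes a diffusion-model-based, shape-conditioned human motion generation method that directly generates mesh motion sequences. | motion diffusion model, motion diffusion, text-to-motion | |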

🔬 Pillar 1: Robot Control (1 paper)

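| # | Title | One-line summary | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 18 | Residual-NeRF: Learning Residual NeRFs for Transparent Object Manipulation | Proposes Residual-NeRF, improving depth perception and training speed in transparent object manipulation scenes. | manipulation, MAE, NeRF | |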

🔬 Pillar 6: Video Extraction & Matching (1 paper)

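| # | Title | One-line summary | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 19 | Comparative Analysis of Advanced Feature Matching Algorithms in Challenging High Spatial Resolution Optical Satellite Stereo Scenarios | Evaluates and optimizes feature matching algorithms for high-spatial-resolution optical satellite imagery to improve registration accuracy. | feature matching | |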

🔬 Pillar 7: Motion Retargeting (1 paper)

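| # | Title | One-line summary | Tags | 🔗 |
| --- | --- | --- | --- | --- |
| 20 | Compression-Realized Deep Structural Network for Video Quality Enhancement | Proposes the CRDS network, which uses compression prior knowledge to enhance compressed video quality. | motion estimation | |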
