cs.CV (2024-09-21)

📊 14 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 3: Spatial Perception & Semantics (4 🔗2) · Pillar 9: Embodied Foundation Models (4 🔗2) · Pillar 2: RL Algorithms & Architecture (3) · Pillar 6: Video Extraction & Matching (2) · Pillar 7: Motion Retargeting (1)

🔬 Pillar 3: Spatial Perception & Semantics (4 papers)

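| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 1 | SplatLoc: 3D Gaussian Splatting-based Visual Localization for Augmented Reality | SplatLoc: a visual localization method for augmented reality based on 3D Gaussian splatting | 3D gaussian splatting, gaussian splatting, splatting |
| 2 | MOSE: Monocular Semantic Reconstruction Using NeRF-Lifted Noisy Priors | MOSE: monocular semantic reconstruction with NeRF lifting, addressing 3D scene understanding from monocular images | NeRF, scene understanding |
| 3 | BurstM: Deep Burst Multi-scale SR using Fourier Space with Optical Flow | Proposes BurstM to address the alignment problem in multi-frame super-resolution | optical flow |
| 4 | Multilateral Cascading Network for Semantic Segmentation of Large-Scale Outdoor Point Clouds | Proposes MCNet, a multilateral cascading network for semantic segmentation of large-scale outdoor point clouds | scene understanding |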

🔬 Pillar 9: Embodied Foundation Models (4 papers)

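| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 5 | Enhancing Advanced Visual Reasoning Ability of Large Language Models | Proposes CVR-LLM to strengthen large language models on complex visual reasoning tasks | large language model, multimodal |
| 6 | Foundation Models for Amodal Video Instance Segmentation in Automated Driving | Proposes S-AModal, which uses a foundation model for amodal video instance segmentation in automated driving | foundation model |
| 7 | Vision-Language Models Assisted Unsupervised Video Anomaly Detection | Proposes VLAVAD, which uses vision-language models to assist unsupervised video anomaly detection and achieves SOTA on the ShanghaiTech dataset | large language model |
| 8 | SURf: Teaching Large Vision-Language Models to Selectively Utilize Retrieved Information | Proposes the SURf framework to improve how selectively large vision-language models use retrieved information | multimodal |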

🔬 Pillar 2: RL Algorithms & Architecture (3 papers)

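| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 9 | CUS3D: CLIP-based Unsupervised 3D Segmentation via Object-level Denoise | CUS3D: proposes an unsupervised 3D semantic segmentation method based on CLIP and object-level denoising | distillation, open-vocabulary, open vocabulary |
| 10 | BrainDreamer: Reasoning-Coherent and Controllable Image Generation from EEG Brain Signals via Language Guidance | BrainDreamer: generates reasoning-coherent, controllable images from EEG signals through language guidance | dreamer, contrastive learning |
| 11 | ECHO: Environmental Sound Classification with Hierarchical Ontology-guided Semi-Supervised Learning | ECHO: environmental sound classification using semi-supervised learning guided by a hierarchical ontology | contrastive learning, large language model |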

🔬 Pillar 6: Video Extraction & Matching (2 papers)

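| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 12 | PoseAugment: Generative Human Pose Data Augmentation with Physical Plausibility for IMU-based Motion Capture | PoseAugment: proposes a generative human pose augmentation method with physical constraints, improving the accuracy of IMU-based motion capture | IMU-based motion, human motion |
| 13 | Egocentric zone-aware action recognition across environments | Proposes a zone-aware action recognition method that improves egocentric action recognition across environments | egocentric, egocentric vision |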

🔬 Pillar 7: Motion Retargeting (1 paper)

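| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 14 | ExFMan: Rendering 3D Dynamic Humans with Hybrid Monocular Blurry Frames and Events | ExFMan: renders dynamic 3D humans from hybrid monocular blurry frames and event camera data | human motion |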
