cs.CV (2024-12-25)

📊 10 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (5 🔗2) · Pillar 3: Spatial Perception & Semantics (4 🔗1) · Pillar 6: Video Extraction and Matching (1)

🔬 Pillar 9: Embodied Foundation Models (5 papers)

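| # | Title | One-Sentence Summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | An Attentive Dual-Encoder Framework Leveraging Multimodal Visual and Semantic Information for Automatic OSAHS Diagnosis | Proposes an attentive dual-encoder framework based on multimodal visual and semantic information for automatic OSAHS diagnosis. | multimodal | |
| 2 | MotionMap: Representing Multimodality in Human Pose Forecasting | Proposes MotionMap, which uses heatmaps to efficiently represent multimodality in human pose forecasting. | multimodal | |
| 3 | ObitoNet: Multimodal High-Resolution Point Cloud Reconstruction | ObitoNet: multimodal high-resolution point cloud reconstruction using a cross-attention mechanism. | multimodal | |
| 4 | ModelGrow: Continual Text-to-Video Pre-training with Model Expansion and Language Understanding Enhancement | Proposes ModelGrow, which achieves continual text-to-video pre-training through model expansion and language understanding enhancement. | large language model | |
| 5 | Embodied Image Quality Assessment for Robotic Intelligence | Proposes the MA-EIQA model and the EPD dataset for assessing the image quality of robot-generated content (RGC). | embodied AI | |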

🔬 Pillar 3: Spatial Perception & Semantics (4 papers)

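| # | Title | One-Sentence Summary | Tags | 🔗 |
|---|---|---|---|---|
| 6 | WeatherGS: 3D Scene Reconstruction in Adverse Weather Conditions via Gaussian Splatting | WeatherGS: 3D scene reconstruction in adverse weather based on Gaussian splatting. | 3D gaussian splatting, 3DGS, gaussian splatting | |
| 7 | Open-Vocabulary Panoptic Segmentation Using BERT Pre-Training of Vision-Language Multiway Transformer Model | Proposes OMTSeg, which uses the pre-trained BEiT-3 model for open-vocabulary panoptic segmentation. | open-vocabulary, open vocabulary, foundation model | |
| 8 | FOR: Finetuning for Object Level Open Vocabulary Image Retrieval | Proposes FOR, which fine-tunes CLIP for object-level open-vocabulary image retrieval. | open-vocabulary, open vocabulary | |
| 9 | ArtNVG: Content-Style Separated Artistic Neighboring-View Gaussian Stylization | ArtNVG: proposes a content-style separated, artistic neighboring-view Gaussian stylization method. | 3D gaussian splatting, 3DGS, gaussian splatting | |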

🔬 Pillar 6: Video Extraction and Matching (1 paper)

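| # | Title | One-Sentence Summary | Tags | 🔗 |
|---|---|---|---|---|
| 10 | Cross-PCR: A Robust Cross-Source Point Cloud Registration Framework | Proposes the Cross-PCR framework to address density inconsistency and distribution differences in cross-source point cloud registration. | feature matching | |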
