cs.CV (2025-01-19)

📊 7 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (4 🔗1) · Pillar 3: Perception & Semantics (2) · Pillar 2: RL & Architecture (1)

🔬 Pillar 9: Embodied Foundation Models (4 papers)

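| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 1 | Transfer Learning Strategies for Pathological Foundation Models: A Systematic Evaluation in Brain Tumor Classification | Proposes an evaluation scheme for transfer learning strategies of pathological foundation models, applied to brain tumor classification. | foundation model | |
| 2 | Advancing General Multimodal Capability of Vision-language Models with Pyramid-descent Visual Position Encoding | Proposes Pyramid-descent Visual Position Encoding (PyPE) to improve the multi-granularity perception of vision-language models. | multimodal | |
| 3 | Rethinking Early-Fusion Strategies for Improved Multimodal Image Segmentation | Proposes EFNet, built on an early-fusion strategy, for efficient multimodal image segmentation in low-light conditions. | multimodal | |
| 4 | Know "No" Better: A Data-Driven Approach for Enhancing Negation Awareness in CLIP | Proposes NegationCLIP, a data-driven approach that strengthens CLIP's understanding of negation. | large language model, multimodal | |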

🔬 Pillar 3: Perception & Semantics (2 papers)

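| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 5 | RDG-GS: Relative Depth Guidance with Gaussian Splatting for Real-time Sparse-View 3D Rendering | RDG-GS: real-time sparse-view 3D rendering based on Gaussian splatting and relative depth guidance. | depth estimation, monocular depth, 3D gaussian splatting | |
| 6 | Unit Region Encoding: A Unified and Compact Geometry-aware Representation for Floorplan Applications | Proposes unit region encoding, a unified and compact geometry-aware representation of indoor floorplans that suits multiple floorplan applications. | semantic map | |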

🔬 Pillar 2: RL & Architecture (1 paper)

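| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 7 | Decomposing and Fusing Intra- and Inter-Sensor Spatio-Temporal Signal for Multi-Sensor Wearable Human Activity Recognition | Proposes the DecomposeWHAR model, which decomposes and fuses multi-sensor spatio-temporal signals to improve the accuracy of wearable human activity recognition. | SSM, state space model | |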
