cs.CV (2025-01-19)
📊 7 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (4 🔗1)
Pillar 3: Perception & Semantics (2)
Pillar 2: RL & Architecture (1)
🔬 Pillar 9: Embodied Foundation Models (4 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
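| 1 | Transfer Learning Strategies for Pathological Foundation Models: A Systematic Evaluation in Brain Tumor Classification | Proposes an evaluation scheme for transfer learning strategies of pathology foundation models, targeting brain tumor classification. | foundation model | | |
| 2 | Advancing General Multimodal Capability of Vision-language Models with Pyramid-descent Visual Position Encoding | Proposes Pyramid-descent Visual Position Encoding (PyPE) to improve the multi-granularity perception of vision-language models. | multimodal | ✅ | |
| 3 | Rethinking Early-Fusion Strategies for Improved Multimodal Image Segmentation | Proposes EFNet, based on an early-fusion strategy, for efficient multimodal image segmentation in low-light conditions. | multimodal | | |
| 4 | Know "No" Better: A Data-Driven Approach for Enhancing Negation Awareness in CLIP | Proposes NegationCLIP, which uses a data-driven approach to strengthen CLIP's understanding of negation. | large language model, multimodal | | |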
🔬 Pillar 3: Perception & Semantics (2 papers)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 5 | RDG-GS: Relative Depth Guidance with Gaussian Splatting for Real-time Sparse-View 3D Rendering | RDG-GS: real-time sparse-view 3D rendering based on Gaussian splatting and relative depth guidance. | depth estimation, monocular depth, 3D gaussian splatting | | |
| 6 | Unit Region Encoding: A Unified and Compact Geometry-aware Representation for Floorplan Applications | Proposes Unit Region Encoding, a unified and compact geometry-aware floorplan representation suited to a range of floorplan applications. | semantic map | | |
🔬 Pillar 2: RL & Architecture (1 paper)
| # | Title | One-line summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
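| 7 | Decomposing and Fusing Intra- and Inter-Sensor Spatio-Temporal Signal for Multi-Sensor Wearable Human Activity Recognition | Proposes the DecomposeWHAR model, which decomposes and fuses multi-sensor spatio-temporal signals to improve the accuracy of wearable human activity recognition. | SSM, state space model | | |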