cs.CV (2025-01-20)

📊 12 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (7, 🔗 2) · Pillar 3: Perception & Semantics (3) · Pillar 2: RL & Architecture (2)

🔬 Pillar 9: Embodied Foundation Models (7 papers)

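| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Teaching Large Language Models to Regress Accurate Image Quality Scores using Score Distribution | Proposes DeQA-Score, which uses a large language model to regress accurate image quality scores and addresses differences across datasets. | large language model | |
| 2 | MIFNet: Learning Modality-Invariant Features for Generalizable Multimodal Image Matching | MIFNet learns modality-invariant features for generalizable multimodal image matching. | multimodal | |
| 3 | A generalizable 3D framework and model for self-supervised learning in medical imaging | Proposes the 3DINO framework and the 3DINO-ViT model for self-supervised learning in medical imaging, improving generality and scalability. | foundation model, multimodal | |
| 4 | KPL: Training-Free Medical Knowledge Mining of Vision-Language Models | Proposes KPL, a training-free medical knowledge mining method for vision-language models. | large language model, multimodal | |
| 5 | A Review Paper of the Effects of Distinct Modalities and ML Techniques to Distracted Driving Detection | Review paper analyzing the use and effectiveness of multimodal data and machine learning techniques in distracted driving detection. | multimodal | |
| 6 | Eagle 2: Building Post-Training Data Strategies from Scratch for Frontier Vision-Language Models | Eagle 2 builds post-training data strategies from scratch to improve the performance of frontier vision-language models. | multimodal | |
| 7 | MASS: Overcoming Language Bias in Image-Text Matching | Proposes the Multimodal Association Score (MASS) framework to overcome language bias in image-text matching. | multimodal | |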

🔬 Pillar 3: Perception & Semantics (3 papers)

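| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 8 | See In Detail: Enhancing Sparse-view 3D Gaussian Splatting with Local Depth and Semantic Regularization | Proposes a sparse-view 3D Gaussian Splatting method with local depth and semantic regularization, improving rendering quality. | 3D gaussian splatting, 3DGS, gaussian splatting | |
| 9 | Dynamic Scene Understanding from Vision-Language Representations | Uses vision-language representations for dynamic scene understanding without extensive task-specific engineering. | scene understanding, human-object interaction | |
| 10 | Event-based vision for egomotion estimation using precise event timing | Proposes an event-camera egomotion estimation method based on precise event timing, suited to low-power robotics applications. | optical flow, motion tracking | |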

🔬 Pillar 2: RL & Architecture (2 papers)

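| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 11 | EndoChat: Grounded Multimodal Large Language Model for Endoscopic Surgery | EndoChat is a grounded multimodal large language model for endoscopic surgery. | representation learning, scene understanding, large language model | |
| 12 | DEFEND: A Large-scale 1M Dataset and Foundation Model for Tobacco Addiction Prevention | Introduces the Tobacco-1M dataset and the DEFEND foundation model for tobacco addiction prevention, improving tobacco product regulation capabilities. | representation learning, foundation model, multimodal | |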
