cs.CV (2024-11-16)

📊 3 papers in total

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (2) · Pillar 3: Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (2 papers)

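| # | Title | One-Sentence Summary | Tags |
|---|-------|----------------------|------|
| 1 | BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices | BlueLM-V-3B: algorithm and system co-design for multimodal large language models on mobile devices. | large language model, multimodal |
| 2 | MTA: Multimodal Task Alignment for BEV Perception and Captioning | Proposes MTA, a multimodal task alignment framework that improves BEV perception and captioning performance. | large language model, multimodal |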

🔬 Pillar 3: Perception & Semantics (1 paper)

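| # | Title | One-Sentence Summary | Tags |
|---|-------|----------------------|------|
| 3 | MetricGold: Leveraging Text-To-Image Latent Diffusion Models for Metric Depth Estimation | MetricGold: uses text-to-image latent diffusion models for metric depth estimation. | depth estimation, monocular depth, metric depth |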
