GuideCAD: A Lightweight Multimodal Framework for 3D CAD Model Generation via Prefix Embedding

Authors: Minseong Kim, Jinyeong Park, Sungho Park, Jibum Kim

Category: cs.CV

Published: 2026-06-05

🔗 Code/Project: GitHub

💡 One-Sentence Takeaway

Proposes GuideCAD to reduce the computational cost of 3D CAD model generation.

🎯 Matched Area: Pillar 9: Embodied Foundation Models

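📋 Key Points

Existing multimodal 3D CAD generation methods typically require substantial computational resources, which makes training inefficient.
GuideCAD uses a mapping network to convert image embeddings into prefix embeddings, which lets a pretrained language model integrate visual and textual information.
Experiments show that GuideCAD uses approximately four times fewer parameters than fine-tuning approaches and achieves twice the training efficiency, while still generating high-quality 3D CAD models.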

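📝 Abstract (translated from Chinese)

Multimodal approaches to 3D CAD generation typically require substantial computational resources and train inefficiently. To address this, the paper proposes GuideCAD, which uses semantically rich visual-textual representations to generate 3D CAD models with only a small number of trainable parameters. GuideCAD uses a mapping network to convert image embeddings into prefix embeddings, enabling a pretrained large language model to integrate visual and textual information. A transformer-based decoder then uses the visual-textual embeddings to predict the construction sequence and thereby generate the 3D CAD model. For evaluation, the authors construct a new dataset, also named GuideCAD, consisting of text-image pairs. The results show that GuideCAD generates 3D CAD models of comparable quality while using roughly one quarter of the parameters and doubling training efficiency.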

📄 Abstract (Original)

Multi-modal approaches used for 3D CAD generation require substantial computational resources, necessitating efficient training. To address this, we propose GuideCAD, which leverages semantically rich visual-textual representations having only a small number of trainable parameters to generate 3D CAD models. Specifically, GuideCAD uses a mapping network that converts image embeddings into prefix embeddings, enabling a pretrained large language model to integrate visual and textual information. As a result, a transformer-based decoder predicts the construction sequence using the visual-textual embeddings in order to generate the 3D CAD model. For experimental evaluation, we construct a new dataset, referred to as GuideCAD, which consists of text-image pairs. Each pair includes a text prompt that represents a 3D CAD construction sequence and its corresponding 3D CAD image. Our experimental results show that GuideCAD generates comparably high-quality 3D CAD models while using approximately four times fewer parameters and achieving twice the training efficiency compared to fine-tuning approaches. We have released the source code and dataset for our method at: https://github.com/mskimS2/GuideCAD
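
The prefix-embedding step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the single linear layer, the toy dimensions, and every name here (`make_linear`, `image_to_prefix`, `build_llm_input`, `IMAGE_DIM`, `PREFIX_LEN`, and so on) are hypothetical choices made for illustration. The sketch only shows the general idea of mapping one image embedding to a few prefix tokens and placing them in front of the text token embeddings before they reach a language model.

```python
import random


def make_linear(in_dim, out_dim, rng):
    """Create a dense layer as (weights, bias) with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def linear(x, layer):
    """Apply the dense layer to a single input vector."""
    weights, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def image_to_prefix(image_embedding, layer, prefix_len, model_dim):
    """Map one image embedding to `prefix_len` vectors of size `model_dim`."""
    flat = linear(image_embedding, layer)
    return [flat[i * model_dim:(i + 1) * model_dim] for i in range(prefix_len)]


def build_llm_input(prefix, text_embeddings):
    """Place the visual prefix tokens in front of the text token embeddings."""
    return prefix + text_embeddings


# Hypothetical toy sizes; real models use far larger dimensions.
IMAGE_DIM, MODEL_DIM, PREFIX_LEN, TEXT_LEN = 8, 4, 3, 5

rng = random.Random(0)
mapper = make_linear(IMAGE_DIM, PREFIX_LEN * MODEL_DIM, rng)

# Stand-ins for the outputs of an image encoder and a text embedding table.
image_embedding = [rng.gauss(0, 1) for _ in range(IMAGE_DIM)]
text_embeddings = [[rng.gauss(0, 1) for _ in range(MODEL_DIM)]
                   for _ in range(TEXT_LEN)]

prefix = image_to_prefix(image_embedding, mapper, PREFIX_LEN, MODEL_DIM)
llm_input = build_llm_input(prefix, text_embeddings)

# Count the weights and biases held by the mapping layer.
mapper_params = sum(len(row) for row in mapper[0]) + len(mapper[1])

print(len(prefix), len(llm_input), len(llm_input[0]))  # 3 8 4
print(mapper_params)  # 108
```

In this sketch the mapping layer is the only component with weights of its own, which is one way to read the abstract's "small number of trainable parameters". Whether the language model stays fully frozen in GuideCAD is an assumption here, not something the abstract states.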
