GraspLDM: Generative 6-DoF Grasp Synthesis using Latent Diffusion Models

📄 arXiv: 2312.11243v2

Authors: Kuldeep R Barad, Andrej Orsula, Antoine Richard, Jan Dentler, Miguel Olivares-Mendez, Carol Martinez

Categories: cs.RO

Published: 2023-12-18 (Updated: 2024-11-23)

Journal: IEEE Access, vol. 12, pp. 164621-164633, 2024

DOI: 10.1109/ACCESS.2024.3492118

🔗 Code/Project: https://github.com/kuldeepbrd1/graspldm


📄 Abstract (original)

Vision-based grasping of unknown objects in unstructured environments is a key challenge for autonomous robotic manipulation. A practical grasp synthesis system is required to generate a diverse set of 6-DoF grasps from which a task-relevant grasp can be executed. Although generative models are suitable for learning such complex data distributions, existing models have limitations in grasp quality, long training times, and a lack of flexibility for task-specific generation. In this work, we present GraspLDM, a modular generative framework for 6-DoF grasp synthesis that uses diffusion models as priors in the latent space of a VAE. GraspLDM learns a generative model of object-centric $SE(3)$ grasp poses conditioned on point clouds. The GraspLDM architecture enables us to train task-specific models efficiently by re-training only a small denoising network in the low-dimensional latent space, as opposed to existing models that require expensive full re-training. Our framework provides robust and scalable models on both full and partial point clouds. GraspLDM models trained with simulation data transfer well to the real world without any further fine-tuning. Our models achieve an 80% success rate over 80 grasp attempts on diverse test objects across two real-world robotic setups. We make our implementation available at https://github.com/kuldeepbrd1/graspldm.
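
To make the latent-diffusion idea in the abstract concrete, below is a minimal PyTorch sketch of the sampling pipeline it describes: a point-cloud encoder produces a conditioning embedding, a small denoising network runs DDPM-style diffusion in a low-dimensional latent space, and a VAE decoder maps the denoised latent to an SE(3) grasp pose. All module names, dimensions, and the noise schedule here are illustrative assumptions, not the paper's actual implementation (see the linked repository for that).

```python
# Minimal sketch of latent diffusion for grasp generation, as described in
# the GraspLDM abstract. Architectures, sizes, and the beta schedule are
# assumptions for illustration only.
import torch
import torch.nn as nn

LATENT_DIM, COND_DIM, T = 16, 128, 100  # assumed latent size / cond size / steps


class PointCloudEncoder(nn.Module):
    """Toy stand-in for a point-cloud backbone (e.g. a PointNet-like encoder)."""
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, COND_DIM))

    def forward(self, pc):                      # pc: (B, N, 3)
        return self.mlp(pc).max(dim=1).values   # global max-pool -> (B, COND_DIM)


class LatentDenoiser(nn.Module):
    """Small conditional network predicting the noise eps at diffusion step t."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM + COND_DIM + 1, 128), nn.ReLU(),
            nn.Linear(128, LATENT_DIM),
        )

    def forward(self, z, t, cond):
        t_emb = t.float().unsqueeze(-1) / T     # crude scalar timestep embedding
        return self.net(torch.cat([z, t_emb, cond], dim=-1))


class GraspDecoder(nn.Module):
    """VAE decoder: latent -> 3D translation + 6D rotation representation."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT_DIM + COND_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, 9))  # 3 translation + 6D rotation

    def forward(self, z, cond):
        return self.net(torch.cat([z, cond], dim=-1))


@torch.no_grad()
def sample_grasps(pc, encoder, denoiser, decoder, n_grasps=4):
    """Ancestral DDPM sampling in latent space, then decode to grasp poses."""
    betas = torch.linspace(1e-4, 0.02, T)
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)

    cond = encoder(pc).repeat(n_grasps, 1)      # one condition vector per grasp
    z = torch.randn(n_grasps, LATENT_DIM)       # start from pure latent noise
    for t in reversed(range(T)):
        t_batch = torch.full((n_grasps,), t)
        eps = denoiser(z, t_batch, cond)
        # posterior mean of z_{t-1} given the predicted noise
        z = (z - betas[t] / (1 - alpha_bar[t]).sqrt() * eps) / alphas[t].sqrt()
        if t > 0:
            z = z + betas[t].sqrt() * torch.randn_like(z)
    return decoder(z, cond)                     # (n_grasps, 9) pose parameters


if __name__ == "__main__":
    pc = torch.randn(1, 1024, 3)  # dummy object point cloud
    grasps = sample_grasps(pc, PointCloudEncoder(), LatentDenoiser(), GraspDecoder())
    print(grasps.shape)  # torch.Size([4, 9])
```

Because the denoiser is the only diffusion-specific component, adapting such a model to a new task would amount to re-training that small latent network while keeping the VAE encoder and decoder frozen, which is the efficiency argument the abstract makes.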