Title: UniPoint-LLM: Unifying Point Cloud Understanding and More Controllable 3D Generation
Abstract: We introduce UniPoint-LLM, which integrates point cloud understanding into an image-based Multimodal Large Language Model (MLLM) and enables more flexible and controllable natural-language-driven 3D generation, unifying point cloud understanding and generation in a single process. Unlike traditional text-to-3D methods with limited prompt inputs or constrained parameters, UniPoint-LLM allows users to specify their requirements through natural language descriptions. By aligning the image and point cloud modalities through joint training and weight sharing, UniPoint-LLM achieves understanding of both modalities. Experiments demonstrate that UniPoint-LLM offers users greater flexibility and control in generating desired 3D objects, and that our Multimodal Universal Token Space (MUTS) is effective for understanding both images and point clouds. These experiments validate its potential value in practical applications of 3D generation and interactive design.
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3392