Text2Data: Low-Resource Data Generation with Textual Control

Published: 05 Mar 2024, Last Modified: 12 May 2024, Venue: PML4LRS Poster, License: CC BY 4.0
Keywords: low resource, text-to-data generation
TL;DR: We propose a method for text-to-data generation in low-resource settings.
Abstract: The machine learning community has invested considerable effort in generating data that is semantically coherent with textual instructions. Nevertheless, low-resource domains characterized by expensive annotations or complex data structures, such as molecules, motion dynamics, and time series, often lack textual labels. This deficiency impedes supervised learning and thereby constrains the application of advanced generative models to text-to-data tasks. In response, we propose Text2Data, a novel approach that first uses unlabeled data to learn the underlying data distribution through an unsupervised diffusion model, and then finetunes that model for controllability via a novel constraint optimization-based learning objective. Comprehensive experiments demonstrate that Text2Data achieves better controllability than existing baselines across various modalities, including molecules, motions, and time series.
Submission Number: 14