LLMOverTab: Tabular data augmentation with language model-driven oversampling

Published: 01 Jan 2025, Last Modified: 18 May 2025, Venue: Expert Syst. Appl. 2025, License: CC BY-SA 4.0
Abstract: Highlights
• Introduces LLMOverTab for oversampling in imbalanced tabular data.
• Surpasses traditional methods like SMOTE and other LLM approaches.
• Uses prompt engineering to generate meaningful synthetic instances.
• Finds LLMOverTab excels especially with LLM prediction models.
• Suggests exploring different LLM architectures and prompt techniques.