Each analysis typically consists of two parts:

(1) Data Generation: synthetic data generation with GPT, with evaluation by Gemini
(2) DA: data augmentation and downstream analysis

The Abstract dataset comes with two additional files, which process the Kaggle dataset and subsample abstracts from statistics articles.
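
As a rough illustration of the subsampling step, the sketch below filters records down to statistics articles and draws a reproducible random subsample. It is not the code shipped in this repository. The function name `subsample_statistics_abstracts`, the record fields `categories` and `abstract`, and the `stat.` category prefix are all assumptions made for the example, so adjust them to match the actual dataset schema.

```python
import random


def subsample_statistics_abstracts(records, n, seed=0):
    """Keep records tagged with a statistics category, then draw up to n at random.

    Assumption (hypothetical schema): each record is a dict with
    "categories", a space-separated string of tags such as "stat.ML cs.LG",
    and "abstract", the abstract text.
    """
    stats_records = [
        r for r in records
        if any(tag.startswith("stat.") for tag in r["categories"].split())
    ]
    rng = random.Random(seed)  # fixed seed keeps the subsample reproducible
    return rng.sample(stats_records, min(n, len(stats_records)))


# Toy records standing in for the real dataset (made up for illustration).
toy_records = [
    {"categories": "stat.ML cs.LG", "abstract": "A toy abstract on learning."},
    {"categories": "cs.CL", "abstract": "A toy abstract on language."},
    {"categories": "stat.ME", "abstract": "A toy abstract on methodology."},
    {"categories": "math.ST stat.TH", "abstract": "A toy abstract on theory."},
]

subsample = subsample_statistics_abstracts(toy_records, n=2, seed=0)
for record in subsample:
    print(record["categories"], "|", record["abstract"])
```

Fixing the seed means the same subsample is drawn on every run, which helps when the downstream analysis is meant to be repeatable.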

The plotting functions are in Scores_visualiation.ipynb.