This repository contains the implementation of UPA, which consists of the following steps:

1. Use pre-trained CLIP to extract features of CIFAR-10/100 and SVHN:
(1) Download the clip-vit-base-patch32 model;
(2) Prepare the datasets and store them in the 'data' folder;
(3) Run get_cifar_feature.py and get_svhn_feature.py to extract features from the CIFAR-10/100 and SVHN datasets, respectively. The extracted features are saved as *.pkl files.
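
The extracted features can then be loaded back for the selection step. The following is a minimal sketch of a pickle round trip, not the repository's own code. It assumes each *.pkl file holds a dict with "features" and "labels" keys; that layout and the helper names `save_features` and `load_features` are hypothetical, so check the files written by the extraction scripts before relying on it.

```python
import pickle
import tempfile
from pathlib import Path


def save_features(path, features, labels):
    """Write features and labels to a pickle file.

    The {"features": ..., "labels": ...} layout is an assumption made
    for this sketch, not the format confirmed by the repository.
    """
    with open(path, "wb") as f:
        pickle.dump({"features": features, "labels": labels}, f)


def load_features(path):
    """Read back the features and labels written by save_features."""
    with open(path, "rb") as f:
        data = pickle.load(f)
    return data["features"], data["labels"]


if __name__ == "__main__":
    # Tiny placeholder vectors standing in for real CLIP features.
    features = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
    labels = [3, 7]
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "example_feature.pkl"  # hypothetical file name
        save_features(path, features, labels)
        loaded_features, loaded_labels = load_features(path)
        print(len(loaded_features), loaded_labels)
```

Only load pickle files that you generated yourself, since unpickling executes code embedded in the file.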

2. Select samples based on extracted features:
(1) Run UPA.py to select samples;
(2) The indices of the selected samples are saved as *.txt files.
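
To use the selected samples downstream, the saved indices have to be read back and applied to the dataset. The sketch below assumes the *.txt files contain whitespace-separated integers (for example, one index per line); that format and the helper names `save_indices`, `load_indices`, and `take_subset` are hypothetical and may need adapting to what UPA.py actually writes.

```python
import tempfile
from pathlib import Path


def save_indices(path, indices):
    """Write one integer index per line (assumed format)."""
    Path(path).write_text("\n".join(str(i) for i in indices) + "\n")


def load_indices(path):
    """Read whitespace-separated integer indices from a text file."""
    return [int(token) for token in Path(path).read_text().split()]


def take_subset(items, indices):
    """Return the items at the given positions, in the given order."""
    return [items[i] for i in indices]


if __name__ == "__main__":
    dataset = ["img_a", "img_b", "img_c", "img_d", "img_e"]  # placeholder data
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "selected.txt"  # hypothetical file name
        save_indices(path, [4, 0, 2])
        selected = load_indices(path)
        print(take_subset(dataset, selected))
```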

3. Visualize UPA (the implementation of Section 4.5.1):
(1) Run generate_visualization_data.py to generate data points;
(2) Run visualization.py to visualize the three sampling methods: random, stratified, and UPA.
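
For orientation, the two baseline methods can be sketched in a few lines. This is a generic illustration, not the code in visualization.py, and it does not reproduce UPA itself. It assumes that "stratified" means drawing an equal number of samples from each class label; the function names `random_sample` and `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict


def random_sample(n_items, budget, seed=0):
    """Pick `budget` distinct indices uniformly at random."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_items), budget))


def stratified_sample(labels, budget, seed=0):
    """Pick indices with an (almost) equal quota per class.

    Equal allocation per class is an assumption of this sketch; the
    repository may stratify differently.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    classes = sorted(by_class)
    base, extra = divmod(budget, len(classes))
    selected = []
    for i, cls in enumerate(classes):
        # The first `extra` classes receive one additional sample.
        quota = min(base + (1 if i < extra else 0), len(by_class[cls]))
        selected.extend(rng.sample(by_class[cls], quota))
    return sorted(selected)


if __name__ == "__main__":
    labels = [0] * 50 + [1] * 50  # two balanced placeholder classes
    print(random_sample(len(labels), 10))
    print(stratified_sample(labels, 10))
```

Random sampling may over- or under-represent a class by chance, while the stratified variant fixes the per-class count in advance.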