Installation

pip install torch torchvision pytorch-lightning wandb

⸻

Usage

Pretraining with CLOP:
python main.py --epochs 100 --batch_size 256 --dataset cifar100 --loss nxt_ent --has_CLOP --devices 4

Argument          Description                                               Example
--epochs          Number of training epochs                                 100
--batch_size      Batch size per device                                     256
--dataset         Dataset name                                              cifar100
--loss            Loss function                                             nxt_ent
--has_CLOP        Enable CLOP loss                                          --has_CLOP
--lambda_val      CLOP loss weight                                          1.0
--devices         Number of devices (GPUs)                                  4
--distance        Similarity measure (cosine, euclidean)                    cosine
--class_per_file  File path for class-level sampling percentages            "splits/cifar100_partial.txt"
--label_por       Proportion of labeled data (for semi-supervised setting)  0.25
--etf             Enable ETF initialization                                 --etf
--semi            Enable semi-supervised mode                               --semi
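
The table above mixes value-taking options (such as --epochs 100) with plain switches (such as --has_CLOP). As a rough illustration of that difference, here is a minimal argparse sketch. It is a hypothetical reconstruction, not the project's main.py: the flag names come from the table, but every type and default shown is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical sketch: types and defaults are assumptions, not read from main.py.
    parser = argparse.ArgumentParser(description="Illustrative CLOP argument parser")
    # Options that take a value.
    parser.add_argument("--epochs", type=int, default=100)
    parser.add_argument("--batch_size", type=int, default=256)
    parser.add_argument("--dataset", type=str, default="cifar100")
    parser.add_argument("--loss", type=str, default="nxt_ent")
    parser.add_argument("--lambda_val", type=float, default=1.0)
    parser.add_argument("--devices", type=int, default=1)
    parser.add_argument("--distance", choices=["cosine", "euclidean"], default="cosine")
    parser.add_argument("--class_per_file", type=str, default=None)
    parser.add_argument("--label_por", type=float, default=1.0)
    # Switches: present on the command line means True, absent means False.
    parser.add_argument("--has_CLOP", action="store_true")
    parser.add_argument("--etf", action="store_true")
    parser.add_argument("--semi", action="store_true")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["--epochs", "200", "--has_CLOP", "--lambda_val", "0.5", "--semi"]
)
print(args.epochs, args.has_CLOP, args.etf, args.lambda_val)
```

Under these assumptions, a switch such as --etf is simply omitted to disable it; it never takes a true/false value.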


⸻

Evaluation

Linear Evaluation:
python main.py --eval --pretrain_dir path/to/model.ckpt --batch_size 256 --epochs 100 --dataset cifar100 --task linear

Object Detection Evaluation:
python main.py --eval --pretrain_dir path/to/model.ckpt --batch_size 128 --epochs 50 --dataset voc2007 --task detection


⸻

Class Imbalance

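To simulate class imbalance, pass --class_per_file the path to a file of per-class sampling ratios. Each entry maps a class index to the fraction of that class's samples to keep; for example, 0:0.5 keeps half of class 0.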
Format:

0:0.5,1:1.0,2:0.3,...
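
To make the format concrete, the following sketch parses such a string and subsamples a label list accordingly. It is an illustration only and may not match how the project reads the file: the function names parse_class_ratios and subsample_indices are hypothetical, and treating unlisted classes as fully kept is an assumption. The trailing "..." in the format line above is an elision, so a real file should list actual entries.

```python
import random
from collections import defaultdict


def parse_class_ratios(text):
    """Parse 'class:ratio' pairs such as '0:0.5,1:1.0' into {0: 0.5, 1: 1.0}."""
    ratios = {}
    for entry in text.strip().split(","):
        entry = entry.strip()
        if not entry:
            continue
        class_id, ratio = entry.split(":")
        ratio = float(ratio)
        if not 0.0 <= ratio <= 1.0:
            raise ValueError(f"ratio for class {class_id} must be in [0, 1], got {ratio}")
        ratios[int(class_id)] = ratio
    return ratios


def subsample_indices(labels, ratios, seed=0):
    """Return sorted sample indices, keeping the requested fraction of each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    kept = []
    for label, indices in by_class.items():
        # Assumption: classes missing from the file are kept in full.
        ratio = ratios.get(label, 1.0)
        n_keep = round(len(indices) * ratio)
        kept.extend(rng.sample(indices, n_keep))
    return sorted(kept)


# Toy example: three classes with ten samples each.
labels = [0] * 10 + [1] * 10 + [2] * 10
ratios = parse_class_ratios("0:0.5,1:1.0,2:0.3")
kept = subsample_indices(labels, ratios)
counts = {c: sum(1 for i in kept if labels[i] == c) for c in (0, 1, 2)}
print(counts)
```

The fixed seed keeps the subsample reproducible across runs, which matters when comparing losses on the same imbalanced split.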

⸻

Example: Semi-supervised Pretraining with Class Imbalance

python main.py --epochs 200 \
               --batch_size 128 \
               --dataset cifar100 \
               --has_CLOP \
               --loss nxt_ent \
               --lambda_val 0.5 \
               --class_per_file configs/class_ratio.txt \
               --label_por 0.25 \
               --devices 4 \
               --semi
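
The command above expects configs/class_ratio.txt to exist already. A minimal sketch for generating one is shown below, assuming the single-line class:ratio format from the Class Imbalance section. The exponentially decaying (long-tailed) profile, the three-decimal formatting, and the helper names are illustrative assumptions, not something the project prescribes.

```python
from pathlib import Path


def long_tail_ratios(num_classes, min_ratio):
    """Keep-ratios decaying exponentially from 1.0 (class 0) to min_ratio (last class)."""
    if num_classes < 2:
        return [1.0] * num_classes
    return [min_ratio ** (i / (num_classes - 1)) for i in range(num_classes)]


def format_ratios(ratios):
    """Render ratios as 'class:ratio' pairs joined by commas, e.g. '0:1.000,1:0.500'."""
    return ",".join(f"{i}:{r:.3f}" for i, r in enumerate(ratios))


def write_ratio_file(path, ratios):
    """Write the formatted ratios to path, creating parent directories if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(format_ratios(ratios) + "\n")


# Example call for CIFAR-100 (100 classes), left commented out so that
# running this sketch does not create files:
# write_ratio_file("configs/class_ratio.txt", long_tail_ratios(100, 0.1))
print(format_ratios(long_tail_ratios(3, 0.25)))
```

Any other profile (step imbalance, hand-picked classes) works the same way, as long as the output follows the class:ratio format.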


⸻

Extract Data (Preprocessing)

To preprocess a dataset without training:

python main.py --extract_data --dataset cifar100


⸻

Logging

This project uses Weights & Biases for training and evaluation logging. Make sure to log in before running:

wandb login
