# Label Propagation and Weak Supervision 

This directory contains the code for the "Label Propagation and Weak Supervision" project.

## Installation

We provide the requirements file for our environment; you can install the dependencies via:

```
pip install -r requirements.txt
```

## Data 

To replicate our experiments, first download the data for the WRENCH benchmark datasets we use (agnews, chemprot, imdb, yelp, youtube). They are available at the following link, provided by WRENCH:
https://drive.google.com/drive/folders/1v55IKG2JN9fMtKJWU48B_5_DcPWGnpTq

For each dataset, place the downloaded files into its own directory (e.g., 'datasets/agnews', 'datasets/youtube', ...).
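
Before running anything, it can help to confirm that the folders are in place. The snippet below is a minimal sketch, not part of the repository: the helper `missing_datasets` is a hypothetical name, and it assumes the script is run from the repository root with one folder per dataset under 'datasets/'. It only checks that each folder exists and is non-empty; it does not validate the files inside.

```python
from pathlib import Path

# Dataset names taken from the list above.
DATASETS = ["agnews", "chemprot", "imdb", "yelp", "youtube"]


def missing_datasets(root, names=DATASETS):
    """Return the dataset names that have no non-empty folder under `root`."""
    root = Path(root)
    missing = []
    for name in names:
        folder = root / name
        # A dataset counts as present only if its folder exists
        # and holds at least one entry.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


# Assumption: run from the repository root, next to 'datasets/'.
missing = missing_datasets("datasets")
if missing:
    print("Missing or empty dataset folders:", ", ".join(missing))
else:
    print("All dataset folders are present.")
```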

## Running our Code

To run our experiments, first generate the intermediate pseudolabels for our methods:

```
python gen_pseudolabels.py --dataset youtube
```

This will generate pseudolabels for all methods on the YouTube dataset and save them in the dataset's folder (i.e., 'datasets/youtube/').
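
To generate pseudolabels for every dataset in one go, the command above can be repeated per dataset. The following is a rough sketch under one assumption: that `--dataset` accepts the other dataset names (agnews, chemprot, imdb, yelp) in the same way it accepts `youtube`. The helpers `build_commands` and `run_all` are hypothetical and not part of the repository. As written, the script only prints the commands; call `run_all()` to actually execute them.

```python
import subprocess
import sys

# Dataset names from the Data section; assumed to be valid --dataset values.
DATASETS = ["agnews", "chemprot", "imdb", "yelp", "youtube"]


def build_commands(datasets=DATASETS):
    """Build one gen_pseudolabels.py invocation per dataset."""
    return [
        [sys.executable, "gen_pseudolabels.py", "--dataset", name]
        for name in datasets
    ]


def run_all(datasets=DATASETS):
    """Run the commands one after another, stopping at the first failure."""
    for cmd in build_commands(datasets):
        subprocess.run(cmd, check=True)


# Dry run: print the commands without executing them.
for cmd in build_commands():
    print(" ".join(cmd))
```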

Next, after generating the corresponding set of pseudolabels, we can train a downstream model (the end model) on them:

```
python end_model.py
```