ICLR '23: Transformers & Mixture Models
----------------------------------------

Overview: 

- simulate.py: runs a training loop with no restriction on the number of 
samples used (each iteration draws fresh samples)
- simulate-fixed_sample_size.py: runs a training loop with a restricted 
sample budget (each iteration subsamples a fixed dataset)
- analyze.py: analyzes a trained transformer and computes the output of the SA and EM algorithms on the same mixture model.
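The difference between the two simulation scripts is the sampling regime. A minimal sketch of the two loop styles (the function names `sample_fn`, `step_fn`, and both loop functions are hypothetical, not the actual code in these scripts):

```python
import numpy as np

def fresh_sample_iterations(sample_fn, step_fn, n_iters, batch_size):
    # simulate.py-style loop: every iteration draws a brand-new batch
    # from the data generator, so samples are never reused.
    for _ in range(n_iters):
        step_fn(sample_fn(batch_size))

def fixed_dataset_iterations(dataset, step_fn, n_iters, batch_size, seed=0):
    # simulate-fixed_sample_size.py-style loop: the dataset is drawn once,
    # and every iteration subsamples it, so samples are reused across steps.
    rng = np.random.default_rng(seed)
    n = len(dataset)
    for _ in range(n_iters):
        idx = rng.choice(n, size=batch_size, replace=False)
        step_fn(dataset[idx])
```

Both loops call the same training step; only the source of each batch differs.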

Src directory: 
- em.py: implements the batch EM algorithm
- sa.py: implements the subspace algorithm (Algorithm 1 in Jain et al.; see paper)
- models.py: implements the transformer model 
- data.py: data generator for mixture models
- util.py: some utility code (model loading and saving, etc.)
- train.py: training loops for models
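For orientation, a minimal sketch of batch EM on a spherical (unit-variance) Gaussian mixture — the kind of baseline em.py provides. This is an illustrative standalone implementation, not the repo's actual code; the function name `em_gmm` and the unit-variance assumption are ours:

```python
import numpy as np

def em_gmm(X, mu0, n_iters=50):
    """Batch EM for a spherical unit-variance Gaussian mixture.

    X: (n, d) data matrix; mu0: (k, d) initial component means.
    Returns estimated mixing weights pi (k,) and means mu (k, d).
    """
    n, d = X.shape
    mu = mu0.astype(float).copy()
    k = mu.shape[0]
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iters):
        # E-step: posterior responsibilities r[i, j] ∝ pi_j * N(x_i; mu_j, I).
        sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)   # (n, k)
        logp = np.log(pi)[None, :] - 0.5 * sq
        logp -= logp.max(axis=1, keepdims=True)                # numerical stability
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate mixing weights and means from responsibilities.
        nk = r.sum(axis=0)
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
    return pi, mu
```

With well-separated components and a reasonable initialization, the estimated means converge to the true component centers.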