# NOTES:

- The training curves come from an old stats tracker that may have had bugs in the NLL computation. Be careful if you
  build learning curves from them.

- MAML batchnorm works transductively, so we need to avoid tracking running statistics and also avoid updating the
  learnable scale and shift parameters in the inner adaptation loops.

- The original repository uses `n-query = n-train = n-way` for Omniglot at both meta-train and meta-test time: https://github.com/cbfinn/maml/blob/master/main.py#L238
- For MiniImageNet, it uses 15 test shots for training, and train shots = test shots for testing: https://github.com/cbfinn/maml/blob/master/main.py#L249

- The original repository uses `15-query` for MiniImageNet training, and `n-query = n-way` for testing. This choice
  seems arbitrary.

- The stated Xavier initializers don't work well for me; another author has reported the same: https://github.com/haebeom-lee/maml
  When I used `xavier_normal`, I got really bad overfitting on MiniImageNet with both first-order and second-order
  MAML. Validation accuracy never got above 33-35%. Once I switched to the default PyTorch initializers, everything
  worked well.

- The original author used early stopping (https://github.com/cbfinn/maml/issues/8). We use early stopping for
  MiniImageNet, but not for Omniglot, since Omniglot doesn't have a dedicated validation set.
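
The batchnorm note above can be illustrated with a minimal sketch. It assumes PyTorch 2.x (for `torch.func.functional_call`); the names `make_net`, `bn_param_names`, and `inner_adapt`, the tiny network, and the learning rate are hypothetical and are not taken from this repository. Setting `track_running_stats=False` makes the layer normalize with the current batch's statistics in both train and eval mode, and the inner loop skips the batchnorm scale and shift parameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call


def make_net(n_way: int = 5) -> nn.Sequential:
    # Hypothetical toy network; track_running_stats=False means no running
    # mean/var buffers are kept, so batch statistics are always used.
    return nn.Sequential(
        nn.Conv2d(1, 8, 3, padding=1),
        nn.BatchNorm2d(8, track_running_stats=False),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(8, n_way),
    )


def bn_param_names(net: nn.Module) -> set:
    # Collect the full names of batchnorm scale (weight) and shift (bias) parameters.
    names = set()
    for mod_name, mod in net.named_modules():
        if isinstance(mod, nn.modules.batchnorm._BatchNorm):
            for p_name, _ in mod.named_parameters(recurse=False):
                names.add(f"{mod_name}.{p_name}" if mod_name else p_name)
    return names


def inner_adapt(net, x, y, lr: float = 0.4, steps: int = 1) -> dict:
    # Differentiable inner loop: every parameter except the batchnorm
    # scale/shift takes a gradient step on the support set (x, y).
    params = dict(net.named_parameters())
    frozen = bn_param_names(net)
    names = [n for n in params if n not in frozen]
    for _ in range(steps):
        loss = F.cross_entropy(functional_call(net, params, (x,)), y)
        grads = torch.autograd.grad(loss, [params[n] for n in names], create_graph=True)
        params = {**params, **{n: params[n] - lr * g for n, g in zip(names, grads)}}
    return params
```

The returned dictionary can be passed back to `functional_call` to evaluate the query set. The batchnorm scale and shift still receive gradients from the outer (meta) loss in this sketch; they are only excluded from the inner updates.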

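For the initializer note, one way to switch between the two schemes when comparing runs is sketched below. The helper name `reinit` and the `scheme` strings are hypothetical, and this is not necessarily how the models in this repository are initialized. The "default" branch calls each layer's own `reset_parameters()`, which is what PyTorch applies when a `Conv2d` or `Linear` layer is constructed.

```python
import torch
import torch.nn as nn


def reinit(net: nn.Module, scheme: str = "default") -> nn.Module:
    # Re-initialize every conv and linear layer in place (hypothetical helper).
    for m in net.modules():
        if isinstance(m, (nn.Conv2d, nn.Linear)):
            if scheme == "xavier_normal":
                nn.init.xavier_normal_(m.weight)
                if m.bias is not None:
                    nn.init.zeros_(m.bias)
            elif scheme == "default":
                # PyTorch's built-in initialization for the layer.
                m.reset_parameters()
            else:
                raise ValueError(f"unknown scheme: {scheme}")
    return net


# Usage: build the same toy model twice and compare weight scales.
torch.manual_seed(0)
xavier_net = reinit(nn.Sequential(nn.Conv2d(3, 32, 3), nn.Flatten(), nn.LazyLinear(5)[:0] if False else nn.Identity()), "xavier_normal")
default_net = reinit(nn.Sequential(nn.Conv2d(3, 32, 3), nn.Flatten(), nn.Identity()), "default")
print(xavier_net[0].weight.std().item(), default_net[0].weight.std().item())
```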