This subdirectory contains code to train the models presented in the manuscript.

- Data processing: we convert the atomistic data into a format based on numpy's memory
  maps. The exact procedure varies by dataset, since the datasets are distributed and
  stored in different ways. The "data_processing/" folder contains the scripts that
  were used to process the OMat, sAlex, MPtrj and SPICE datasets.

- Installing metatrain: metatrain is the software used to run training for all models.
  It can be installed by running `pip install .` inside the "metatrain/" directory
  contained here.
  Up-to-date installations of https://github.com/metatensor/metatomic and
  https://github.com/metatensor/metatensor are also required.

- Training yaml files: metatrain uses yaml files as inputs for training. The yaml file
  for each training run is located in the folder named after that run.
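
To illustrate the idea behind the memory-map storage mentioned under data processing,
here is a minimal sketch of how variable-size structures could be packed into a single
numpy memory map. It is not the code in "data_processing/": the file names
("positions.npy", "offsets.npy"), the function names and the flat-array-plus-offsets
layout are all assumptions made for this example, and the real scripts may store the
data differently.

```python
import os

import numpy as np


def write_structures(directory, structures):
    """Pack a list of (n_atoms, 3) position arrays into one memory-mapped file.

    Hypothetical layout: one flat (total_atoms, 3) array plus an offsets array
    that marks where each structure starts and ends.
    """
    counts = np.array([len(s) for s in structures], dtype=np.int64)
    offsets = np.concatenate([[0], np.cumsum(counts)]).astype(np.int64)

    # open_memmap creates a regular .npy file that is written through a memory map
    positions = np.lib.format.open_memmap(
        os.path.join(directory, "positions.npy"),
        mode="w+",
        dtype=np.float64,
        shape=(int(offsets[-1]), 3),
    )
    for i, structure in enumerate(structures):
        positions[offsets[i] : offsets[i + 1]] = structure
    positions.flush()
    del positions  # release the memory map

    np.save(os.path.join(directory, "offsets.npy"), offsets)


def read_structure(directory, index):
    """Read the positions of one structure without loading the whole dataset."""
    offsets = np.load(os.path.join(directory, "offsets.npy"))
    positions = np.load(os.path.join(directory, "positions.npy"), mmap_mode="r")
    # Copy the slice so that the result no longer depends on the open file
    return np.array(positions[offsets[index] : offsets[index + 1]])
```

With this layout, `read_structure(directory, i)` only touches the part of the file that
holds structure `i`, which is the main reason to use memory maps for datasets that do
not fit in RAM.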
