This folder contains the code for running the DeGroot aggregation procedure and comparing it to reference schemes.
The code accompanies the NeurIPS'21 submission 'Test-time collective prediction'.

*synthetic-data-experiments.ipynb* 
code to run experiments in the synthetic setup. Compares DeGroot aggregation to model averaging (M-avg) and visualizes the individual predictions and models.
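
For orientation, the difference between the two aggregation schemes can be sketched in a few lines of plain Python. This is a minimal illustration of the generic DeGroot consensus update (each agent repeatedly replaces its belief with a trust-weighted average of all beliefs), not the code in the notebook. The function names `degroot_consensus` and `model_average`, the trust matrix, and the toy predictions are all made up for this example; how the repository actually derives the trust weights is not shown here.

```python
def degroot_consensus(trust, predictions, tol=1e-10, max_iter=10_000):
    """Iterate beliefs <- W @ beliefs until the beliefs stop changing.

    trust: row-stochastic matrix W as a list of lists (hypothetical input);
           trust[i][j] is the weight agent i puts on agent j's belief.
    predictions: initial predictions of the agents for a single test point.
    """
    n = len(predictions)
    for row in trust:
        if len(row) != n or min(row) < 0 or abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("trust must be a row-stochastic n x n matrix")

    beliefs = list(predictions)
    for _ in range(max_iter):
        # One DeGroot step: every agent takes a trust-weighted average.
        updated = [sum(w * b for w, b in zip(row, beliefs)) for row in trust]
        if max(abs(u - b) for u, b in zip(updated, beliefs)) < tol:
            return updated
        beliefs = updated
    return beliefs


def model_average(predictions):
    """Equal-weight baseline (M-avg): plain mean of the predictions."""
    return sum(predictions) / len(predictions)


# Toy example with made-up numbers: three agents, one test point.
predictions = [1.0, 2.0, 4.0]
trust = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.4],
]

consensus = degroot_consensus(trust, predictions)
print("DeGroot consensus per agent:", consensus)
print("Model average:", model_average(predictions))
```

With a strictly positive trust matrix, as in this toy example, all agents converge to the same value, which is a weighted average of the initial predictions. With uniform trust weights the consensus coincides with the plain model average.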

*benchmark-script.ipynb* 
code to run DeGroot on different configurations (model, data). The provided version loads the Boston data and performs the benchmarking experiment from Table 1 using a ridge regression model.
For additional data, the respective section of the code needs to be commented out and the path to the data needs to be specified.

*data_prep.py* 
helper function to generate the synthetic setup

*ensemble.py* 
Ensemble class for building, training, and evaluating an ensemble of models.