A note on the contents of this folder.

**Query sets**
The queries, along with their ground truth and prompts (now including the control queries), can be found in:
- query_set_supplemental_info_ICLR_7B.csv
- query_set_supplemental_info_ICLR_35B.csv

**Pretraining documents**
We release pretraining documents containing the answers to factual and reasoning queries, as well as example documents containing procedures for calculating slopes, in the following CSV file:
- answer_and_slope_pretraining_data.csv

Where relevant, each row of this file states which query the document contains an answer to. Otherwise the row says "slopes questions", which means the document does not answer any query but contains procedural knowledge for calculating the slope between two points (as referred to in Finding 4 of the paper).

We are working on releasing the top and bottom 20 documents for each query. This requires internal approval and will take a while longer.

**Correlation of internal EK-FAC implementation with public implementation**
Related section in paper: Appendix A.2.1
The results of the experiments comparing influence scores calculated internally with a public implementation can be found in:

- public_implementation_correlation_experiment_correlations.txt

**Correlation of the approximations made for EK-FAC estimation with the full implementation**
Related section in paper: Appendix A.2.2
The results of experiments justifying the approximations we make can be found in the files:

- approximation_correlation_experiment_*.txt

**Query correlations**
As mentioned in the paper, we release all query correlations here, in the file:

- all_query_correlations.json

For each pair of queries, we calculated 9 different types of correlation, of which we only used the "all scores correlation" in the paper.
The file has the following structure:

    {models: <query_idx_one>_<query_idx_two>: [
        all scores correlation,
        [top 50k of query one scores correlation, top 50k of query two scores correlation],
        [bottom 50k of query one scores correlation, bottom 50k of query two scores correlation],
        [positive scores of query one scores correlation, positive scores of query two scores correlation],
        [negative scores of query one scores correlation, negative scores of query two scores correlation]
    ]}
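
To make the per-pair layout easier to work with, here is a minimal sketch that attaches names to the nine values of one entry. It assumes an entry is a five-element list (one value followed by four two-element lists) and that query indices contain no underscores. The helper names `split_pair_key` and `label_entry` and the field names are hypothetical and are not part of the released files. The sketch does not assume anything about the type of each correlation value.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical labels for the four paired correlation types, in file order.
PAIRED_FIELDS = ("top_50k", "bottom_50k", "positive", "negative")


def split_pair_key(key: str) -> Tuple[str, str]:
    """Split a '<query_idx_one>_<query_idx_two>' key into its two indices.

    Assumes the indices themselves contain no underscore.
    """
    parts = key.split("_")
    if len(parts) != 2:
        raise ValueError(f"unexpected pair key: {key!r}")
    return parts[0], parts[1]


def label_entry(entry: List[Any]) -> Dict[str, Any]:
    """Attach names to the nine correlations stored for one query pair."""
    if len(entry) != 5:
        raise ValueError(f"expected 5 elements, got {len(entry)}")
    labelled: Dict[str, Any] = {"all_scores": entry[0]}
    for name, pair in zip(PAIRED_FIELDS, entry[1:]):
        query_one_value, query_two_value = pair
        labelled[name] = {"query_one": query_one_value, "query_two": query_two_value}
    return labelled
```

An entry would typically be obtained by reading the file with `json.load` and indexing by model and then by pair key, if the nesting is as described above.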

For the heatmap of correlations used in the paper, we threw out all correlations in the .json file with a p-value above 0.05, and plotted the mean of the remaining ones. We also provide separate CSV files with these values for convenience:

- all_correlation_*.csv
- all_correlation_pvals_*.csv
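
The filtering step described above can be sketched as follows. This is an illustration only, not the code used for the paper: the function name `mean_significant` is hypothetical, and it assumes the correlations and their p-values are available as two parallel lists for whichever group of query pairs is being averaged.

```python
import math
from typing import Sequence


def mean_significant(
    correlations: Sequence[float],
    pvalues: Sequence[float],
    alpha: float = 0.05,
) -> float:
    """Mean of the correlations whose p-value is not above `alpha`.

    Returns NaN when no correlation passes the threshold.
    """
    if len(correlations) != len(pvalues):
        raise ValueError("correlations and pvalues must have the same length")
    # Discard every correlation with a p-value above the threshold.
    kept = [c for c, p in zip(correlations, pvalues) if p <= alpha]
    if not kept:
        return math.nan
    return sum(kept) / len(kept)
```

For example, given correlations `[0.2, 0.4, 0.9]` with p-values `[0.01, 0.05, 0.2]`, only the first two are kept and averaged.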
