This is a Python program that implements Algorithm 1 in our paper.

Dependency Libraries: pandas, numpy, scipy, cupy

Hardware Requirements: We use a machine with 500 GB of RAM and an NVIDIA A100 GPU with 80 GB of memory.
To run the MovieLens 20M example below, make sure your GPU has enough memory to perform the eigenvalue decomposition of a 30000 x 30000 matrix in double precision.
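
As a rough guide to the memory requirement above, the sketch below estimates the size of one dense 30000 x 30000 matrix in double precision. The factor of 3 (input matrix, eigenvector matrix, and solver workspace) is our assumption for illustration, not a measured figure; actual usage depends on the solver.

```python
def dense_matrix_gib(n: int, bytes_per_element: int = 8) -> float:
    """Memory in GiB needed to hold one dense n x n matrix."""
    return n * n * bytes_per_element / 2**30


n = 30000
one_matrix = dense_matrix_gib(n)  # float64 uses 8 bytes per element

# Assumption: the decomposition keeps about 3 matrices of this size alive
# (input, eigenvectors, workspace). Treat this as a ballpark only.
rough_total = 3 * one_matrix

print(f"one {n} x {n} float64 matrix: {one_matrix:.2f} GiB")
print(f"rough working set (assumed 3 copies): {rough_total:.2f} GiB")
```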

Examples:

Here we use the MovieLens 20M dataset as an example to show how to run our program.

1. Download the MovieLens 20M dataset from https://files.grouplens.org/datasets/movielens/ml-20m.zip, place ml-20m.zip in ./data/, and extract it there.

2. Go to ./data/ML20M1/ and run data\_processing.py. The program converts ./data/ml-20m/ratings.csv into a binary matrix and stores it in ratings\_processed.csv, which corresponds to the matrix R in our paper.
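
For readers who want to see what such a conversion can look like, here is a minimal sketch on a toy table. It is not the code in data\_processing.py: the function name `binarize_ratings`, the `threshold` parameter, and the user-by-item orientation are assumptions for illustration. The column names `userId`, `movieId`, and `rating` are those of the MovieLens ratings.csv file. A dense array is used only because the toy input is tiny; the full dataset would need a sparse or on-disk representation.

```python
import numpy as np
import pandas as pd


def binarize_ratings(ratings: pd.DataFrame, threshold: float = 0.0) -> np.ndarray:
    """Build a dense 0/1 user-by-item matrix from (userId, movieId, rating) rows.

    Hypothetical rule: an entry is 1 if the user rated the item above
    `threshold`, and 0 otherwise. The real script may use a different rule.
    """
    kept = ratings[ratings["rating"] > threshold]
    # Map raw ids to contiguous row/column indices (sorted by id).
    user_idx, users = pd.factorize(kept["userId"], sort=True)
    item_idx, items = pd.factorize(kept["movieId"], sort=True)
    R = np.zeros((len(users), len(items)), dtype=np.int8)
    R[user_idx, item_idx] = 1
    return R


# Toy input standing in for ./data/ml-20m/ratings.csv
toy = pd.DataFrame(
    {
        "userId": [1, 1, 2, 3],
        "movieId": [10, 20, 10, 30],
        "rating": [4.0, 3.0, 5.0, 0.5],
    }
)
R_toy = binarize_ratings(toy)
print(R_toy)
```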

3. Run data\_split2.py. It splits ratings\_processed.csv into ratings\_processed\_train.csv (R_{train} in our paper), ratings\_processed\_test1.csv (X), and ratings\_processed\_test1.csv (Y).

4. Set line 8 of matrix1.py to the absolute path of accelib1.py on your computer, then run matrix1.py. It precomputes the matrices Sxx (Sigma_{xx}), Syy (Sigma_{yy}), XX (XX^T), YX (XY^T), Q\_h (Sigma_{xx}^{1/2}), B (Sigma_{xy}^T Sigma_{xx}^{-1/2}) and B1 (Sigma_{xy}^T Sigma_{xx}^{-1}), and stores them in the folder ./data/ML20M1/matrix1/. These matrices are later used by bound2\_acc6.py.

* accelib1.py is a library we wrote to run linear algebra algorithms on large matrices on the GPU. The basic idea is to split a large matrix into blocks that each fit within the GPU memory limit and to process them on the GPU sequentially.
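
The block-splitting idea can be sketched as a tiled matrix product. This is not the API of accelib1.py: the function `blocked_matmul` and its `block` and `xp` parameters are hypothetical. The sketch runs on the CPU with NumPy by default; passing the `cupy` module as `xp` should move each tile to the GPU and back, since `cupy.asarray` and `cupy.asnumpy` perform those transfers.

```python
import numpy as np


def blocked_matmul(A, B, block=1024, xp=np):
    """Compute A @ B tile by tile so that only small tiles are on the device.

    A and B stay in host memory. `xp` is the array module used for the
    per-tile arithmetic (numpy here; cupy is an assumed drop-in).
    """
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.empty((m, n), dtype=np.result_type(A, B))
    for i in range(0, m, block):
        for j in range(0, n, block):
            rows = min(block, m - i)
            cols = min(block, n - j)
            acc = xp.zeros((rows, cols), dtype=C.dtype)
            for p in range(0, k, block):
                # Transfer one tile of each operand, multiply, accumulate.
                a = xp.asarray(A[i:i + block, p:p + block])
                b = xp.asarray(B[p:p + block, j:j + block])
                acc += a @ b
            # Copy the finished tile back to host memory.
            C[i:i + block, j:j + block] = acc if xp is np else xp.asnumpy(acc)
    return C


rng = np.random.default_rng(0)
A_demo = rng.standard_normal((50, 70))
B_demo = rng.standard_normal((70, 30))
C_demo = blocked_matmul(A_demo, B_demo, block=16)
print(np.allclose(C_demo, A_demo @ B_demo))
```

The block size trades transfer overhead against peak device memory: each step holds three tiles of at most `block` x `block` elements.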

5. Go to ./data/ and run ease\_acc.py. It trains the LAE model with EASE on the training set ratings\_processed\_train.csv, and stores the model in ./data/model2/. The regularization parameter is set to 50 by default.
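
For reference, the closed-form EASE solution can be sketched on a small dense matrix as follows. This is a plain NumPy illustration, not the GPU code in ease\_acc.py; the function name `ease_weights` and the toy input are assumptions, and the default `reg=50.0` mirrors the default mentioned above.

```python
import numpy as np


def ease_weights(X: np.ndarray, reg: float = 50.0) -> np.ndarray:
    """Closed-form EASE item-item weights for a user-by-item matrix X.

    With P = (X^T X + reg * I)^{-1}, the weights are W_ij = -P_ij / P_jj
    for i != j, and the diagonal of W is constrained to zero.
    """
    X = np.asarray(X, dtype=np.float64)
    G = X.T @ X
    G[np.diag_indices_from(G)] += reg
    P = np.linalg.inv(G)
    W = P / (-np.diag(P))  # broadcasting divides column j by -P_jj
    W[np.diag_indices_from(W)] = 0.0
    return W


# Toy binary interaction matrix: 4 users, 3 items
X_toy = np.array([[1, 1, 0], [1, 0, 0], [0, 1, 1], [1, 1, 1]])
W_toy = ease_weights(X_toy, reg=50.0)
scores_toy = X_toy @ W_toy  # predicted scores, one row per user
print(W_toy.shape, scores_toy.shape)
```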

6. Run bound2\_acc6\_3.py. This is the main program to compute the bound. 

After finishing the above steps, you should obtain the ML20M result in Table 1 of our paper. (We set the initial value of lambda to 512 in bound2\_acc6\_3.py, corresponding to the setting in Table 3 of our paper, so the result appears in the first epoch.)

7. The Recall@50 and NDCG@100 of the corresponding model can be computed by running ./rank/test1.py
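
The two metrics can be sketched for a single user as below. This follows one common convention for binary relevance (Recall@k normalized by min(k, number of relevant items), NDCG@k with a log2 discount); the code in ./rank/test1.py may define them differently. The function names and the toy inputs are assumptions, and each user is assumed to have at least one relevant item.

```python
import numpy as np


def recall_at_k(scores: np.ndarray, relevant: np.ndarray, k: int) -> float:
    """Fraction of relevant items found in the top-k, capped at k."""
    top = np.argsort(-scores)[:k]
    hits = relevant[top].sum()
    return float(hits / min(k, relevant.sum()))


def ndcg_at_k(scores: np.ndarray, relevant: np.ndarray, k: int) -> float:
    """Binary-relevance NDCG@k with a 1 / log2(rank + 1) discount."""
    top = np.argsort(-scores)[:k]
    discounts = 1.0 / np.log2(np.arange(2, k + 2))
    dcg = (relevant[top] * discounts[: len(top)]).sum()
    # Ideal ranking places all relevant items first.
    idcg = discounts[: min(k, int(relevant.sum()))].sum()
    return float(dcg / idcg)


# Toy example: 4 items, items 0 and 3 are relevant for this user
scores_demo = np.array([0.9, 0.1, 0.8, 0.3])
relevant_demo = np.array([1, 0, 0, 1])
recall_demo = recall_at_k(scores_demo, relevant_demo, k=2)
ndcg_demo = ndcg_at_k(scores_demo, relevant_demo, k=2)
print(recall_demo, ndcg_demo)
```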

Running time:

Our code is optimized for efficient computation on large datasets and models. The running times below were measured on our platform.

* ML20M: 4 minutes for preprocessing, then 2 minutes for each lambda
* Netflix: 4 minutes for preprocessing, then 1.5 minutes for each lambda
* MSD: 15 minutes for preprocessing, then 5 minutes for each lambda
