# Cross-Domain Off-Policy Evaluation and Learning for Contextual Bandits



## Dependencies
This repository supports Python 3.10.
- pandas==2.1.4
- numpy==1.26.3
- matplotlib==3.8.2
- scikit-learn==1.4.0
- scipy==1.11.4
- seaborn==0.13.1
- jupyterlab==4.0.10
- densratio==0.3.0
- torch==2.3.1

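The pinned versions above can be checked against the active environment before running any notebook. The following is a minimal sketch, not part of the repository: the helper name `check_pins` and the `PINNED` mapping are hypothetical, and only the standard-library `importlib.metadata` module is used.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the dependency list in this README.
PINNED = {
    "pandas": "2.1.4",
    "numpy": "1.26.3",
    "matplotlib": "3.8.2",
    "scikit-learn": "1.4.0",
    "scipy": "1.11.4",
    "seaborn": "0.13.1",
    "jupyterlab": "4.0.10",
    "densratio": "0.3.0",
    "torch": "2.3.1",
}


def check_pins(pins, get_version=version):
    """Return {package: (expected, found)} for each mismatch.

    `found` is None when the package is not installed.
    `check_pins` is a hypothetical helper, not a function of this repository.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every installed version matches its pin.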
## How to launch JupyterLab
- Run `docker compose up` (runs in the foreground).
- Open http://localhost:8888/ in a browser.

## Dataset
We used the [KuaiRec](https://kuairec.com/) dataset in our experiments. To run the experiments, download `small_matrix.csv` and `user_features.csv` into the `~/data` directory.

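Before opening a notebook, it may help to confirm that both files are in place. This is a minimal sketch under one assumption: `~/data` is read literally as a `data` folder in the home directory, which may differ from how the notebooks or the Docker setup resolve the path. The helper name `missing_dataset_files` is hypothetical.

```python
from pathlib import Path

# File names taken from the Dataset section of this README.
REQUIRED_FILES = ("small_matrix.csv", "user_features.csv")


def missing_dataset_files(data_dir):
    """Return the required KuaiRec files that are absent from data_dir.

    Hypothetical helper; it only checks that the files exist and does
    not validate their contents.
    """
    data_dir = Path(data_dir).expanduser()
    return [name for name in REQUIRED_FILES if not (data_dir / name).is_file()]


if __name__ == "__main__":
    # Assumption: the data directory is ~/data, as written in this README.
    missing = missing_dataset_files("~/data")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All dataset files found.")
```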

## Running the code
To reproduce a figure or table, run the corresponding notebook.

| Figure / Table | Description | Notebook |
| :--- | :--- | :--- |
| Figure 2 | (Real-World-Data) OPE with varying ratios of new actions in the target domain. | `./notebooks/real-world data/OPE/OPE_real-world-data_new_action_experiment.ipynb` |
| Figure 3 | (Real-World-Data) OPE with varying numbers of users whose logging is deterministic in the target domain. | `./notebooks/real-world data/OPE/OPE_real-world-data_deterministic_experiment.ipynb` |
| Figure 4 | (Real-World-Data) OPE with varying logged data sizes in the target domain. | `./notebooks/real-world data/OPE/OPE_real-world-data_sample_size_experiment.ipynb` |
| Figure 5 | (Real-World-Data) OPL methods under varying numbers of new actions. | `./notebooks/real-world data/OPL/OPL_real-world-data_new_action_experiment.ipynb` |
| Figure 6 (left) | (Real-World-Data) OPL with varying numbers of users whose logging is deterministic in the target domain. | `./notebooks/real-world data/OPL/OPL_real-world-data_deterministic_experiment.ipynb` |
| Figure 6 (center) | (Real-World-Data) OPL with varying training data sizes in the target domain. | `./notebooks/real-world data/OPL/OPL_real-world-data_sample_size_experiment.ipynb` |
| Figure 6 (right) | (Real-World-Data) OPL with varying size of the target cluster. | `./notebooks/real-world data/OPL/OPL_real-world-data_clustersize_experiment.ipynb` |
| Figure 7 | (Real-World-Data) OPE with varying size of the target cluster. | `./notebooks/real-world data/OPL/OPE_real-world-data_clustersize_experiment.ipynb` |
| Figure 8 | (Synthetic-Data) OPE with varying ratios of new actions in the target domain. | `./notebooks/syns data/OPE_syns-data_newaction_experiment.ipynb` |
| Figure 9 | (Synthetic-Data) OPE with varying percentiles of samples with deterministic logging. | `./notebooks/syns data/OPE_syns-data_deterministic_experiment.ipynb` |
| Figure 10 | (Synthetic-Data) OPE with varying logged data size in the target domain. | `./notebooks/syns data/OPE_syns-data_samplesize_experiment.ipynb` |
| Table 1 | (Ablation) MSE in OPE for varying ratios of new actions in the target domain. | `./notebooks/real-world data/OPE/Ablation-OPE_real-world-data_context_distribution_experiment-newaction.ipynb` |
| Table 2 | (Ablation) MSE in OPE for varying numbers of users with deterministic logging in the target domain. | `./notebooks/real-world data/OPE/Ablation-OPE_real-world-data_context_distribution_experiment-deterministic.ipynb` |
| Table 3 | (Ablation) Test (Relative) Policy Value in OPL for varying ratios of new actions in the target domain. | `./notebooks/real-world data/OPL/Ablation-OPL_real-world-data_context_distribution_experiment-newaction.ipynb` |
| Table 4 | (Ablation) Test (Relative) Policy Value in OPL for varying numbers of users with deterministic logging in the target domain. | `./notebooks/real-world data/OPL/Ablation-OPL_real-world-data_context_distribution_experiment-deterministic.ipynb` |
| Table 5 | (Ablation) Test (Relative) Policy Value in OPL for varying λ values. | `./notebooks/real-world data/OPL/Ablation-OPL_real-world-data_context_distribution_experiment-lambda.ipynb` |

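The notebooks are intended to be opened in JupyterLab, but they can probably also be executed without the UI. The sketch below assumes `jupyter nbconvert` is available in the environment; the helper name `nbconvert_command` and the default output name are hypothetical. It builds the command as an argument list, so the spaces in directory names such as `real-world data` need no shell quoting.

```python
def nbconvert_command(notebook, output_name="executed.ipynb"):
    """Build an argument list that executes a notebook with nbconvert.

    Hypothetical helper, not part of this repository. The notebook path
    stays a single list element, so spaces in it are handled safely.
    """
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",       # write the result as a notebook
        "--execute",              # run all cells before writing
        "--output", output_name,  # name of the executed copy
        str(notebook),
    ]


if __name__ == "__main__":
    # Path taken from the Figure 2 row of the table in this README.
    cmd = nbconvert_command(
        "./notebooks/real-world data/OPE/"
        "OPE_real-world-data_new_action_experiment.ipynb"
    )
    print(cmd)
```

To run the notebook, pass the list to `subprocess.run(cmd, check=True)` from the repository root.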