## Code for the paper "Learning the Optimal Policy for Balancing Multiple Short-Term and Long-Term Rewards"

## Run the code

The `data_generateJOBS` and `data_generateIHDP` files contain the data generation process for the JOBS and IHDP datasets, respectively.

The final generated data is stored in the `data` folder.

To reproduce our experimental results, please refer to `main.py`.


## Acknowledgments
This work builds on previous studies on the following topics:
- Identification assumptions and missing-data mechanism assumptions in causal inference.
- Estimation and identification techniques for long and short-term effects.
- Semiparametric theory for effective estimation of long- and short-term rewards.
- Multi-objective optimization techniques based on preference vector decomposition.
- MGDA algorithm for finding the descent direction of a multi-objective problem.
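
For readers unfamiliar with MGDA, the following is a minimal sketch of the two-objective case, where the minimum-norm point in the convex hull of two gradients has a closed form. It is an illustration only and is not taken from `main.py`; the function name `mgda_two_objectives` and the toy gradients are hypothetical.

```python
import numpy as np


def mgda_two_objectives(g1, g2):
    """Illustrative two-objective MGDA step (hypothetical helper, not from this repo).

    Finds alpha in [0, 1] minimizing ||alpha * g1 + (1 - alpha) * g2||^2 and
    returns alpha together with the negated min-norm combination, which is a
    common descent direction for both objectives when it is non-zero.
    """
    g1 = np.asarray(g1, dtype=float)
    g2 = np.asarray(g2, dtype=float)
    diff = g1 - g2
    denom = float(diff @ diff)
    if denom == 0.0:
        # Identical gradients: any convex weight gives the same combination.
        alpha = 0.5
    else:
        # Unconstrained minimizer of the quadratic, clipped to the simplex.
        alpha = float(np.clip(((g2 - g1) @ g2) / denom, 0.0, 1.0))
    direction = -(alpha * g1 + (1.0 - alpha) * g2)
    return alpha, direction


# Toy gradients (assumed values for illustration only).
alpha, d = mgda_two_objectives([1.0, 0.0], [0.0, 1.0])
```

With more than two objectives there is no such closed form, and the weights are usually obtained by solving a small quadratic program over the simplex instead.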
