To run all the code, the following environment is required:

Gurobi (LP solver) education/pro version; Python 3.8; PyTorch 1.5.1; graphviz 0.14.1; gym 0.17.2; numpy 1.18.5; pandas 1.0.5; scipy 1.4.1; matplotlib 3.3.0; tensorflow 2.3.0.
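
The pinned versions above can be checked before running anything. The snippet below is a minimal sketch, not part of the released code: the function name check_environment is hypothetical, and the pip distribution names (for example "torch" for PyTorch) are assumptions. Gurobi is left out because its version and license are not pinned above.

```python
from importlib import metadata  # standard library since Python 3.8

# Versions copied from the list above. The distribution names on the left
# are assumptions about how each package is published on PyPI.
REQUIRED = {
    "torch": "1.5.1",
    "graphviz": "0.14.1",
    "gym": "0.17.2",
    "numpy": "1.18.5",
    "pandas": "1.0.5",
    "scipy": "1.4.1",
    "matplotlib": "3.3.0",
    "tensorflow": "2.3.0",
}


def check_environment(required=REQUIRED):
    """Return {package: (expected, installed)} for every package whose
    installed version differs from the expected one. `installed` is None
    when the package is not installed at all."""
    mismatches = {}
    for name, expected in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches
```

Calling check_environment() returns an empty dictionary when every listed package matches its pinned version.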

Solveoccupancymeasurebinary and Solveoccupancymeasuremulti are the solvers for the occupancy measures in the main paper, for the binary-action and multi-action settings respectively.

Calculatebinaryactionregret calculates the regret of the different algorithms in the binary-action setting.
To obtain the multi-action regret, change n_action in Calculatebinaryactionregret to 3, 5, or 10.

The two case studies are named wirelessschdulingcasestudy and healthcasestudy.

The optimality gap is calculated as the regret divided by the number of arms.
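
As an illustration of that normalization, here is a minimal sketch. The function name optimality_gap and its arguments are hypothetical and do not appear in the provided scripts. It assumes the regret is a single number, or an array of numbers, that has already been computed.

```python
import numpy as np


def optimality_gap(regret, n_arms):
    """Normalize regret by the number of arms.

    `regret` may be a scalar or an array (e.g. one value per run);
    `n_arms` must be a positive integer.
    """
    if n_arms <= 0:
        raise ValueError("n_arms must be positive")
    return np.asarray(regret, dtype=float) / n_arms
```

For example, a regret of 50 with 100 arms gives an optimality gap of 0.5.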

A more detailed, better-organized GitHub repository will be provided once the anonymous review stage is over.