# Code for Offline Equilibrium Finding in Extensive-form Games: Datasets, Theory and Methods.


# Dataset

This repository covers three types of datasets for every game: a random dataset, an expert dataset, and a learning dataset.
A hybrid dataset can be obtained from the random dataset and the expert dataset.
The datasets themselves are not included here because of file size limits; we will release them once the paper is accepted.
In the meantime, the code used to generate them is provided in the **generate_dataset** folder, so the offline datasets can be regenerated with that code.
Each data entry in a dataset has the form [current_game_state, player_id, legal_actions, action, next_game_state, next_legal_actions, next_player, reward, done, chance_node].

- **current_game_state**: a list containing every player's information state list
- **player_id**: the player who acts at the current game state
- **legal_actions**: the set of actions available at the current game state
- **action**: the selected action
- **next_game_state**: a list containing every player's information state list after executing the action
- **next_legal_actions**: the set of actions available at the next game state
- **next_player**: the player who acts at the next game state
- **reward**: the rewards of every player at the next game state
- **done**: whether the next game state is a terminal state
- **chance_node**: whether the next game state is a chance node
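
As a reading aid, the entry layout above can be sketched as a named tuple. This is a minimal illustration, not the repository's own loader: the class name `DataEntry`, the element types (floats for information states, integers for actions and player ids), and the sample values are all assumptions made for the example.

```python
from typing import List, NamedTuple


class DataEntry(NamedTuple):
    """One dataset entry, with fields in the order listed above."""
    current_game_state: List[List[float]]  # one information state list per player (types assumed)
    player_id: int
    legal_actions: List[int]
    action: int
    next_game_state: List[List[float]]
    next_legal_actions: List[int]
    next_player: int
    reward: List[float]  # one reward per player
    done: bool
    chance_node: bool


# Hypothetical two-player entry with made-up values, for illustration only.
raw_entry = [
    [[1.0, 0.0], [0.0, 1.0]], 0, [0, 1], 1,
    [[1.0, 1.0], [0.0, 1.0]], [0, 1], 1, [0.0, 0.0], False, False,
]
entry = DataEntry(*raw_entry)
print(entry.player_id, entry.action, entry.done)
```

Unpacking a raw list into named fields like this avoids indexing entries by position when inspecting a generated dataset.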

# How to run the code
- Create a Python virtual environment
- Install the required packages listed in `requirements.txt`
- **Behavior Cloning Algorithm**: run `train_bc_policy.py` to obtain the behavior cloning policy, after setting the game and the corresponding dataset in that file

- **Model-based Algorithm**: first run `train_env_model.py` to train the environment model, after setting the game and the corresponding dataset in that file; then run the MB-CFR, MB-PSRO, or MB-JPSRO algorithm to obtain the model-based policy from the trained environment model
- **MB-CFR**: run `run_mb_deep_cfr.py` in the **mb_cfr** folder to run the MB-CFR algorithm on the trained environment model
- **MB-PSRO**: run `run_mb_psro.py` in the **mb_psro** folder to run the MB-PSRO algorithm on the trained environment model
- **MB-JPSRO**: run `run_mb_jpsro.py` in the **mb_jpsro** folder to run the MB-JPSRO algorithm on the trained environment model

- **BOMB Algorithm**: there are three methods for determining the parameter $\alpha$. The first is the random method, which is simple enough that we do not provide it here. The other two methods are in the **bomb** folder. To run them, run `bomb_with_grid_search.py` and `bomb_train_predictor.py` directly.
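
The grid search idea can be sketched roughly as follows. This README does not define the parameter, so the sketch assumes it is a weight in [0, 1] that mixes the behavior cloning policy with the model-based policy, and that a lower score is better. The function names, the single-decision-point policy representation, and the toy scoring function are all hypothetical and are not taken from `bomb_with_grid_search.py`.

```python
from typing import Callable, List, Sequence, Tuple

Policy = List[float]  # action probabilities at a single decision point (simplification)


def mix_policies(bc: Sequence[float], mb: Sequence[float], alpha: float) -> Policy:
    """Convex combination of two action distributions (assumed role of alpha)."""
    return [alpha * p + (1.0 - alpha) * q for p, q in zip(bc, mb)]


def grid_search_alpha(
    bc: Sequence[float],
    mb: Sequence[float],
    score: Callable[[Policy], float],
    num_points: int = 11,
) -> Tuple[float, float]:
    """Try evenly spaced alpha values in [0, 1] and keep the lowest-scoring one."""
    best_alpha, best_score = 0.0, float("inf")
    for i in range(num_points):
        alpha = i / (num_points - 1)
        current = score(mix_policies(bc, mb, alpha))
        if current < best_score:
            best_alpha, best_score = alpha, current
    return best_alpha, best_score


# Toy stand-in for a real evaluation: squared distance to a made-up target policy.
target = [0.5, 0.5]


def toy_score(policy: Policy) -> float:
    return sum((p - t) ** 2 for p, t in zip(policy, target))


best_alpha, best_score = grid_search_alpha([1.0, 0.0], [0.0, 1.0], toy_score)
print(best_alpha, best_score)
```

In practice the scoring function would be replaced by whatever evaluation the scripts in the **bomb** folder use; the toy score here only makes the sketch runnable.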