## Hierarchical Preference Optimization: Learning to achieve goals via feasible subgoals prediction
### Code for HPO

This is a TensorFlow-based implementation of our approach, Hierarchical Preference Optimization: Learning to achieve goals via feasible subgoals prediction.

### 1) Running demo
To run the demo, use the following commands:
  ```Shell
  # For Maze navigation environment
  python experiment/play.py --dir=maze_hpo_0 --render=1 --rollouts=10

  # For Pick and place environment
  python experiment/play.py --dir=pick_hpo_0 --render=1 --rollouts=10

  # For push environment
  python experiment/play.py --dir=push_hpo_0 --render=1 --rollouts=10

  # For Franka kitchen environment
  python experiment/play.py --dir=kitchen_hpo_0 --render=1 --rollouts=10
  ```

### 2) Training code
To train HPO, use the following commands. To train the baselines, change the parameters accordingly:
  ```Shell
  # For Maze navigation environment
  python experiment/train.py --env="FetchMazeReach-v1" --logdir="maze_hpo_0" --n_epochs=6000 --reward_batch_size=50 --seed=0 --bc_loss=0 --num_hrl_layers=2 --hpo=1 --q_reg=1

  # For Pick and place environment
  python experiment/train.py --env="FetchPickAndPlace-v1" --logdir="pick_hpo_0" --n_epochs=18000 --reward_batch_size=50 --seed=0 --bc_loss=1 --num_hrl_layers=2 --hpo=1 --q_reg=1

  # For push manipulation environment
  python experiment/train.py --env="FetchPickAndPlace-v1" --logdir="push_hpo_0" --n_epochs=15500 --reward_batch_size=50 --seed=0 --bc_loss=1 --num_hrl_layers=2 --hpo=1 --q_reg=1

  # For Franka kitchen environment
  python experiment/train.py --env="kitchen-complete-v0" --logdir="kitchen_hpo_0" --n_epochs=2800 --reward_batch_size=50 --seed=0 --bc_loss=1 --num_hrl_layers=2 --hpo=1 --q_reg=1
  ```
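
To repeat a run over several seeds, the maze command above can be wrapped in a small helper. This is a minimal sketch, not part of the repository: the `train_cmd` function is hypothetical, and it assumes that the trailing number in `--logdir` is meant to match `--seed` (as in `maze_hpo_0` with `--seed=0`). It only prints the commands, so nothing is launched until you choose to run them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the maze training command for one seed.
# Assumption: the log directory suffix mirrors the seed (maze_hpo_<seed>).
train_cmd() {
  local seed="$1"
  echo "python experiment/train.py --env=\"FetchMazeReach-v1\" --logdir=\"maze_hpo_${seed}\" --n_epochs=6000 --reward_batch_size=50 --seed=${seed} --bc_loss=0 --num_hrl_layers=2 --hpo=1 --q_reg=1"
}

# Dry run: print one command per seed without executing anything.
for seed in 0 1 2; do
  train_cmd "${seed}"
done
```

To actually launch the runs one after another, pipe the printed commands to `bash`. The other environments can be handled the same way by swapping in their flags from the commands above.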

### 3) Plot progress
To plot the success rates, use the following commands:
  ```Shell
  # For Maze navigation environment
  python experiment/plot.py --dir1=maze_hpo_0:hpo --plot_name="maze"

  # For Pick and place environment
  python experiment/plot.py --dir1=pick_hpo_0:hpo --plot_name="pick"

  # For push environment
  python experiment/plot.py --dir1=push_hpo_0:hpo --plot_name="push"

  # For Franka kitchen environment
  python experiment/plot.py --dir1=kitchen_hpo_0:hpo --plot_name="kitchen"
  ```
