## DIPPER: Direct Preference Optimization to Accelerate Primitive-Enabled Hierarchical Reinforcement Learning
### Code for DIPPER

This is a TensorFlow-based implementation of our approach, DIPPER: Direct Preference Optimization to Accelerate Primitive-Enabled Hierarchical Reinforcement Learning.

### 1) Running demo
To run the demo, use the following commands:
  ```Shell
  # For Maze navigation environment
  python experiment/play.py --dir=maze_dipper_0 --render=1 --rollouts=10

  # For Pick and place environment
  python experiment/play.py --dir=pick_dipper_0 --render=1 --rollouts=10

  # For push environment
  python experiment/play.py --dir=push_dipper_0 --render=1 --rollouts=10

  # For Franka kitchen environment
  python experiment/play.py --dir=kitchen_dipper_0 --render=1 --rollouts=10
  ```

### 2) Training code
To train, use the following commands. To run the baselines, adjust the flags accordingly:
  ```Shell
  # For Maze navigation environment
  python experiment/train.py --env="FetchMazeReach-v1" --logdir="maze_dipper_0" --n_epochs=6000 --reward_batch_size=50 --seed=0 --bc_loss=0 --num_hrl_layers=2 --dipper=1 --q_reg=1

  # For Pick and place environment
  python experiment/train.py --env="FetchPickAndPlace-v1" --logdir="pick_dipper_0" --n_epochs=18000 --reward_batch_size=50 --seed=0 --bc_loss=1 --num_hrl_layers=2 --dipper=1 --q_reg=1

  # For push manipulation environment
  python experiment/train.py --env="FetchPickAndPlace-v1" --logdir="push_dipper_0" --n_epochs=15500 --reward_batch_size=50 --seed=0 --bc_loss=1 --num_hrl_layers=2 --dipper=1 --q_reg=1

  # For Franka kitchen environment
  python experiment/train.py --env="kitchen-complete-v0" --logdir="kitchen_dipper_0" --n_epochs=2800 --reward_batch_size=50 --seed=0 --bc_loss=1 --num_hrl_layers=2 --dipper=1 --q_reg=1
  ```
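
To repeat a run over several seeds, the Maze command above can be wrapped in a loop. This is a minimal sketch, assuming the log directory suffix mirrors the seed (as in `maze_dipper_0` with `--seed=0`). The seed list, the `train_cmd` helper, and the `DRY_RUN` switch are hypothetical and not part of the repository.

```Shell
# Hypothetical multi-seed sweep for the Maze navigation environment.
# SEEDS and DRY_RUN are illustrative assumptions, not repository options.
SEEDS="${SEEDS:-0 1 2}"
DRY_RUN="${DRY_RUN:-1}"

# Print the training command for one seed, reusing the flags shown above.
train_cmd() {
  echo "python experiment/train.py --env=FetchMazeReach-v1 --logdir=maze_dipper_$1 --n_epochs=6000 --reward_batch_size=50 --seed=$1 --bc_loss=0 --num_hrl_layers=2 --dipper=1 --q_reg=1"
}

for seed in $SEEDS; do
  if [ "$DRY_RUN" = "1" ]; then
    train_cmd "$seed"            # only print the command
  else
    eval "$(train_cmd "$seed")"  # actually launch training
  fi
done
```

By default the loop only prints the commands. Setting `DRY_RUN=0` launches the runs one after another, and the other environments can be handled the same way by swapping in their flags.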

### 3) Plot progress
To plot the success rates, use the following commands:
  ```Shell
  # For Maze navigation environment
  python experiment/plot.py --dir1=maze_dipper_0:dipper --plot_name="maze"

  # For Pick and place environment
  python experiment/plot.py --dir1=pick_dipper_0:dipper --plot_name="pick"

  # For push environment
  python experiment/plot.py --dir1=push_dipper_0:dipper --plot_name="push"

  # For Franka kitchen environment
  python experiment/plot.py --dir1=kitchen_dipper_0:dipper --plot_name="kitchen"
  ```
