The attached Coach package contains all the changes we made to implement REPAINT. The main files we changed are:
1. ./agents/agent.py
2. ./agents/clipped_ppo_agent.py
3. ./architectures/architecture.py
4. ./architectures/network_wrapper.py
5. ./architectures/tensorflow_components/architecture.py
6. ./architectures/tensorflow_components/general_network.py
7. ./architectures/tensorflow_components/layers.py
8. ./architectures/tensorflow_components/savers.py
9. ./architectures/tensorflow_components/heads/ppo_head.py
10. ./presets/Mujoco_ClippedPPO.py
11. ./base_parameters.py
12. ./saver.py
13. ./training_worker.py
14. ./rollout_worker.py
15. ./coach.py

For example, when a teacher policy for MuJoCo-Ant is available, one can run the following commands in two separate terminals (one for the rollout worker, one for the trainer) to conduct REPAINT:

$ coach -r -p Mujoco_ClippedPPO -lvl ant -tcd '<path-to-the-teacher-policy>' -dc --memory_backend_params '{"redis_address": "localhost", "redis_port": 6379, "store_type": "redispubsub", "channel": "channel-123"}' -asc -e ant_REPAINT_rollout --distributed_coach_run_type rollout-worker

$ coach -r -p Mujoco_ClippedPPO -lvl ant -tcd '<path-to-the-teacher-policy>' -dc --memory_backend_params '{"redis_address": "localhost", "redis_port": 6379, "store_type": "redispubsub", "channel": "channel-123"}' -asc -e ant_REPAINT_trainer --distributed_coach_run_type trainer
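The two commands above differ only in the experiment name (-e) and in --distributed_coach_run_type. As a convenience, the following minimal Python sketch builds both argument lists from one shared configuration, so the Redis settings and the teacher policy path cannot drift apart between the two terminals. It is not part of the attached package: the helper name build_repaint_command and its parameter names are hypothetical, and the flags are copied verbatim from the commands above without any claim about their meaning beyond what those commands show.

```python
import json
import shlex

# Redis pub/sub settings copied verbatim from the commands above.
MEMORY_BACKEND_PARAMS = {
    "redis_address": "localhost",
    "redis_port": 6379,
    "store_type": "redispubsub",
    "channel": "channel-123",
}


def build_repaint_command(teacher_policy_path, run_type, experiment_name):
    """Return the coach invocation as an argv list (hypothetical helper).

    run_type is "rollout-worker" or "trainer", as in the commands above.
    """
    return [
        "coach", "-r",
        "-p", "Mujoco_ClippedPPO",
        "-lvl", "ant",
        "-tcd", teacher_policy_path,
        "-dc",
        "--memory_backend_params", json.dumps(MEMORY_BACKEND_PARAMS),
        "-asc",
        "-e", experiment_name,
        "--distributed_coach_run_type", run_type,
    ]


if __name__ == "__main__":
    # Placeholder kept as-is; replace it with the real teacher policy path.
    teacher = "<path-to-the-teacher-policy>"
    runs = [
        ("rollout-worker", "ant_REPAINT_rollout"),
        ("trainer", "ant_REPAINT_trainer"),
    ]
    for run_type, experiment_name in runs:
        # Print a shell-quoted command line to paste into each terminal.
        print(shlex.join(build_repaint_command(teacher, run_type, experiment_name)))
```

The script only prints the two command lines; each one still has to be started in its own terminal, as described above.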
