wandb: Currently logged in as: 804703098. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.16.5
wandb: Run data is saved locally in /home/user/zhangyang/PycharmProjects/Nips2024-ITPC-v2/Nips2024-ITPC-v2/onpolicy/scripts/results/MPE/simple_tag_tr/rmappotrsyn/exp_train_continue_tag_base_kl_s2r2_v1/wandb/run-20240402_144955-mulbjpbw
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run MPE_1
wandb: ⭐️ View project at https://wandb.ai/804703098/Continue_Tag_Base_v1
wandb: 🚀 View run at https://wandb.ai/804703098/Continue_Tag_Base_v1/runs/mulbjpbw/workspace
choose to use gpu...
idv policy and team policy use same initial params!

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 0/10000 episodes, total num timesteps 200/2000000, FPS 143.

team_policy eval average step individual rewards of agent0: 0.08872775963695964
team_policy eval average team episode rewards of agent0: 0.0
team_policy eval idv catch total num of agent0: 6
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent1: 0.026611773474305035
team_policy eval average team episode rewards of agent1: 0.0
team_policy eval idv catch total num of agent1: 5
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent2: 0.013930608891753593
team_policy eval average team episode rewards of agent2: 0.0
team_policy eval idv catch total num of agent2: 4
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent3: -0.006822761986278989
team_policy eval average team episode rewards of agent3: 0.0
team_policy eval idv catch total num of agent3: 3
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent4: -0.025136502182662238
team_policy eval average team episode rewards of agent4: 0.0
team_policy eval idv catch total num of agent4: 2
team_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent0: -0.11332629381743445
idv_policy eval average team episode rewards of agent0: 0.0
idv_policy eval idv catch total num of agent0: 0
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent1: -0.081715955864065
idv_policy eval average team episode rewards of agent1: 0.0
idv_policy eval idv catch total num of agent1: 0
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent2: 0.018157102651989002
idv_policy eval average team episode rewards of agent2: 0.0
idv_policy eval idv catch total num of agent2: 5
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent3: -0.05633408994170927
idv_policy eval average team episode rewards of agent3: 0.0
idv_policy eval idv catch total num of agent3: 1
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent4: -0.048461036855224994
idv_policy eval average team episode rewards of agent4: 0.0
idv_policy eval idv catch total num of agent4: 1
idv_policy eval team catch total num: 0

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1/10000 episodes, total num timesteps 400/2000000, FPS 145.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 2/10000 episodes, total num timesteps 600/2000000, FPS 151.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 3/10000 episodes, total num timesteps 800/2000000, FPS 158.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 4/10000 episodes, total num timesteps 1000/2000000, FPS 161.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 5/10000 episodes, total num timesteps 1200/2000000, FPS 164.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 6/10000 episodes, total num timesteps 1400/2000000, FPS 166.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 7/10000 episodes, total num timesteps 1600/2000000, FPS 168.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 8/10000 episodes, total num timesteps 1800/2000000, FPS 170.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 9/10000 episodes, total num timesteps 2000/2000000, FPS 170.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 10/10000 episodes, total num timesteps 2200/2000000, FPS 170.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 11/10000 episodes, total num timesteps 2400/2000000, FPS 172.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 12/10000 episodes, total num timesteps 2600/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 13/10000 episodes, total num timesteps 2800/2000000, FPS 173.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 14/10000 episodes, total num timesteps 3000/2000000, FPS 173.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 15/10000 episodes, total num timesteps 3200/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 16/10000 episodes, total num timesteps 3400/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 17/10000 episodes, total num timesteps 3600/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 18/10000 episodes, total num timesteps 3800/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 19/10000 episodes, total num timesteps 4000/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 20/10000 episodes, total num timesteps 4200/2000000, FPS 173.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 21/10000 episodes, total num timesteps 4400/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 22/10000 episodes, total num timesteps 4600/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 23/10000 episodes, total num timesteps 4800/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 24/10000 episodes, total num timesteps 5000/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 25/10000 episodes, total num timesteps 5200/2000000, FPS 175.

team_policy eval average step individual rewards of agent0: -0.06758952750384795
team_policy eval average team episode rewards of agent0: 2.5
team_policy eval idv catch total num of agent0: 2
team_policy eval team catch total num: 1
team_policy eval average step individual rewards of agent1: 0.06388093598339498
team_policy eval average team episode rewards of agent1: 2.5
team_policy eval idv catch total num of agent1: 6
team_policy eval team catch total num: 1
team_policy eval average step individual rewards of agent2: -0.07742835598431715
team_policy eval average team episode rewards of agent2: 2.5
team_policy eval idv catch total num of agent2: 1
team_policy eval team catch total num: 1
team_policy eval average step individual rewards of agent3: -0.07578476120435468
team_policy eval average team episode rewards of agent3: 2.5
team_policy eval idv catch total num of agent3: 1
team_policy eval team catch total num: 1
team_policy eval average step individual rewards of agent4: 0.00587607524466379
team_policy eval average team episode rewards of agent4: 2.5
team_policy eval idv catch total num of agent4: 4
team_policy eval team catch total num: 1
idv_policy eval average step individual rewards of agent0: -0.1446825815381882
idv_policy eval average team episode rewards of agent0: 0.0
idv_policy eval idv catch total num of agent0: 0
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent1: -0.03700666153927161
idv_policy eval average team episode rewards of agent1: 0.0
idv_policy eval idv catch total num of agent1: 4
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent2: -0.13758530350945178
idv_policy eval average team episode rewards of agent2: 0.0
idv_policy eval idv catch total num of agent2: 0
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent3: -0.0635929879418777
idv_policy eval average team episode rewards of agent3: 0.0
idv_policy eval idv catch total num of agent3: 2
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent4: -0.12093074122578891
idv_policy eval average team episode rewards of agent4: 0.0
idv_policy eval idv catch total num of agent4: 0
idv_policy eval team catch total num: 0

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 26/10000 episodes, total num timesteps 5400/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 27/10000 episodes, total num timesteps 5600/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 28/10000 episodes, total num timesteps 5800/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 29/10000 episodes, total num timesteps 6000/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 30/10000 episodes, total num timesteps 6200/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 31/10000 episodes, total num timesteps 6400/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 32/10000 episodes, total num timesteps 6600/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 33/10000 episodes, total num timesteps 6800/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 34/10000 episodes, total num timesteps 7000/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 35/10000 episodes, total num timesteps 7200/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 36/10000 episodes, total num timesteps 7400/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 37/10000 episodes, total num timesteps 7600/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 38/10000 episodes, total num timesteps 7800/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 39/10000 episodes, total num timesteps 8000/2000000, FPS 176.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 40/10000 episodes, total num timesteps 8200/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 41/10000 episodes, total num timesteps 8400/2000000, FPS 176.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 42/10000 episodes, total num timesteps 8600/2000000, FPS 176.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 43/10000 episodes, total num timesteps 8800/2000000, FPS 176.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 44/10000 episodes, total num timesteps 9000/2000000, FPS 176.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 45/10000 episodes, total num timesteps 9200/2000000, FPS 176.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 46/10000 episodes, total num timesteps 9400/2000000, FPS 176.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 47/10000 episodes, total num timesteps 9600/2000000, FPS 176.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 48/10000 episodes, total num timesteps 9800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 49/10000 episodes, total num timesteps 10000/2000000, FPS 176.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 50/10000 episodes, total num timesteps 10200/2000000, FPS 176.

team_policy eval average step individual rewards of agent0: -0.1081746702522665
team_policy eval average team episode rewards of agent0: 0.0
team_policy eval idv catch total num of agent0: 0
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent1: -0.09103800660286236
team_policy eval average team episode rewards of agent1: 0.0
team_policy eval idv catch total num of agent1: 2
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent2: -0.12721345750529134
team_policy eval average team episode rewards of agent2: 0.0
team_policy eval idv catch total num of agent2: 0
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent3: -0.12661295434899292
team_policy eval average team episode rewards of agent3: 0.0
team_policy eval idv catch total num of agent3: 0
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent4: -0.05747294428099046
team_policy eval average team episode rewards of agent4: 0.0
team_policy eval idv catch total num of agent4: 3
team_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent0: -0.06278201404578919
idv_policy eval average team episode rewards of agent0: 0.0
idv_policy eval idv catch total num of agent0: 3
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent1: -0.09630113014574898
idv_policy eval average team episode rewards of agent1: 0.0
idv_policy eval idv catch total num of agent1: 1
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent2: -0.15874065601385662
idv_policy eval average team episode rewards of agent2: 0.0
idv_policy eval idv catch total num of agent2: 0
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent3: -0.13622941880676034
idv_policy eval average team episode rewards of agent3: 0.0
idv_policy eval idv catch total num of agent3: 0
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent4: -0.12917993207568043
idv_policy eval average team episode rewards of agent4: 0.0
idv_policy eval idv catch total num of agent4: 1
idv_policy eval team catch total num: 0

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 51/10000 episodes, total num timesteps 10400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 52/10000 episodes, total num timesteps 10600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 53/10000 episodes, total num timesteps 10800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 54/10000 episodes, total num timesteps 11000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 55/10000 episodes, total num timesteps 11200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 56/10000 episodes, total num timesteps 11400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 57/10000 episodes, total num timesteps 11600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 58/10000 episodes, total num timesteps 11800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 59/10000 episodes, total num timesteps 12000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 60/10000 episodes, total num timesteps 12200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 61/10000 episodes, total num timesteps 12400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 62/10000 episodes, total num timesteps 12600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 63/10000 episodes, total num timesteps 12800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 64/10000 episodes, total num timesteps 13000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 65/10000 episodes, total num timesteps 13200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 66/10000 episodes, total num timesteps 13400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 67/10000 episodes, total num timesteps 13600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 68/10000 episodes, total num timesteps 13800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 69/10000 episodes, total num timesteps 14000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 70/10000 episodes, total num timesteps 14200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 71/10000 episodes, total num timesteps 14400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 72/10000 episodes, total num timesteps 14600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 73/10000 episodes, total num timesteps 14800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 74/10000 episodes, total num timesteps 15000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 75/10000 episodes, total num timesteps 15200/2000000, FPS 177.

team_policy eval average step individual rewards of agent0: -0.08088476090815205
team_policy eval average team episode rewards of agent0: 2.5
team_policy eval idv catch total num of agent0: 2
team_policy eval team catch total num: 1
team_policy eval average step individual rewards of agent1: -0.09656092198822525
team_policy eval average team episode rewards of agent1: 2.5
team_policy eval idv catch total num of agent1: 1
team_policy eval team catch total num: 1
team_policy eval average step individual rewards of agent2: -0.057941951131147286
team_policy eval average team episode rewards of agent2: 2.5
team_policy eval idv catch total num of agent2: 3
team_policy eval team catch total num: 1
team_policy eval average step individual rewards of agent3: -0.08826487567404417
team_policy eval average team episode rewards of agent3: 2.5
team_policy eval idv catch total num of agent3: 1
team_policy eval team catch total num: 1
team_policy eval average step individual rewards of agent4: -0.005923945823299009
team_policy eval average team episode rewards of agent4: 2.5
team_policy eval idv catch total num of agent4: 4
team_policy eval team catch total num: 1
idv_policy eval average step individual rewards of agent0: -0.04092099023092492
idv_policy eval average team episode rewards of agent0: 0.0
idv_policy eval idv catch total num of agent0: 3
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent1: 0.09223928542243334
idv_policy eval average team episode rewards of agent1: 0.0
idv_policy eval idv catch total num of agent1: 8
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent2: -0.033051234700465185
idv_policy eval average team episode rewards of agent2: 0.0
idv_policy eval idv catch total num of agent2: 3
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent3: -0.06775333812411737
idv_policy eval average team episode rewards of agent3: 0.0
idv_policy eval idv catch total num of agent3: 2
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent4: -0.12383127971171744
idv_policy eval average team episode rewards of agent4: 0.0
idv_policy eval idv catch total num of agent4: 0
idv_policy eval team catch total num: 0

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 76/10000 episodes, total num timesteps 15400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 77/10000 episodes, total num timesteps 15600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 78/10000 episodes, total num timesteps 15800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 79/10000 episodes, total num timesteps 16000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 80/10000 episodes, total num timesteps 16200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 81/10000 episodes, total num timesteps 16400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 82/10000 episodes, total num timesteps 16600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 83/10000 episodes, total num timesteps 16800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 84/10000 episodes, total num timesteps 17000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 85/10000 episodes, total num timesteps 17200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 86/10000 episodes, total num timesteps 17400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 87/10000 episodes, total num timesteps 17600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 88/10000 episodes, total num timesteps 17800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 89/10000 episodes, total num timesteps 18000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 90/10000 episodes, total num timesteps 18200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 91/10000 episodes, total num timesteps 18400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 92/10000 episodes, total num timesteps 18600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 93/10000 episodes, total num timesteps 18800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 94/10000 episodes, total num timesteps 19000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 95/10000 episodes, total num timesteps 19200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 96/10000 episodes, total num timesteps 19400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 97/10000 episodes, total num timesteps 19600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 98/10000 episodes, total num timesteps 19800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 99/10000 episodes, total num timesteps 20000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 100/10000 episodes, total num timesteps 20200/2000000, FPS 177.

team_policy eval average step individual rewards of agent0: 0.006269639489057248
team_policy eval average team episode rewards of agent0: 15.0
team_policy eval idv catch total num of agent0: 3
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent1: -0.015641545568599056
team_policy eval average team episode rewards of agent1: 15.0
team_policy eval idv catch total num of agent1: 2
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent2: 0.1959517591228712
team_policy eval average team episode rewards of agent2: 15.0
team_policy eval idv catch total num of agent2: 10
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent3: 0.09203973034224106
team_policy eval average team episode rewards of agent3: 15.0
team_policy eval idv catch total num of agent3: 6
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent4: 0.18612595994822304
team_policy eval average team episode rewards of agent4: 15.0
team_policy eval idv catch total num of agent4: 10
team_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent0: -0.06074509532715198
idv_policy eval average team episode rewards of agent0: 2.5
idv_policy eval idv catch total num of agent0: 1
idv_policy eval team catch total num: 1
idv_policy eval average step individual rewards of agent1: -0.013599493334957415
idv_policy eval average team episode rewards of agent1: 2.5
idv_policy eval idv catch total num of agent1: 3
idv_policy eval team catch total num: 1
idv_policy eval average step individual rewards of agent2: -0.06794960618886373
idv_policy eval average team episode rewards of agent2: 2.5
idv_policy eval idv catch total num of agent2: 1
idv_policy eval team catch total num: 1
idv_policy eval average step individual rewards of agent3: -0.05930057067294021
idv_policy eval average team episode rewards of agent3: 2.5
idv_policy eval idv catch total num of agent3: 1
idv_policy eval team catch total num: 1
idv_policy eval average step individual rewards of agent4: -0.06544321167831409
idv_policy eval average team episode rewards of agent4: 2.5
idv_policy eval idv catch total num of agent4: 1
idv_policy eval team catch total num: 1

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 101/10000 episodes, total num timesteps 20400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 102/10000 episodes, total num timesteps 20600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 103/10000 episodes, total num timesteps 20800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 104/10000 episodes, total num timesteps 21000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 105/10000 episodes, total num timesteps 21200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 106/10000 episodes, total num timesteps 21400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 107/10000 episodes, total num timesteps 21600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 108/10000 episodes, total num timesteps 21800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 109/10000 episodes, total num timesteps 22000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 110/10000 episodes, total num timesteps 22200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 111/10000 episodes, total num timesteps 22400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 112/10000 episodes, total num timesteps 22600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 113/10000 episodes, total num timesteps 22800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 114/10000 episodes, total num timesteps 23000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 115/10000 episodes, total num timesteps 23200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 116/10000 episodes, total num timesteps 23400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 117/10000 episodes, total num timesteps 23600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 118/10000 episodes, total num timesteps 23800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 119/10000 episodes, total num timesteps 24000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 120/10000 episodes, total num timesteps 24200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 121/10000 episodes, total num timesteps 24400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 122/10000 episodes, total num timesteps 24600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 123/10000 episodes, total num timesteps 24800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 124/10000 episodes, total num timesteps 25000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 125/10000 episodes, total num timesteps 25200/2000000, FPS 177.

team_policy eval average step individual rewards of agent0: 0.024165077792895718
team_policy eval average team episode rewards of agent0: 5.0
team_policy eval idv catch total num of agent0: 4
team_policy eval team catch total num: 2
team_policy eval average step individual rewards of agent1: 0.02610606087003875
team_policy eval average team episode rewards of agent1: 5.0
team_policy eval idv catch total num of agent1: 4
team_policy eval team catch total num: 2
team_policy eval average step individual rewards of agent2: -0.048824500904949816
team_policy eval average team episode rewards of agent2: 5.0
team_policy eval idv catch total num of agent2: 1
team_policy eval team catch total num: 2
team_policy eval average step individual rewards of agent3: -0.0774107678150029
team_policy eval average team episode rewards of agent3: 5.0
team_policy eval idv catch total num of agent3: 0
team_policy eval team catch total num: 2
team_policy eval average step individual rewards of agent4: -0.027408882728931036
team_policy eval average team episode rewards of agent4: 5.0
team_policy eval idv catch total num of agent4: 2
team_policy eval team catch total num: 2
idv_policy eval average step individual rewards of agent0: -0.027894542719904884
idv_policy eval average team episode rewards of agent0: 17.5
idv_policy eval idv catch total num of agent0: 2
idv_policy eval team catch total num: 7
idv_policy eval average step individual rewards of agent1: 0.07001907868046463
idv_policy eval average team episode rewards of agent1: 17.5
idv_policy eval idv catch total num of agent1: 6
idv_policy eval team catch total num: 7
idv_policy eval average step individual rewards of agent2: 0.17336857678107065
idv_policy eval average team episode rewards of agent2: 17.5
idv_policy eval idv catch total num of agent2: 10
idv_policy eval team catch total num: 7
idv_policy eval average step individual rewards of agent3: 0.07715393017628876
idv_policy eval average team episode rewards of agent3: 17.5
idv_policy eval idv catch total num of agent3: 6
idv_policy eval team catch total num: 7
idv_policy eval average step individual rewards of agent4: 0.02622309272130481
idv_policy eval average team episode rewards of agent4: 17.5
idv_policy eval idv catch total num of agent4: 4
idv_policy eval team catch total num: 7

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 126/10000 episodes, total num timesteps 25400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 127/10000 episodes, total num timesteps 25600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 128/10000 episodes, total num timesteps 25800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 129/10000 episodes, total num timesteps 26000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 130/10000 episodes, total num timesteps 26200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 131/10000 episodes, total num timesteps 26400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 132/10000 episodes, total num timesteps 26600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 133/10000 episodes, total num timesteps 26800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 134/10000 episodes, total num timesteps 27000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 135/10000 episodes, total num timesteps 27200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 136/10000 episodes, total num timesteps 27400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 137/10000 episodes, total num timesteps 27600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 138/10000 episodes, total num timesteps 27800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 139/10000 episodes, total num timesteps 28000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 140/10000 episodes, total num timesteps 28200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 141/10000 episodes, total num timesteps 28400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 142/10000 episodes, total num timesteps 28600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 143/10000 episodes, total num timesteps 28800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 144/10000 episodes, total num timesteps 29000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 145/10000 episodes, total num timesteps 29200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 146/10000 episodes, total num timesteps 29400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 147/10000 episodes, total num timesteps 29600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 148/10000 episodes, total num timesteps 29800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 149/10000 episodes, total num timesteps 30000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 150/10000 episodes, total num timesteps 30200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.18526843842119084
team_policy eval average team episode rewards of agent0: 17.5
team_policy eval idv catch total num of agent0: 10
team_policy eval team catch total num: 7
team_policy eval average step individual rewards of agent1: 0.21215535344003128
team_policy eval average team episode rewards of agent1: 17.5
team_policy eval idv catch total num of agent1: 11
team_policy eval team catch total num: 7
team_policy eval average step individual rewards of agent2: 0.0890004826721976
team_policy eval average team episode rewards of agent2: 17.5
team_policy eval idv catch total num of agent2: 6
team_policy eval team catch total num: 7
team_policy eval average step individual rewards of agent3: 0.03756355295835363
team_policy eval average team episode rewards of agent3: 17.5
team_policy eval idv catch total num of agent3: 4
team_policy eval team catch total num: 7
team_policy eval average step individual rewards of agent4: 0.08986066013279843
team_policy eval average team episode rewards of agent4: 17.5
team_policy eval idv catch total num of agent4: 6
team_policy eval team catch total num: 7
idv_policy eval average step individual rewards of agent0: 0.10469183838547684
idv_policy eval average team episode rewards of agent0: 20.0
idv_policy eval idv catch total num of agent0: 7
idv_policy eval team catch total num: 8
idv_policy eval average step individual rewards of agent1: 0.0007978918617770559
idv_policy eval average team episode rewards of agent1: 20.0
idv_policy eval idv catch total num of agent1: 3
idv_policy eval team catch total num: 8
idv_policy eval average step individual rewards of agent2: -0.05502662118959696
idv_policy eval average team episode rewards of agent2: 20.0
idv_policy eval idv catch total num of agent2: 1
idv_policy eval team catch total num: 8
idv_policy eval average step individual rewards of agent3: 0.02898516344448479
idv_policy eval average team episode rewards of agent3: 20.0
idv_policy eval idv catch total num of agent3: 4
idv_policy eval team catch total num: 8
idv_policy eval average step individual rewards of agent4: 0.07900924494368614
idv_policy eval average team episode rewards of agent4: 20.0
idv_policy eval idv catch total num of agent4: 6
idv_policy eval team catch total num: 8

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 151/10000 episodes, total num timesteps 30400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 152/10000 episodes, total num timesteps 30600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 153/10000 episodes, total num timesteps 30800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 154/10000 episodes, total num timesteps 31000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 155/10000 episodes, total num timesteps 31200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 156/10000 episodes, total num timesteps 31400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 157/10000 episodes, total num timesteps 31600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 158/10000 episodes, total num timesteps 31800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 159/10000 episodes, total num timesteps 32000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 160/10000 episodes, total num timesteps 32200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 161/10000 episodes, total num timesteps 32400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 162/10000 episodes, total num timesteps 32600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 163/10000 episodes, total num timesteps 32800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 164/10000 episodes, total num timesteps 33000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 165/10000 episodes, total num timesteps 33200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 166/10000 episodes, total num timesteps 33400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 167/10000 episodes, total num timesteps 33600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 168/10000 episodes, total num timesteps 33800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 169/10000 episodes, total num timesteps 34000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 170/10000 episodes, total num timesteps 34200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 171/10000 episodes, total num timesteps 34400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 172/10000 episodes, total num timesteps 34600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 173/10000 episodes, total num timesteps 34800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 174/10000 episodes, total num timesteps 35000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 175/10000 episodes, total num timesteps 35200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.135261452769677
team_policy eval average team episode rewards of agent0: 27.5
team_policy eval idv catch total num of agent0: 8
team_policy eval team catch total num: 11
team_policy eval average step individual rewards of agent1: 0.10996286094872325
team_policy eval average team episode rewards of agent1: 27.5
team_policy eval idv catch total num of agent1: 7
team_policy eval team catch total num: 11
team_policy eval average step individual rewards of agent2: 0.03625100485407576
team_policy eval average team episode rewards of agent2: 27.5
team_policy eval idv catch total num of agent2: 4
team_policy eval team catch total num: 11
team_policy eval average step individual rewards of agent3: 0.18813611111379328
team_policy eval average team episode rewards of agent3: 27.5
team_policy eval idv catch total num of agent3: 10
team_policy eval team catch total num: 11
team_policy eval average step individual rewards of agent4: 0.21277445210504223
team_policy eval average team episode rewards of agent4: 27.5
team_policy eval idv catch total num of agent4: 11
team_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent0: 0.007543829603823819
idv_policy eval average team episode rewards of agent0: 12.5
idv_policy eval idv catch total num of agent0: 3
idv_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent1: 0.08836867080956651
idv_policy eval average team episode rewards of agent1: 12.5
idv_policy eval idv catch total num of agent1: 6
idv_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent2: 0.08470706646666137
idv_policy eval average team episode rewards of agent2: 12.5
idv_policy eval idv catch total num of agent2: 6
idv_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent3: 0.00470081143520863
idv_policy eval average team episode rewards of agent3: 12.5
idv_policy eval idv catch total num of agent3: 3
idv_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent4: 0.18861309303246362
idv_policy eval average team episode rewards of agent4: 12.5
idv_policy eval idv catch total num of agent4: 10
idv_policy eval team catch total num: 5

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 176/10000 episodes, total num timesteps 35400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 177/10000 episodes, total num timesteps 35600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 178/10000 episodes, total num timesteps 35800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 179/10000 episodes, total num timesteps 36000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 180/10000 episodes, total num timesteps 36200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 181/10000 episodes, total num timesteps 36400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 182/10000 episodes, total num timesteps 36600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 183/10000 episodes, total num timesteps 36800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 184/10000 episodes, total num timesteps 37000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 185/10000 episodes, total num timesteps 37200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 186/10000 episodes, total num timesteps 37400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 187/10000 episodes, total num timesteps 37600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 188/10000 episodes, total num timesteps 37800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 189/10000 episodes, total num timesteps 38000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 190/10000 episodes, total num timesteps 38200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 191/10000 episodes, total num timesteps 38400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 192/10000 episodes, total num timesteps 38600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 193/10000 episodes, total num timesteps 38800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 194/10000 episodes, total num timesteps 39000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 195/10000 episodes, total num timesteps 39200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 196/10000 episodes, total num timesteps 39400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 197/10000 episodes, total num timesteps 39600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 198/10000 episodes, total num timesteps 39800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 199/10000 episodes, total num timesteps 40000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 200/10000 episodes, total num timesteps 40200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.08823562263506946
team_policy eval average team episode rewards of agent0: 22.5
team_policy eval idv catch total num of agent0: 6
team_policy eval team catch total num: 9
team_policy eval average step individual rewards of agent1: 0.24519163772764382
team_policy eval average team episode rewards of agent1: 22.5
team_policy eval idv catch total num of agent1: 12
team_policy eval team catch total num: 9
team_policy eval average step individual rewards of agent2: 0.19197888238691102
team_policy eval average team episode rewards of agent2: 22.5
team_policy eval idv catch total num of agent2: 10
team_policy eval team catch total num: 9
team_policy eval average step individual rewards of agent3: 0.21420505166780301
team_policy eval average team episode rewards of agent3: 22.5
team_policy eval idv catch total num of agent3: 11
team_policy eval team catch total num: 9
team_policy eval average step individual rewards of agent4: 0.14459827436695524
team_policy eval average team episode rewards of agent4: 22.5
team_policy eval idv catch total num of agent4: 9
team_policy eval team catch total num: 9
idv_policy eval average step individual rewards of agent0: 0.21353052028414915
idv_policy eval average team episode rewards of agent0: 15.0
idv_policy eval idv catch total num of agent0: 11
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent1: 0.030513283352932144
idv_policy eval average team episode rewards of agent1: 15.0
idv_policy eval idv catch total num of agent1: 4
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent2: -0.04127551208295278
idv_policy eval average team episode rewards of agent2: 15.0
idv_policy eval idv catch total num of agent2: 1
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent3: 0.04238142746046646
idv_policy eval average team episode rewards of agent3: 15.0
idv_policy eval idv catch total num of agent3: 4
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent4: 0.05546509485676696
idv_policy eval average team episode rewards of agent4: 15.0
idv_policy eval idv catch total num of agent4: 5
idv_policy eval team catch total num: 6

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 201/10000 episodes, total num timesteps 40400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 202/10000 episodes, total num timesteps 40600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 203/10000 episodes, total num timesteps 40800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 204/10000 episodes, total num timesteps 41000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 205/10000 episodes, total num timesteps 41200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 206/10000 episodes, total num timesteps 41400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 207/10000 episodes, total num timesteps 41600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 208/10000 episodes, total num timesteps 41800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 209/10000 episodes, total num timesteps 42000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 210/10000 episodes, total num timesteps 42200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 211/10000 episodes, total num timesteps 42400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 212/10000 episodes, total num timesteps 42600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 213/10000 episodes, total num timesteps 42800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 214/10000 episodes, total num timesteps 43000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 215/10000 episodes, total num timesteps 43200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 216/10000 episodes, total num timesteps 43400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 217/10000 episodes, total num timesteps 43600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 218/10000 episodes, total num timesteps 43800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 219/10000 episodes, total num timesteps 44000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 220/10000 episodes, total num timesteps 44200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 221/10000 episodes, total num timesteps 44400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 222/10000 episodes, total num timesteps 44600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 223/10000 episodes, total num timesteps 44800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 224/10000 episodes, total num timesteps 45000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 225/10000 episodes, total num timesteps 45200/2000000, FPS 177.

team_policy eval average step individual rewards of agent0: -0.006457654752864615
team_policy eval average team episode rewards of agent0: 5.0
team_policy eval idv catch total num of agent0: 2
team_policy eval team catch total num: 2
team_policy eval average step individual rewards of agent1: 0.014393295719126988
team_policy eval average team episode rewards of agent1: 5.0
team_policy eval idv catch total num of agent1: 3
team_policy eval team catch total num: 2
team_policy eval average step individual rewards of agent2: -0.00866093807494141
team_policy eval average team episode rewards of agent2: 5.0
team_policy eval idv catch total num of agent2: 2
team_policy eval team catch total num: 2
team_policy eval average step individual rewards of agent3: -0.031770637043362006
team_policy eval average team episode rewards of agent3: 5.0
team_policy eval idv catch total num of agent3: 1
team_policy eval team catch total num: 2
team_policy eval average step individual rewards of agent4: 0.018444143520799566
team_policy eval average team episode rewards of agent4: 5.0
team_policy eval idv catch total num of agent4: 3
team_policy eval team catch total num: 2
idv_policy eval average step individual rewards of agent0: 0.1923463070437133
idv_policy eval average team episode rewards of agent0: 27.5
idv_policy eval idv catch total num of agent0: 10
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent1: -0.005802557512847728
idv_policy eval average team episode rewards of agent1: 27.5
idv_policy eval idv catch total num of agent1: 2
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent2: 0.2440794279478831
idv_policy eval average team episode rewards of agent2: 27.5
idv_policy eval idv catch total num of agent2: 12
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent3: 0.3207854526828391
idv_policy eval average team episode rewards of agent3: 27.5
idv_policy eval idv catch total num of agent3: 15
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent4: 0.14513583873561003
idv_policy eval average team episode rewards of agent4: 27.5
idv_policy eval idv catch total num of agent4: 8
idv_policy eval team catch total num: 11

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 226/10000 episodes, total num timesteps 45400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 227/10000 episodes, total num timesteps 45600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 228/10000 episodes, total num timesteps 45800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 229/10000 episodes, total num timesteps 46000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 230/10000 episodes, total num timesteps 46200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 231/10000 episodes, total num timesteps 46400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 232/10000 episodes, total num timesteps 46600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 233/10000 episodes, total num timesteps 46800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 234/10000 episodes, total num timesteps 47000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 235/10000 episodes, total num timesteps 47200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 236/10000 episodes, total num timesteps 47400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 237/10000 episodes, total num timesteps 47600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 238/10000 episodes, total num timesteps 47800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 239/10000 episodes, total num timesteps 48000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 240/10000 episodes, total num timesteps 48200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 241/10000 episodes, total num timesteps 48400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 242/10000 episodes, total num timesteps 48600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 243/10000 episodes, total num timesteps 48800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 244/10000 episodes, total num timesteps 49000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 245/10000 episodes, total num timesteps 49200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 246/10000 episodes, total num timesteps 49400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 247/10000 episodes, total num timesteps 49600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 248/10000 episodes, total num timesteps 49800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 249/10000 episodes, total num timesteps 50000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 250/10000 episodes, total num timesteps 50200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.27477978067932207
team_policy eval average team episode rewards of agent0: 17.5
team_policy eval idv catch total num of agent0: 13
team_policy eval team catch total num: 7
team_policy eval average step individual rewards of agent1: 0.04331881413970877
team_policy eval average team episode rewards of agent1: 17.5
team_policy eval idv catch total num of agent1: 4
team_policy eval team catch total num: 7
team_policy eval average step individual rewards of agent2: 0.14389284873577246
team_policy eval average team episode rewards of agent2: 17.5
team_policy eval idv catch total num of agent2: 8
team_policy eval team catch total num: 7
team_policy eval average step individual rewards of agent3: 0.19488140020927197
team_policy eval average team episode rewards of agent3: 17.5
team_policy eval idv catch total num of agent3: 10
team_policy eval team catch total num: 7
team_policy eval average step individual rewards of agent4: 0.06895300827511237
team_policy eval average team episode rewards of agent4: 17.5
team_policy eval idv catch total num of agent4: 5
team_policy eval team catch total num: 7
idv_policy eval average step individual rewards of agent0: 0.0407023741565782
idv_policy eval average team episode rewards of agent0: 22.5
idv_policy eval idv catch total num of agent0: 4
idv_policy eval team catch total num: 9
idv_policy eval average step individual rewards of agent1: 0.252747529463657
idv_policy eval average team episode rewards of agent1: 22.5
idv_policy eval idv catch total num of agent1: 12
idv_policy eval team catch total num: 9
idv_policy eval average step individual rewards of agent2: 0.27172862287624133
idv_policy eval average team episode rewards of agent2: 22.5
idv_policy eval idv catch total num of agent2: 13
idv_policy eval team catch total num: 9
idv_policy eval average step individual rewards of agent3: 0.2239864981709131
idv_policy eval average team episode rewards of agent3: 22.5
idv_policy eval idv catch total num of agent3: 11
idv_policy eval team catch total num: 9
idv_policy eval average step individual rewards of agent4: -0.009250736575486198
idv_policy eval average team episode rewards of agent4: 22.5
idv_policy eval idv catch total num of agent4: 2
idv_policy eval team catch total num: 9

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 251/10000 episodes, total num timesteps 50400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 252/10000 episodes, total num timesteps 50600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 253/10000 episodes, total num timesteps 50800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 254/10000 episodes, total num timesteps 51000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 255/10000 episodes, total num timesteps 51200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 256/10000 episodes, total num timesteps 51400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 257/10000 episodes, total num timesteps 51600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 258/10000 episodes, total num timesteps 51800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 259/10000 episodes, total num timesteps 52000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 260/10000 episodes, total num timesteps 52200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 261/10000 episodes, total num timesteps 52400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 262/10000 episodes, total num timesteps 52600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 263/10000 episodes, total num timesteps 52800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 264/10000 episodes, total num timesteps 53000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 265/10000 episodes, total num timesteps 53200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 266/10000 episodes, total num timesteps 53400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 267/10000 episodes, total num timesteps 53600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 268/10000 episodes, total num timesteps 53800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 269/10000 episodes, total num timesteps 54000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 270/10000 episodes, total num timesteps 54200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 271/10000 episodes, total num timesteps 54400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 272/10000 episodes, total num timesteps 54600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 273/10000 episodes, total num timesteps 54800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 274/10000 episodes, total num timesteps 55000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 275/10000 episodes, total num timesteps 55200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.18559576977655062
team_policy eval average team episode rewards of agent0: 30.0
team_policy eval idv catch total num of agent0: 9
team_policy eval team catch total num: 12
team_policy eval average step individual rewards of agent1: 0.26017842230942617
team_policy eval average team episode rewards of agent1: 30.0
team_policy eval idv catch total num of agent1: 12
team_policy eval team catch total num: 12
team_policy eval average step individual rewards of agent2: 0.029945330503072482
team_policy eval average team episode rewards of agent2: 30.0
team_policy eval idv catch total num of agent2: 3
team_policy eval team catch total num: 12
team_policy eval average step individual rewards of agent3: 0.35960113795349813
team_policy eval average team episode rewards of agent3: 30.0
team_policy eval idv catch total num of agent3: 16
team_policy eval team catch total num: 12
team_policy eval average step individual rewards of agent4: 0.10629246566850646
team_policy eval average team episode rewards of agent4: 30.0
team_policy eval idv catch total num of agent4: 6
team_policy eval team catch total num: 12
idv_policy eval average step individual rewards of agent0: 0.2305199205328911
idv_policy eval average team episode rewards of agent0: 27.5
idv_policy eval idv catch total num of agent0: 11
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent1: 0.3029492649564491
idv_policy eval average team episode rewards of agent1: 27.5
idv_policy eval idv catch total num of agent1: 14
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent2: 0.20565369313998436
idv_policy eval average team episode rewards of agent2: 27.5
idv_policy eval idv catch total num of agent2: 10
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent3: 0.17770207764027138
idv_policy eval average team episode rewards of agent3: 27.5
idv_policy eval idv catch total num of agent3: 9
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent4: 0.15007185076392138
idv_policy eval average team episode rewards of agent4: 27.5
idv_policy eval idv catch total num of agent4: 8
idv_policy eval team catch total num: 11

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 276/10000 episodes, total num timesteps 55400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 277/10000 episodes, total num timesteps 55600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 278/10000 episodes, total num timesteps 55800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 279/10000 episodes, total num timesteps 56000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 280/10000 episodes, total num timesteps 56200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 281/10000 episodes, total num timesteps 56400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 282/10000 episodes, total num timesteps 56600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 283/10000 episodes, total num timesteps 56800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 284/10000 episodes, total num timesteps 57000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 285/10000 episodes, total num timesteps 57200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 286/10000 episodes, total num timesteps 57400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 287/10000 episodes, total num timesteps 57600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 288/10000 episodes, total num timesteps 57800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 289/10000 episodes, total num timesteps 58000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 290/10000 episodes, total num timesteps 58200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 291/10000 episodes, total num timesteps 58400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 292/10000 episodes, total num timesteps 58600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 293/10000 episodes, total num timesteps 58800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 294/10000 episodes, total num timesteps 59000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 295/10000 episodes, total num timesteps 59200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 296/10000 episodes, total num timesteps 59400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 297/10000 episodes, total num timesteps 59600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 298/10000 episodes, total num timesteps 59800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 299/10000 episodes, total num timesteps 60000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 300/10000 episodes, total num timesteps 60200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.010616783853113433
team_policy eval average team episode rewards of agent0: 25.0
team_policy eval idv catch total num of agent0: 3
team_policy eval team catch total num: 10
team_policy eval average step individual rewards of agent1: 0.09219624783038925
team_policy eval average team episode rewards of agent1: 25.0
team_policy eval idv catch total num of agent1: 6
team_policy eval team catch total num: 10
team_policy eval average step individual rewards of agent2: 0.3942909661642277
team_policy eval average team episode rewards of agent2: 25.0
team_policy eval idv catch total num of agent2: 18
team_policy eval team catch total num: 10
team_policy eval average step individual rewards of agent3: 0.16420745976443676
team_policy eval average team episode rewards of agent3: 25.0
team_policy eval idv catch total num of agent3: 9
team_policy eval team catch total num: 10
team_policy eval average step individual rewards of agent4: 0.30043542228563436
team_policy eval average team episode rewards of agent4: 25.0
team_policy eval idv catch total num of agent4: 14
team_policy eval team catch total num: 10
idv_policy eval average step individual rewards of agent0: 0.11810324930131273
idv_policy eval average team episode rewards of agent0: 7.5
idv_policy eval idv catch total num of agent0: 7
idv_policy eval team catch total num: 3
idv_policy eval average step individual rewards of agent1: 0.14235449115249113
idv_policy eval average team episode rewards of agent1: 7.5
idv_policy eval idv catch total num of agent1: 8
idv_policy eval team catch total num: 3
idv_policy eval average step individual rewards of agent2: 0.0073839925770520344
idv_policy eval average team episode rewards of agent2: 7.5
idv_policy eval idv catch total num of agent2: 3
idv_policy eval team catch total num: 3
idv_policy eval average step individual rewards of agent3: 0.13791467659114956
idv_policy eval average team episode rewards of agent3: 7.5
idv_policy eval idv catch total num of agent3: 8
idv_policy eval team catch total num: 3
idv_policy eval average step individual rewards of agent4: 0.062205807266210354
idv_policy eval average team episode rewards of agent4: 7.5
idv_policy eval idv catch total num of agent4: 5
idv_policy eval team catch total num: 3

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 301/10000 episodes, total num timesteps 60400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 302/10000 episodes, total num timesteps 60600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 303/10000 episodes, total num timesteps 60800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 304/10000 episodes, total num timesteps 61000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 305/10000 episodes, total num timesteps 61200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 306/10000 episodes, total num timesteps 61400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 307/10000 episodes, total num timesteps 61600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 308/10000 episodes, total num timesteps 61800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 309/10000 episodes, total num timesteps 62000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 310/10000 episodes, total num timesteps 62200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 311/10000 episodes, total num timesteps 62400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 312/10000 episodes, total num timesteps 62600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 313/10000 episodes, total num timesteps 62800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 314/10000 episodes, total num timesteps 63000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 315/10000 episodes, total num timesteps 63200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 316/10000 episodes, total num timesteps 63400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 317/10000 episodes, total num timesteps 63600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 318/10000 episodes, total num timesteps 63800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 319/10000 episodes, total num timesteps 64000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 320/10000 episodes, total num timesteps 64200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 321/10000 episodes, total num timesteps 64400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 322/10000 episodes, total num timesteps 64600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 323/10000 episodes, total num timesteps 64800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 324/10000 episodes, total num timesteps 65000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 325/10000 episodes, total num timesteps 65200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.29760198842350044
team_policy eval average team episode rewards of agent0: 30.0
team_policy eval idv catch total num of agent0: 14
team_policy eval team catch total num: 12
team_policy eval average step individual rewards of agent1: 0.3284883980800697
team_policy eval average team episode rewards of agent1: 30.0
team_policy eval idv catch total num of agent1: 15
team_policy eval team catch total num: 12
team_policy eval average step individual rewards of agent2: 0.17375117379406607
team_policy eval average team episode rewards of agent2: 30.0
team_policy eval idv catch total num of agent2: 9
team_policy eval team catch total num: 12
team_policy eval average step individual rewards of agent3: 0.2213887493596193
team_policy eval average team episode rewards of agent3: 30.0
team_policy eval idv catch total num of agent3: 11
team_policy eval team catch total num: 12
team_policy eval average step individual rewards of agent4: 0.19759137251637468
team_policy eval average team episode rewards of agent4: 30.0
team_policy eval idv catch total num of agent4: 10
team_policy eval team catch total num: 12
idv_policy eval average step individual rewards of agent0: 0.22606567454409907
idv_policy eval average team episode rewards of agent0: 22.5
idv_policy eval idv catch total num of agent0: 11
idv_policy eval team catch total num: 9
idv_policy eval average step individual rewards of agent1: 0.2563713458408767
idv_policy eval average team episode rewards of agent1: 22.5
idv_policy eval idv catch total num of agent1: 12
idv_policy eval team catch total num: 9
idv_policy eval average step individual rewards of agent2: 0.2539704506883636
idv_policy eval average team episode rewards of agent2: 22.5
idv_policy eval idv catch total num of agent2: 12
idv_policy eval team catch total num: 9
idv_policy eval average step individual rewards of agent3: 0.09349536930839288
idv_policy eval average team episode rewards of agent3: 22.5
idv_policy eval idv catch total num of agent3: 6
idv_policy eval team catch total num: 9
idv_policy eval average step individual rewards of agent4: 0.1026184379510051
idv_policy eval average team episode rewards of agent4: 22.5
idv_policy eval idv catch total num of agent4: 6
idv_policy eval team catch total num: 9

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 326/10000 episodes, total num timesteps 65400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 327/10000 episodes, total num timesteps 65600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 328/10000 episodes, total num timesteps 65800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 329/10000 episodes, total num timesteps 66000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 330/10000 episodes, total num timesteps 66200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 331/10000 episodes, total num timesteps 66400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 332/10000 episodes, total num timesteps 66600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 333/10000 episodes, total num timesteps 66800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 334/10000 episodes, total num timesteps 67000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 335/10000 episodes, total num timesteps 67200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 336/10000 episodes, total num timesteps 67400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 337/10000 episodes, total num timesteps 67600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 338/10000 episodes, total num timesteps 67800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 339/10000 episodes, total num timesteps 68000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 340/10000 episodes, total num timesteps 68200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 341/10000 episodes, total num timesteps 68400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 342/10000 episodes, total num timesteps 68600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 343/10000 episodes, total num timesteps 68800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 344/10000 episodes, total num timesteps 69000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 345/10000 episodes, total num timesteps 69200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 346/10000 episodes, total num timesteps 69400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 347/10000 episodes, total num timesteps 69600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 348/10000 episodes, total num timesteps 69800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 349/10000 episodes, total num timesteps 70000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 350/10000 episodes, total num timesteps 70200/2000000, FPS 177.

team_policy eval average step individual rewards of agent0: -0.010228110776864235
team_policy eval average team episode rewards of agent0: 15.0
team_policy eval idv catch total num of agent0: 2
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent1: 0.14062926534046374
team_policy eval average team episode rewards of agent1: 15.0
team_policy eval idv catch total num of agent1: 8
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent2: 0.060938211944741916
team_policy eval average team episode rewards of agent2: 15.0
team_policy eval idv catch total num of agent2: 5
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent3: 0.08820655627498819
team_policy eval average team episode rewards of agent3: 15.0
team_policy eval idv catch total num of agent3: 6
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent4: 0.09716534055550838
team_policy eval average team episode rewards of agent4: 15.0
team_policy eval idv catch total num of agent4: 6
team_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent0: 0.2070098651576179
idv_policy eval average team episode rewards of agent0: 40.0
idv_policy eval idv catch total num of agent0: 11
idv_policy eval team catch total num: 16
idv_policy eval average step individual rewards of agent1: 0.16703031847831804
idv_policy eval average team episode rewards of agent1: 40.0
idv_policy eval idv catch total num of agent1: 9
idv_policy eval team catch total num: 16
idv_policy eval average step individual rewards of agent2: 0.4701375802173615
idv_policy eval average team episode rewards of agent2: 40.0
idv_policy eval idv catch total num of agent2: 21
idv_policy eval team catch total num: 16
idv_policy eval average step individual rewards of agent3: 0.3018040885546594
idv_policy eval average team episode rewards of agent3: 40.0
idv_policy eval idv catch total num of agent3: 14
idv_policy eval team catch total num: 16
idv_policy eval average step individual rewards of agent4: 0.2501224333791602
idv_policy eval average team episode rewards of agent4: 40.0
idv_policy eval idv catch total num of agent4: 12
idv_policy eval team catch total num: 16

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 351/10000 episodes, total num timesteps 70400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 352/10000 episodes, total num timesteps 70600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 353/10000 episodes, total num timesteps 70800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 354/10000 episodes, total num timesteps 71000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 355/10000 episodes, total num timesteps 71200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 356/10000 episodes, total num timesteps 71400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 357/10000 episodes, total num timesteps 71600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 358/10000 episodes, total num timesteps 71800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 359/10000 episodes, total num timesteps 72000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 360/10000 episodes, total num timesteps 72200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 361/10000 episodes, total num timesteps 72400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 362/10000 episodes, total num timesteps 72600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 363/10000 episodes, total num timesteps 72800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 364/10000 episodes, total num timesteps 73000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 365/10000 episodes, total num timesteps 73200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 366/10000 episodes, total num timesteps 73400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 367/10000 episodes, total num timesteps 73600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 368/10000 episodes, total num timesteps 73800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 369/10000 episodes, total num timesteps 74000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 370/10000 episodes, total num timesteps 74200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 371/10000 episodes, total num timesteps 74400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 372/10000 episodes, total num timesteps 74600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 373/10000 episodes, total num timesteps 74800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 374/10000 episodes, total num timesteps 75000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 375/10000 episodes, total num timesteps 75200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: -0.01456629558925171
team_policy eval average team episode rewards of agent0: 20.0
team_policy eval idv catch total num of agent0: 2
team_policy eval team catch total num: 8
team_policy eval average step individual rewards of agent1: 0.011039423589160267
team_policy eval average team episode rewards of agent1: 20.0
team_policy eval idv catch total num of agent1: 3
team_policy eval team catch total num: 8
team_policy eval average step individual rewards of agent2: 0.13928306163664622
team_policy eval average team episode rewards of agent2: 20.0
team_policy eval idv catch total num of agent2: 8
team_policy eval team catch total num: 8
team_policy eval average step individual rewards of agent3: 0.1345317109417363
team_policy eval average team episode rewards of agent3: 20.0
team_policy eval idv catch total num of agent3: 8
team_policy eval team catch total num: 8
team_policy eval average step individual rewards of agent4: 0.21679828868569298
team_policy eval average team episode rewards of agent4: 20.0
team_policy eval idv catch total num of agent4: 11
team_policy eval team catch total num: 8
idv_policy eval average step individual rewards of agent0: 0.35323182585110824
idv_policy eval average team episode rewards of agent0: 47.5
idv_policy eval idv catch total num of agent0: 16
idv_policy eval team catch total num: 19
idv_policy eval average step individual rewards of agent1: 0.3271164082533592
idv_policy eval average team episode rewards of agent1: 47.5
idv_policy eval idv catch total num of agent1: 15
idv_policy eval team catch total num: 19
idv_policy eval average step individual rewards of agent2: 0.4336452439344314
idv_policy eval average team episode rewards of agent2: 47.5
idv_policy eval idv catch total num of agent2: 19
idv_policy eval team catch total num: 19
idv_policy eval average step individual rewards of agent3: 0.34647917756860236
idv_policy eval average team episode rewards of agent3: 47.5
idv_policy eval idv catch total num of agent3: 16
idv_policy eval team catch total num: 19
idv_policy eval average step individual rewards of agent4: 0.4606693087797315
idv_policy eval average team episode rewards of agent4: 47.5
idv_policy eval idv catch total num of agent4: 20
idv_policy eval team catch total num: 19

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 376/10000 episodes, total num timesteps 75400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 377/10000 episodes, total num timesteps 75600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 378/10000 episodes, total num timesteps 75800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 379/10000 episodes, total num timesteps 76000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 380/10000 episodes, total num timesteps 76200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 381/10000 episodes, total num timesteps 76400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 382/10000 episodes, total num timesteps 76600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 383/10000 episodes, total num timesteps 76800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 384/10000 episodes, total num timesteps 77000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 385/10000 episodes, total num timesteps 77200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 386/10000 episodes, total num timesteps 77400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 387/10000 episodes, total num timesteps 77600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 388/10000 episodes, total num timesteps 77800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 389/10000 episodes, total num timesteps 78000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 390/10000 episodes, total num timesteps 78200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 391/10000 episodes, total num timesteps 78400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 392/10000 episodes, total num timesteps 78600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 393/10000 episodes, total num timesteps 78800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 394/10000 episodes, total num timesteps 79000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 395/10000 episodes, total num timesteps 79200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 396/10000 episodes, total num timesteps 79400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 397/10000 episodes, total num timesteps 79600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 398/10000 episodes, total num timesteps 79800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 399/10000 episodes, total num timesteps 80000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 400/10000 episodes, total num timesteps 80200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.0975860328203727
team_policy eval average team episode rewards of agent0: 22.5
team_policy eval idv catch total num of agent0: 6
team_policy eval team catch total num: 9
team_policy eval average step individual rewards of agent1: 0.20232612358088972
team_policy eval average team episode rewards of agent1: 22.5
team_policy eval idv catch total num of agent1: 10
team_policy eval team catch total num: 9
team_policy eval average step individual rewards of agent2: 0.32462597676821076
team_policy eval average team episode rewards of agent2: 22.5
team_policy eval idv catch total num of agent2: 15
team_policy eval team catch total num: 9
team_policy eval average step individual rewards of agent3: 0.14470842373842305
team_policy eval average team episode rewards of agent3: 22.5
team_policy eval idv catch total num of agent3: 8
team_policy eval team catch total num: 9
team_policy eval average step individual rewards of agent4: 0.04584380184543331
team_policy eval average team episode rewards of agent4: 22.5
team_policy eval idv catch total num of agent4: 4
team_policy eval team catch total num: 9
idv_policy eval average step individual rewards of agent0: 0.2511956373296217
idv_policy eval average team episode rewards of agent0: 27.5
idv_policy eval idv catch total num of agent0: 12
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent1: 0.15646590631233778
idv_policy eval average team episode rewards of agent1: 27.5
idv_policy eval idv catch total num of agent1: 8
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent2: 0.12275448316132305
idv_policy eval average team episode rewards of agent2: 27.5
idv_policy eval idv catch total num of agent2: 7
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent3: 0.33261927107927447
idv_policy eval average team episode rewards of agent3: 27.5
idv_policy eval idv catch total num of agent3: 15
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent4: 0.23131116006260702
idv_policy eval average team episode rewards of agent4: 27.5
idv_policy eval idv catch total num of agent4: 11
idv_policy eval team catch total num: 11

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 401/10000 episodes, total num timesteps 80400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 402/10000 episodes, total num timesteps 80600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 403/10000 episodes, total num timesteps 80800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 404/10000 episodes, total num timesteps 81000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 405/10000 episodes, total num timesteps 81200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 406/10000 episodes, total num timesteps 81400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 407/10000 episodes, total num timesteps 81600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 408/10000 episodes, total num timesteps 81800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 409/10000 episodes, total num timesteps 82000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 410/10000 episodes, total num timesteps 82200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 411/10000 episodes, total num timesteps 82400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 412/10000 episodes, total num timesteps 82600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 413/10000 episodes, total num timesteps 82800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 414/10000 episodes, total num timesteps 83000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 415/10000 episodes, total num timesteps 83200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 416/10000 episodes, total num timesteps 83400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 417/10000 episodes, total num timesteps 83600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 418/10000 episodes, total num timesteps 83800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 419/10000 episodes, total num timesteps 84000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 420/10000 episodes, total num timesteps 84200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 421/10000 episodes, total num timesteps 84400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 422/10000 episodes, total num timesteps 84600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 423/10000 episodes, total num timesteps 84800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 424/10000 episodes, total num timesteps 85000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 425/10000 episodes, total num timesteps 85200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.5291227945873597
team_policy eval average team episode rewards of agent0: 57.5
team_policy eval idv catch total num of agent0: 23
team_policy eval team catch total num: 23
team_policy eval average step individual rewards of agent1: 0.2543443626602742
team_policy eval average team episode rewards of agent1: 57.5
team_policy eval idv catch total num of agent1: 12
team_policy eval team catch total num: 23
team_policy eval average step individual rewards of agent2: 0.4596787858078927
team_policy eval average team episode rewards of agent2: 57.5
team_policy eval idv catch total num of agent2: 20
team_policy eval team catch total num: 23
team_policy eval average step individual rewards of agent3: 0.44985602841282846
team_policy eval average team episode rewards of agent3: 57.5
team_policy eval idv catch total num of agent3: 20
team_policy eval team catch total num: 23
team_policy eval average step individual rewards of agent4: 0.19772888497748467
team_policy eval average team episode rewards of agent4: 57.5
team_policy eval idv catch total num of agent4: 10
team_policy eval team catch total num: 23
idv_policy eval average step individual rewards of agent0: 0.03152436859222782
idv_policy eval average team episode rewards of agent0: 15.0
idv_policy eval idv catch total num of agent0: 4
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent1: 0.08491791090400075
idv_policy eval average team episode rewards of agent1: 15.0
idv_policy eval idv catch total num of agent1: 6
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent2: 0.06092192586714685
idv_policy eval average team episode rewards of agent2: 15.0
idv_policy eval idv catch total num of agent2: 5
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent3: 0.19103044495528207
idv_policy eval average team episode rewards of agent3: 15.0
idv_policy eval idv catch total num of agent3: 10
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent4: 0.16144583054197398
idv_policy eval average team episode rewards of agent4: 15.0
idv_policy eval idv catch total num of agent4: 9
idv_policy eval team catch total num: 6

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 426/10000 episodes, total num timesteps 85400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 427/10000 episodes, total num timesteps 85600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 428/10000 episodes, total num timesteps 85800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 429/10000 episodes, total num timesteps 86000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 430/10000 episodes, total num timesteps 86200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 431/10000 episodes, total num timesteps 86400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 432/10000 episodes, total num timesteps 86600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 433/10000 episodes, total num timesteps 86800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 434/10000 episodes, total num timesteps 87000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 435/10000 episodes, total num timesteps 87200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 436/10000 episodes, total num timesteps 87400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 437/10000 episodes, total num timesteps 87600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 438/10000 episodes, total num timesteps 87800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 439/10000 episodes, total num timesteps 88000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 440/10000 episodes, total num timesteps 88200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 441/10000 episodes, total num timesteps 88400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 442/10000 episodes, total num timesteps 88600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 443/10000 episodes, total num timesteps 88800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 444/10000 episodes, total num timesteps 89000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 445/10000 episodes, total num timesteps 89200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 446/10000 episodes, total num timesteps 89400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 447/10000 episodes, total num timesteps 89600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 448/10000 episodes, total num timesteps 89800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 449/10000 episodes, total num timesteps 90000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 450/10000 episodes, total num timesteps 90200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.29548461168480195
team_policy eval average team episode rewards of agent0: 27.5
team_policy eval idv catch total num of agent0: 14
team_policy eval team catch total num: 11
team_policy eval average step individual rewards of agent1: 0.22381262725319764
team_policy eval average team episode rewards of agent1: 27.5
team_policy eval idv catch total num of agent1: 11
team_policy eval team catch total num: 11
team_policy eval average step individual rewards of agent2: 0.1745240222534144
team_policy eval average team episode rewards of agent2: 27.5
team_policy eval idv catch total num of agent2: 9
team_policy eval team catch total num: 11
team_policy eval average step individual rewards of agent3: 0.1513823672389883
team_policy eval average team episode rewards of agent3: 27.5
team_policy eval idv catch total num of agent3: 8
team_policy eval team catch total num: 11
team_policy eval average step individual rewards of agent4: 0.3040042573077153
team_policy eval average team episode rewards of agent4: 27.5
team_policy eval idv catch total num of agent4: 14
team_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent0: 0.022279407988250392
idv_policy eval average team episode rewards of agent0: 15.0
idv_policy eval idv catch total num of agent0: 4
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent1: 0.16513542661670721
idv_policy eval average team episode rewards of agent1: 15.0
idv_policy eval idv catch total num of agent1: 9
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent2: 0.061672910357279426
idv_policy eval average team episode rewards of agent2: 15.0
idv_policy eval idv catch total num of agent2: 5
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent3: 0.14325031811674627
idv_policy eval average team episode rewards of agent3: 15.0
idv_policy eval idv catch total num of agent3: 8
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent4: 0.12194447292451537
idv_policy eval average team episode rewards of agent4: 15.0
idv_policy eval idv catch total num of agent4: 7
idv_policy eval team catch total num: 6

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 451/10000 episodes, total num timesteps 90400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 452/10000 episodes, total num timesteps 90600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 453/10000 episodes, total num timesteps 90800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 454/10000 episodes, total num timesteps 91000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 455/10000 episodes, total num timesteps 91200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 456/10000 episodes, total num timesteps 91400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 457/10000 episodes, total num timesteps 91600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 458/10000 episodes, total num timesteps 91800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 459/10000 episodes, total num timesteps 92000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 460/10000 episodes, total num timesteps 92200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 461/10000 episodes, total num timesteps 92400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 462/10000 episodes, total num timesteps 92600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 463/10000 episodes, total num timesteps 92800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 464/10000 episodes, total num timesteps 93000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 465/10000 episodes, total num timesteps 93200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 466/10000 episodes, total num timesteps 93400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 467/10000 episodes, total num timesteps 93600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 468/10000 episodes, total num timesteps 93800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 469/10000 episodes, total num timesteps 94000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 470/10000 episodes, total num timesteps 94200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 471/10000 episodes, total num timesteps 94400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 472/10000 episodes, total num timesteps 94600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 473/10000 episodes, total num timesteps 94800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 474/10000 episodes, total num timesteps 95000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 475/10000 episodes, total num timesteps 95200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.20407386001878472
team_policy eval average team episode rewards of agent0: 45.0
team_policy eval idv catch total num of agent0: 10
team_policy eval team catch total num: 18
team_policy eval average step individual rewards of agent1: 0.2781256523165958
team_policy eval average team episode rewards of agent1: 45.0
team_policy eval idv catch total num of agent1: 13
team_policy eval team catch total num: 18
team_policy eval average step individual rewards of agent2: 0.327915638103414
team_policy eval average team episode rewards of agent2: 45.0
team_policy eval idv catch total num of agent2: 15
team_policy eval team catch total num: 18
team_policy eval average step individual rewards of agent3: 0.30034692925994977
team_policy eval average team episode rewards of agent3: 45.0
team_policy eval idv catch total num of agent3: 14
team_policy eval team catch total num: 18
team_policy eval average step individual rewards of agent4: 0.37564384057731204
team_policy eval average team episode rewards of agent4: 45.0
team_policy eval idv catch total num of agent4: 17
team_policy eval team catch total num: 18
idv_policy eval average step individual rewards of agent0: 0.13229962806129628
idv_policy eval average team episode rewards of agent0: 32.5
idv_policy eval idv catch total num of agent0: 7
idv_policy eval team catch total num: 13
idv_policy eval average step individual rewards of agent1: 0.22952451084601422
idv_policy eval average team episode rewards of agent1: 32.5
idv_policy eval idv catch total num of agent1: 11
idv_policy eval team catch total num: 13
idv_policy eval average step individual rewards of agent2: 0.35715235495351666
idv_policy eval average team episode rewards of agent2: 32.5
idv_policy eval idv catch total num of agent2: 16
idv_policy eval team catch total num: 13
idv_policy eval average step individual rewards of agent3: 0.25170440494726326
idv_policy eval average team episode rewards of agent3: 32.5
idv_policy eval idv catch total num of agent3: 12
idv_policy eval team catch total num: 13
idv_policy eval average step individual rewards of agent4: 0.3059306197147921
idv_policy eval average team episode rewards of agent4: 32.5
idv_policy eval idv catch total num of agent4: 14
idv_policy eval team catch total num: 13

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 476/10000 episodes, total num timesteps 95400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 477/10000 episodes, total num timesteps 95600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 478/10000 episodes, total num timesteps 95800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 479/10000 episodes, total num timesteps 96000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 480/10000 episodes, total num timesteps 96200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 481/10000 episodes, total num timesteps 96400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 482/10000 episodes, total num timesteps 96600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 483/10000 episodes, total num timesteps 96800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 484/10000 episodes, total num timesteps 97000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 485/10000 episodes, total num timesteps 97200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 486/10000 episodes, total num timesteps 97400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 487/10000 episodes, total num timesteps 97600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 488/10000 episodes, total num timesteps 97800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 489/10000 episodes, total num timesteps 98000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 490/10000 episodes, total num timesteps 98200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 491/10000 episodes, total num timesteps 98400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 492/10000 episodes, total num timesteps 98600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 493/10000 episodes, total num timesteps 98800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 494/10000 episodes, total num timesteps 99000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 495/10000 episodes, total num timesteps 99200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 496/10000 episodes, total num timesteps 99400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 497/10000 episodes, total num timesteps 99600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 498/10000 episodes, total num timesteps 99800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 499/10000 episodes, total num timesteps 100000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 500/10000 episodes, total num timesteps 100200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.279007923283566
team_policy eval average team episode rewards of agent0: 52.5
team_policy eval idv catch total num of agent0: 13
team_policy eval team catch total num: 21
team_policy eval average step individual rewards of agent1: 0.46488025519154036
team_policy eval average team episode rewards of agent1: 52.5
team_policy eval idv catch total num of agent1: 20
team_policy eval team catch total num: 21
team_policy eval average step individual rewards of agent2: 0.36066697692639965
team_policy eval average team episode rewards of agent2: 52.5
team_policy eval idv catch total num of agent2: 16
team_policy eval team catch total num: 21
team_policy eval average step individual rewards of agent3: 0.2742208832033278
team_policy eval average team episode rewards of agent3: 52.5
team_policy eval idv catch total num of agent3: 13
team_policy eval team catch total num: 21
team_policy eval average step individual rewards of agent4: 0.556117693016791
team_policy eval average team episode rewards of agent4: 52.5
team_policy eval idv catch total num of agent4: 24
team_policy eval team catch total num: 21
idv_policy eval average step individual rewards of agent0: 0.19850807585745278
idv_policy eval average team episode rewards of agent0: 27.5
idv_policy eval idv catch total num of agent0: 10
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent1: 0.17734534350529418
idv_policy eval average team episode rewards of agent1: 27.5
idv_policy eval idv catch total num of agent1: 9
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent2: 0.16517016678066515
idv_policy eval average team episode rewards of agent2: 27.5
idv_policy eval idv catch total num of agent2: 9
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent3: 0.2456322110854931
idv_policy eval average team episode rewards of agent3: 27.5
idv_policy eval idv catch total num of agent3: 12
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent4: 0.3221644566356147
idv_policy eval average team episode rewards of agent4: 27.5
idv_policy eval idv catch total num of agent4: 15
idv_policy eval team catch total num: 11

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 501/10000 episodes, total num timesteps 100400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 502/10000 episodes, total num timesteps 100600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 503/10000 episodes, total num timesteps 100800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 504/10000 episodes, total num timesteps 101000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 505/10000 episodes, total num timesteps 101200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 506/10000 episodes, total num timesteps 101400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 507/10000 episodes, total num timesteps 101600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 508/10000 episodes, total num timesteps 101800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 509/10000 episodes, total num timesteps 102000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 510/10000 episodes, total num timesteps 102200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 511/10000 episodes, total num timesteps 102400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 512/10000 episodes, total num timesteps 102600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 513/10000 episodes, total num timesteps 102800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 514/10000 episodes, total num timesteps 103000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 515/10000 episodes, total num timesteps 103200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 516/10000 episodes, total num timesteps 103400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 517/10000 episodes, total num timesteps 103600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 518/10000 episodes, total num timesteps 103800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 519/10000 episodes, total num timesteps 104000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 520/10000 episodes, total num timesteps 104200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 521/10000 episodes, total num timesteps 104400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 522/10000 episodes, total num timesteps 104600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 523/10000 episodes, total num timesteps 104800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 524/10000 episodes, total num timesteps 105000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 525/10000 episodes, total num timesteps 105200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.4336328875074314
team_policy eval average team episode rewards of agent0: 50.0
team_policy eval idv catch total num of agent0: 19
team_policy eval team catch total num: 20
team_policy eval average step individual rewards of agent1: 0.3107386633235983
team_policy eval average team episode rewards of agent1: 50.0
team_policy eval idv catch total num of agent1: 14
team_policy eval team catch total num: 20
team_policy eval average step individual rewards of agent2: 0.3258638668706173
team_policy eval average team episode rewards of agent2: 50.0
team_policy eval idv catch total num of agent2: 15
team_policy eval team catch total num: 20
team_policy eval average step individual rewards of agent3: 0.3319922743346189
team_policy eval average team episode rewards of agent3: 50.0
team_policy eval idv catch total num of agent3: 15
team_policy eval team catch total num: 20
team_policy eval average step individual rewards of agent4: 0.33016856497844893
team_policy eval average team episode rewards of agent4: 50.0
team_policy eval idv catch total num of agent4: 15
team_policy eval team catch total num: 20
idv_policy eval average step individual rewards of agent0: 0.23928736273426948
idv_policy eval average team episode rewards of agent0: 42.5
idv_policy eval idv catch total num of agent0: 12
idv_policy eval team catch total num: 17
idv_policy eval average step individual rewards of agent1: 0.3852504560874206
idv_policy eval average team episode rewards of agent1: 42.5
idv_policy eval idv catch total num of agent1: 18
idv_policy eval team catch total num: 17
idv_policy eval average step individual rewards of agent2: 0.19281851528694113
idv_policy eval average team episode rewards of agent2: 42.5
idv_policy eval idv catch total num of agent2: 10
idv_policy eval team catch total num: 17
idv_policy eval average step individual rewards of agent3: 0.33382634387312426
idv_policy eval average team episode rewards of agent3: 42.5
idv_policy eval idv catch total num of agent3: 16
idv_policy eval team catch total num: 17
idv_policy eval average step individual rewards of agent4: 0.15423596178611457
idv_policy eval average team episode rewards of agent4: 42.5
idv_policy eval idv catch total num of agent4: 9
idv_policy eval team catch total num: 17

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 526/10000 episodes, total num timesteps 105400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 527/10000 episodes, total num timesteps 105600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 528/10000 episodes, total num timesteps 105800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 529/10000 episodes, total num timesteps 106000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 530/10000 episodes, total num timesteps 106200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 531/10000 episodes, total num timesteps 106400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 532/10000 episodes, total num timesteps 106600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 533/10000 episodes, total num timesteps 106800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 534/10000 episodes, total num timesteps 107000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 535/10000 episodes, total num timesteps 107200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 536/10000 episodes, total num timesteps 107400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 537/10000 episodes, total num timesteps 107600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 538/10000 episodes, total num timesteps 107800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 539/10000 episodes, total num timesteps 108000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 540/10000 episodes, total num timesteps 108200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 541/10000 episodes, total num timesteps 108400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 542/10000 episodes, total num timesteps 108600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 543/10000 episodes, total num timesteps 108800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 544/10000 episodes, total num timesteps 109000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 545/10000 episodes, total num timesteps 109200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 546/10000 episodes, total num timesteps 109400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 547/10000 episodes, total num timesteps 109600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 548/10000 episodes, total num timesteps 109800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 549/10000 episodes, total num timesteps 110000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 550/10000 episodes, total num timesteps 110200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.12448816070168696
team_policy eval average team episode rewards of agent0: 30.0
team_policy eval idv catch total num of agent0: 7
team_policy eval team catch total num: 12
team_policy eval average step individual rewards of agent1: 0.14498061506736878
team_policy eval average team episode rewards of agent1: 30.0
team_policy eval idv catch total num of agent1: 8
team_policy eval team catch total num: 12
team_policy eval average step individual rewards of agent2: 0.16795473117686158
team_policy eval average team episode rewards of agent2: 30.0
team_policy eval idv catch total num of agent2: 9
team_policy eval team catch total num: 12
team_policy eval average step individual rewards of agent3: 0.3294877348923834
team_policy eval average team episode rewards of agent3: 30.0
team_policy eval idv catch total num of agent3: 15
team_policy eval team catch total num: 12
team_policy eval average step individual rewards of agent4: 0.07255685858210385
team_policy eval average team episode rewards of agent4: 30.0
team_policy eval idv catch total num of agent4: 5
team_policy eval team catch total num: 12
idv_policy eval average step individual rewards of agent0: 0.3605354494745414
idv_policy eval average team episode rewards of agent0: 52.5
idv_policy eval idv catch total num of agent0: 16
idv_policy eval team catch total num: 21
idv_policy eval average step individual rewards of agent1: 0.3571738765540406
idv_policy eval average team episode rewards of agent1: 52.5
idv_policy eval idv catch total num of agent1: 16
idv_policy eval team catch total num: 21
idv_policy eval average step individual rewards of agent2: 0.43147061249277235
idv_policy eval average team episode rewards of agent2: 52.5
idv_policy eval idv catch total num of agent2: 19
idv_policy eval team catch total num: 21
idv_policy eval average step individual rewards of agent3: 0.3810771668388265
idv_policy eval average team episode rewards of agent3: 52.5
idv_policy eval idv catch total num of agent3: 17
idv_policy eval team catch total num: 21
idv_policy eval average step individual rewards of agent4: 0.510956467132963
idv_policy eval average team episode rewards of agent4: 52.5
idv_policy eval idv catch total num of agent4: 22
idv_policy eval team catch total num: 21

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 551/10000 episodes, total num timesteps 110400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 552/10000 episodes, total num timesteps 110600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 553/10000 episodes, total num timesteps 110800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 554/10000 episodes, total num timesteps 111000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 555/10000 episodes, total num timesteps 111200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 556/10000 episodes, total num timesteps 111400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 557/10000 episodes, total num timesteps 111600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 558/10000 episodes, total num timesteps 111800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 559/10000 episodes, total num timesteps 112000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 560/10000 episodes, total num timesteps 112200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 561/10000 episodes, total num timesteps 112400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 562/10000 episodes, total num timesteps 112600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 563/10000 episodes, total num timesteps 112800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 564/10000 episodes, total num timesteps 113000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 565/10000 episodes, total num timesteps 113200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 566/10000 episodes, total num timesteps 113400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 567/10000 episodes, total num timesteps 113600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 568/10000 episodes, total num timesteps 113800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 569/10000 episodes, total num timesteps 114000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 570/10000 episodes, total num timesteps 114200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 571/10000 episodes, total num timesteps 114400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 572/10000 episodes, total num timesteps 114600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 573/10000 episodes, total num timesteps 114800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 574/10000 episodes, total num timesteps 115000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 575/10000 episodes, total num timesteps 115200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.12233138419067918
team_policy eval average team episode rewards of agent0: 37.5
team_policy eval idv catch total num of agent0: 7
team_policy eval team catch total num: 15
team_policy eval average step individual rewards of agent1: 0.3013495513742817
team_policy eval average team episode rewards of agent1: 37.5
team_policy eval idv catch total num of agent1: 14
team_policy eval team catch total num: 15
team_policy eval average step individual rewards of agent2: 0.24002245959359653
team_policy eval average team episode rewards of agent2: 37.5
team_policy eval idv catch total num of agent2: 12
team_policy eval team catch total num: 15
team_policy eval average step individual rewards of agent3: 0.21228866661089996
team_policy eval average team episode rewards of agent3: 37.5
team_policy eval idv catch total num of agent3: 11
team_policy eval team catch total num: 15
team_policy eval average step individual rewards of agent4: 0.4730297997798354
team_policy eval average team episode rewards of agent4: 37.5
team_policy eval idv catch total num of agent4: 21
team_policy eval team catch total num: 15
idv_policy eval average step individual rewards of agent0: 0.2483930864575637
idv_policy eval average team episode rewards of agent0: 32.5
idv_policy eval idv catch total num of agent0: 12
idv_policy eval team catch total num: 13
idv_policy eval average step individual rewards of agent1: 0.06900473626754407
idv_policy eval average team episode rewards of agent1: 32.5
idv_policy eval idv catch total num of agent1: 5
idv_policy eval team catch total num: 13
idv_policy eval average step individual rewards of agent2: 0.30782535884147116
idv_policy eval average team episode rewards of agent2: 32.5
idv_policy eval idv catch total num of agent2: 14
idv_policy eval team catch total num: 13
idv_policy eval average step individual rewards of agent3: 0.3339914334637061
idv_policy eval average team episode rewards of agent3: 32.5
idv_policy eval idv catch total num of agent3: 15
idv_policy eval team catch total num: 13
idv_policy eval average step individual rewards of agent4: 0.1692354651533444
idv_policy eval average team episode rewards of agent4: 32.5
idv_policy eval idv catch total num of agent4: 9
idv_policy eval team catch total num: 13

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 576/10000 episodes, total num timesteps 115400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 577/10000 episodes, total num timesteps 115600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 578/10000 episodes, total num timesteps 115800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 579/10000 episodes, total num timesteps 116000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 580/10000 episodes, total num timesteps 116200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 581/10000 episodes, total num timesteps 116400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 582/10000 episodes, total num timesteps 116600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 583/10000 episodes, total num timesteps 116800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 584/10000 episodes, total num timesteps 117000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 585/10000 episodes, total num timesteps 117200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 586/10000 episodes, total num timesteps 117400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 587/10000 episodes, total num timesteps 117600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 588/10000 episodes, total num timesteps 117800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 589/10000 episodes, total num timesteps 118000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 590/10000 episodes, total num timesteps 118200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 591/10000 episodes, total num timesteps 118400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 592/10000 episodes, total num timesteps 118600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 593/10000 episodes, total num timesteps 118800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 594/10000 episodes, total num timesteps 119000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 595/10000 episodes, total num timesteps 119200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 596/10000 episodes, total num timesteps 119400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 597/10000 episodes, total num timesteps 119600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 598/10000 episodes, total num timesteps 119800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 599/10000 episodes, total num timesteps 120000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 600/10000 episodes, total num timesteps 120200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.6509827458102022
team_policy eval average team episode rewards of agent0: 85.0
team_policy eval idv catch total num of agent0: 28
team_policy eval team catch total num: 34
team_policy eval average step individual rewards of agent1: 0.6369008341354535
team_policy eval average team episode rewards of agent1: 85.0
team_policy eval idv catch total num of agent1: 27
team_policy eval team catch total num: 34
team_policy eval average step individual rewards of agent2: 0.517311369204529
team_policy eval average team episode rewards of agent2: 85.0
team_policy eval idv catch total num of agent2: 22
team_policy eval team catch total num: 34
team_policy eval average step individual rewards of agent3: 0.6364698704885422
team_policy eval average team episode rewards of agent3: 85.0
team_policy eval idv catch total num of agent3: 27
team_policy eval team catch total num: 34
team_policy eval average step individual rewards of agent4: 0.4849695049033656
team_policy eval average team episode rewards of agent4: 85.0
team_policy eval idv catch total num of agent4: 21
team_policy eval team catch total num: 34
idv_policy eval average step individual rewards of agent0: 0.6808167175600562
idv_policy eval average team episode rewards of agent0: 75.0
idv_policy eval idv catch total num of agent0: 29
idv_policy eval team catch total num: 30
idv_policy eval average step individual rewards of agent1: 0.5599344269336765
idv_policy eval average team episode rewards of agent1: 75.0
idv_policy eval idv catch total num of agent1: 24
idv_policy eval team catch total num: 30
idv_policy eval average step individual rewards of agent2: 0.3511829682918603
idv_policy eval average team episode rewards of agent2: 75.0
idv_policy eval idv catch total num of agent2: 16
idv_policy eval team catch total num: 30
idv_policy eval average step individual rewards of agent3: 0.6572979184188347
idv_policy eval average team episode rewards of agent3: 75.0
idv_policy eval idv catch total num of agent3: 28
idv_policy eval team catch total num: 30
idv_policy eval average step individual rewards of agent4: 0.48734498669906884
idv_policy eval average team episode rewards of agent4: 75.0
idv_policy eval idv catch total num of agent4: 21
idv_policy eval team catch total num: 30

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 601/10000 episodes, total num timesteps 120400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 602/10000 episodes, total num timesteps 120600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 603/10000 episodes, total num timesteps 120800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 604/10000 episodes, total num timesteps 121000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 605/10000 episodes, total num timesteps 121200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 606/10000 episodes, total num timesteps 121400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 607/10000 episodes, total num timesteps 121600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 608/10000 episodes, total num timesteps 121800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 609/10000 episodes, total num timesteps 122000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 610/10000 episodes, total num timesteps 122200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 611/10000 episodes, total num timesteps 122400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 612/10000 episodes, total num timesteps 122600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 613/10000 episodes, total num timesteps 122800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 614/10000 episodes, total num timesteps 123000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 615/10000 episodes, total num timesteps 123200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 616/10000 episodes, total num timesteps 123400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 617/10000 episodes, total num timesteps 123600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 618/10000 episodes, total num timesteps 123800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 619/10000 episodes, total num timesteps 124000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 620/10000 episodes, total num timesteps 124200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 621/10000 episodes, total num timesteps 124400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 622/10000 episodes, total num timesteps 124600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 623/10000 episodes, total num timesteps 124800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 624/10000 episodes, total num timesteps 125000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 625/10000 episodes, total num timesteps 125200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.6893230535713435
team_policy eval average team episode rewards of agent0: 80.0
team_policy eval idv catch total num of agent0: 29
team_policy eval team catch total num: 32
team_policy eval average step individual rewards of agent1: 0.6298343394682662
team_policy eval average team episode rewards of agent1: 80.0
team_policy eval idv catch total num of agent1: 27
team_policy eval team catch total num: 32
team_policy eval average step individual rewards of agent2: 0.4589608853081159
team_policy eval average team episode rewards of agent2: 80.0
team_policy eval idv catch total num of agent2: 20
team_policy eval team catch total num: 32
team_policy eval average step individual rewards of agent3: 0.6140950953736064
team_policy eval average team episode rewards of agent3: 80.0
team_policy eval idv catch total num of agent3: 26
team_policy eval team catch total num: 32
team_policy eval average step individual rewards of agent4: 0.7623469586458532
team_policy eval average team episode rewards of agent4: 80.0
team_policy eval idv catch total num of agent4: 32
team_policy eval team catch total num: 32
idv_policy eval average step individual rewards of agent0: 0.23235184057477565
idv_policy eval average team episode rewards of agent0: 42.5
idv_policy eval idv catch total num of agent0: 11
idv_policy eval team catch total num: 17
idv_policy eval average step individual rewards of agent1: 0.20231922786512144
idv_policy eval average team episode rewards of agent1: 42.5
idv_policy eval idv catch total num of agent1: 10
idv_policy eval team catch total num: 17
idv_policy eval average step individual rewards of agent2: 0.4564555623148324
idv_policy eval average team episode rewards of agent2: 42.5
idv_policy eval idv catch total num of agent2: 20
idv_policy eval team catch total num: 17
idv_policy eval average step individual rewards of agent3: 0.12887773971948108
idv_policy eval average team episode rewards of agent3: 42.5
idv_policy eval idv catch total num of agent3: 7
idv_policy eval team catch total num: 17
idv_policy eval average step individual rewards of agent4: 0.25092149258642527
idv_policy eval average team episode rewards of agent4: 42.5
idv_policy eval idv catch total num of agent4: 12
idv_policy eval team catch total num: 17

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 626/10000 episodes, total num timesteps 125400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 627/10000 episodes, total num timesteps 125600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 628/10000 episodes, total num timesteps 125800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 629/10000 episodes, total num timesteps 126000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 630/10000 episodes, total num timesteps 126200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 631/10000 episodes, total num timesteps 126400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 632/10000 episodes, total num timesteps 126600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 633/10000 episodes, total num timesteps 126800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 634/10000 episodes, total num timesteps 127000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 635/10000 episodes, total num timesteps 127200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 636/10000 episodes, total num timesteps 127400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 637/10000 episodes, total num timesteps 127600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 638/10000 episodes, total num timesteps 127800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 639/10000 episodes, total num timesteps 128000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 640/10000 episodes, total num timesteps 128200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 641/10000 episodes, total num timesteps 128400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 642/10000 episodes, total num timesteps 128600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 643/10000 episodes, total num timesteps 128800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 644/10000 episodes, total num timesteps 129000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 645/10000 episodes, total num timesteps 129200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 646/10000 episodes, total num timesteps 129400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 647/10000 episodes, total num timesteps 129600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 648/10000 episodes, total num timesteps 129800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 649/10000 episodes, total num timesteps 130000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 650/10000 episodes, total num timesteps 130200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.3245130098114353
team_policy eval average team episode rewards of agent0: 42.5
team_policy eval idv catch total num of agent0: 15
team_policy eval team catch total num: 17
team_policy eval average step individual rewards of agent1: 0.27307922681781344
team_policy eval average team episode rewards of agent1: 42.5
team_policy eval idv catch total num of agent1: 13
team_policy eval team catch total num: 17
team_policy eval average step individual rewards of agent2: 0.302570531028083
team_policy eval average team episode rewards of agent2: 42.5
team_policy eval idv catch total num of agent2: 14
team_policy eval team catch total num: 17
team_policy eval average step individual rewards of agent3: 0.17421843216213573
team_policy eval average team episode rewards of agent3: 42.5
team_policy eval idv catch total num of agent3: 9
team_policy eval team catch total num: 17
team_policy eval average step individual rewards of agent4: 0.24173142282840568
team_policy eval average team episode rewards of agent4: 42.5
team_policy eval idv catch total num of agent4: 11
team_policy eval team catch total num: 17
idv_policy eval average step individual rewards of agent0: 0.3658917904441411
idv_policy eval average team episode rewards of agent0: 55.0
idv_policy eval idv catch total num of agent0: 16
idv_policy eval team catch total num: 22
idv_policy eval average step individual rewards of agent1: 0.5952092048616018
idv_policy eval average team episode rewards of agent1: 55.0
idv_policy eval idv catch total num of agent1: 25
idv_policy eval team catch total num: 22
idv_policy eval average step individual rewards of agent2: 0.3765305194654068
idv_policy eval average team episode rewards of agent2: 55.0
idv_policy eval idv catch total num of agent2: 17
idv_policy eval team catch total num: 22
idv_policy eval average step individual rewards of agent3: 0.33469308916942814
idv_policy eval average team episode rewards of agent3: 55.0
idv_policy eval idv catch total num of agent3: 15
idv_policy eval team catch total num: 22
idv_policy eval average step individual rewards of agent4: 0.45855338582153055
idv_policy eval average team episode rewards of agent4: 55.0
idv_policy eval idv catch total num of agent4: 20
idv_policy eval team catch total num: 22

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 651/10000 episodes, total num timesteps 130400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 652/10000 episodes, total num timesteps 130600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 653/10000 episodes, total num timesteps 130800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 654/10000 episodes, total num timesteps 131000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 655/10000 episodes, total num timesteps 131200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 656/10000 episodes, total num timesteps 131400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 657/10000 episodes, total num timesteps 131600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 658/10000 episodes, total num timesteps 131800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 659/10000 episodes, total num timesteps 132000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 660/10000 episodes, total num timesteps 132200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 661/10000 episodes, total num timesteps 132400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 662/10000 episodes, total num timesteps 132600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 663/10000 episodes, total num timesteps 132800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 664/10000 episodes, total num timesteps 133000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 665/10000 episodes, total num timesteps 133200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 666/10000 episodes, total num timesteps 133400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 667/10000 episodes, total num timesteps 133600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 668/10000 episodes, total num timesteps 133800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 669/10000 episodes, total num timesteps 134000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 670/10000 episodes, total num timesteps 134200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 671/10000 episodes, total num timesteps 134400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 672/10000 episodes, total num timesteps 134600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 673/10000 episodes, total num timesteps 134800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 674/10000 episodes, total num timesteps 135000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 675/10000 episodes, total num timesteps 135200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.3560818815089075
team_policy eval average team episode rewards of agent0: 42.5
team_policy eval idv catch total num of agent0: 16
team_policy eval team catch total num: 17
team_policy eval average step individual rewards of agent1: 0.38453013473921116
team_policy eval average team episode rewards of agent1: 42.5
team_policy eval idv catch total num of agent1: 17
team_policy eval team catch total num: 17
team_policy eval average step individual rewards of agent2: 0.35436262111560224
team_policy eval average team episode rewards of agent2: 42.5
team_policy eval idv catch total num of agent2: 16
team_policy eval team catch total num: 17
team_policy eval average step individual rewards of agent3: 0.3022760901523009
team_policy eval average team episode rewards of agent3: 42.5
team_policy eval idv catch total num of agent3: 14
team_policy eval team catch total num: 17
team_policy eval average step individual rewards of agent4: 0.13367723441798915
team_policy eval average team episode rewards of agent4: 42.5
team_policy eval idv catch total num of agent4: 7
team_policy eval team catch total num: 17
idv_policy eval average step individual rewards of agent0: 0.39852790270472666
idv_policy eval average team episode rewards of agent0: 65.0
idv_policy eval idv catch total num of agent0: 18
idv_policy eval team catch total num: 26
idv_policy eval average step individual rewards of agent1: 0.4589253305700393
idv_policy eval average team episode rewards of agent1: 65.0
idv_policy eval idv catch total num of agent1: 20
idv_policy eval team catch total num: 26
idv_policy eval average step individual rewards of agent2: 0.5056148991209493
idv_policy eval average team episode rewards of agent2: 65.0
idv_policy eval idv catch total num of agent2: 22
idv_policy eval team catch total num: 26
idv_policy eval average step individual rewards of agent3: 0.6092437276754435
idv_policy eval average team episode rewards of agent3: 65.0
idv_policy eval idv catch total num of agent3: 26
idv_policy eval team catch total num: 26
idv_policy eval average step individual rewards of agent4: 0.22622748197472814
idv_policy eval average team episode rewards of agent4: 65.0
idv_policy eval idv catch total num of agent4: 11
idv_policy eval team catch total num: 26

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 676/10000 episodes, total num timesteps 135400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 677/10000 episodes, total num timesteps 135600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 678/10000 episodes, total num timesteps 135800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 679/10000 episodes, total num timesteps 136000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 680/10000 episodes, total num timesteps 136200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 681/10000 episodes, total num timesteps 136400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 682/10000 episodes, total num timesteps 136600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 683/10000 episodes, total num timesteps 136800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 684/10000 episodes, total num timesteps 137000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 685/10000 episodes, total num timesteps 137200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 686/10000 episodes, total num timesteps 137400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 687/10000 episodes, total num timesteps 137600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 688/10000 episodes, total num timesteps 137800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 689/10000 episodes, total num timesteps 138000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 690/10000 episodes, total num timesteps 138200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 691/10000 episodes, total num timesteps 138400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 692/10000 episodes, total num timesteps 138600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 693/10000 episodes, total num timesteps 138800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 694/10000 episodes, total num timesteps 139000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 695/10000 episodes, total num timesteps 139200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 696/10000 episodes, total num timesteps 139400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 697/10000 episodes, total num timesteps 139600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 698/10000 episodes, total num timesteps 139800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 699/10000 episodes, total num timesteps 140000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 700/10000 episodes, total num timesteps 140200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.8200878411924855
team_policy eval average team episode rewards of agent0: 70.0
team_policy eval idv catch total num of agent0: 34
team_policy eval team catch total num: 28
team_policy eval average step individual rewards of agent1: 0.5786618394348884
team_policy eval average team episode rewards of agent1: 70.0
team_policy eval idv catch total num of agent1: 25
team_policy eval team catch total num: 28
team_policy eval average step individual rewards of agent2: 0.2557231560442854
team_policy eval average team episode rewards of agent2: 70.0
team_policy eval idv catch total num of agent2: 12
team_policy eval team catch total num: 28
team_policy eval average step individual rewards of agent3: 0.40919157201569256
team_policy eval average team episode rewards of agent3: 70.0
team_policy eval idv catch total num of agent3: 18
team_policy eval team catch total num: 28
team_policy eval average step individual rewards of agent4: 0.26234721128370686
team_policy eval average team episode rewards of agent4: 70.0
team_policy eval idv catch total num of agent4: 12
team_policy eval team catch total num: 28
idv_policy eval average step individual rewards of agent0: 0.8093794956052439
idv_policy eval average team episode rewards of agent0: 95.0
idv_policy eval idv catch total num of agent0: 34
idv_policy eval team catch total num: 38
idv_policy eval average step individual rewards of agent1: 0.6084467238523785
idv_policy eval average team episode rewards of agent1: 95.0
idv_policy eval idv catch total num of agent1: 26
idv_policy eval team catch total num: 38
idv_policy eval average step individual rewards of agent2: 0.7327519797442221
idv_policy eval average team episode rewards of agent2: 95.0
idv_policy eval idv catch total num of agent2: 31
idv_policy eval team catch total num: 38
idv_policy eval average step individual rewards of agent3: 0.6192560509496756
idv_policy eval average team episode rewards of agent3: 95.0
idv_policy eval idv catch total num of agent3: 26
idv_policy eval team catch total num: 38
idv_policy eval average step individual rewards of agent4: 0.6687951425657568
idv_policy eval average team episode rewards of agent4: 95.0
idv_policy eval idv catch total num of agent4: 28
idv_policy eval team catch total num: 38

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 701/10000 episodes, total num timesteps 140400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 702/10000 episodes, total num timesteps 140600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 703/10000 episodes, total num timesteps 140800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 704/10000 episodes, total num timesteps 141000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 705/10000 episodes, total num timesteps 141200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 706/10000 episodes, total num timesteps 141400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 707/10000 episodes, total num timesteps 141600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 708/10000 episodes, total num timesteps 141800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 709/10000 episodes, total num timesteps 142000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 710/10000 episodes, total num timesteps 142200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 711/10000 episodes, total num timesteps 142400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 712/10000 episodes, total num timesteps 142600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 713/10000 episodes, total num timesteps 142800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 714/10000 episodes, total num timesteps 143000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 715/10000 episodes, total num timesteps 143200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 716/10000 episodes, total num timesteps 143400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 717/10000 episodes, total num timesteps 143600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 718/10000 episodes, total num timesteps 143800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 719/10000 episodes, total num timesteps 144000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 720/10000 episodes, total num timesteps 144200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 721/10000 episodes, total num timesteps 144400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 722/10000 episodes, total num timesteps 144600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 723/10000 episodes, total num timesteps 144800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 724/10000 episodes, total num timesteps 145000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 725/10000 episodes, total num timesteps 145200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.17644929168970833
team_policy eval average team episode rewards of agent0: 17.5
team_policy eval idv catch total num of agent0: 9
team_policy eval team catch total num: 7
team_policy eval average step individual rewards of agent1: 0.14926934758163868
team_policy eval average team episode rewards of agent1: 17.5
team_policy eval idv catch total num of agent1: 8
team_policy eval team catch total num: 7
team_policy eval average step individual rewards of agent2: 0.07612998517874466
team_policy eval average team episode rewards of agent2: 17.5
team_policy eval idv catch total num of agent2: 5
team_policy eval team catch total num: 7
team_policy eval average step individual rewards of agent3: 0.026789928806604503
team_policy eval average team episode rewards of agent3: 17.5
team_policy eval idv catch total num of agent3: 3
team_policy eval team catch total num: 7
team_policy eval average step individual rewards of agent4: 0.02239579375924702
team_policy eval average team episode rewards of agent4: 17.5
team_policy eval idv catch total num of agent4: 3
team_policy eval team catch total num: 7
idv_policy eval average step individual rewards of agent0: 0.3900702166508176
idv_policy eval average team episode rewards of agent0: 42.5
idv_policy eval idv catch total num of agent0: 17
idv_policy eval team catch total num: 17
idv_policy eval average step individual rewards of agent1: 0.33707795731409484
idv_policy eval average team episode rewards of agent1: 42.5
idv_policy eval idv catch total num of agent1: 15
idv_policy eval team catch total num: 17
idv_policy eval average step individual rewards of agent2: 0.30839427138174424
idv_policy eval average team episode rewards of agent2: 42.5
idv_policy eval idv catch total num of agent2: 14
idv_policy eval team catch total num: 17
idv_policy eval average step individual rewards of agent3: 0.33647536167444175
idv_policy eval average team episode rewards of agent3: 42.5
idv_policy eval idv catch total num of agent3: 15
idv_policy eval team catch total num: 17
idv_policy eval average step individual rewards of agent4: 0.23709714727075162
idv_policy eval average team episode rewards of agent4: 42.5
idv_policy eval idv catch total num of agent4: 11
idv_policy eval team catch total num: 17

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 726/10000 episodes, total num timesteps 145400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 727/10000 episodes, total num timesteps 145600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 728/10000 episodes, total num timesteps 145800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 729/10000 episodes, total num timesteps 146000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 730/10000 episodes, total num timesteps 146200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 731/10000 episodes, total num timesteps 146400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 732/10000 episodes, total num timesteps 146600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 733/10000 episodes, total num timesteps 146800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 734/10000 episodes, total num timesteps 147000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 735/10000 episodes, total num timesteps 147200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 736/10000 episodes, total num timesteps 147400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 737/10000 episodes, total num timesteps 147600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 738/10000 episodes, total num timesteps 147800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 739/10000 episodes, total num timesteps 148000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 740/10000 episodes, total num timesteps 148200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 741/10000 episodes, total num timesteps 148400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 742/10000 episodes, total num timesteps 148600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 743/10000 episodes, total num timesteps 148800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 744/10000 episodes, total num timesteps 149000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 745/10000 episodes, total num timesteps 149200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 746/10000 episodes, total num timesteps 149400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 747/10000 episodes, total num timesteps 149600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 748/10000 episodes, total num timesteps 149800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 749/10000 episodes, total num timesteps 150000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 750/10000 episodes, total num timesteps 150200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.46166960125112866
team_policy eval average team episode rewards of agent0: 65.0
team_policy eval idv catch total num of agent0: 20
team_policy eval team catch total num: 26
team_policy eval average step individual rewards of agent1: 0.38025685267296455
team_policy eval average team episode rewards of agent1: 65.0
team_policy eval idv catch total num of agent1: 17
team_policy eval team catch total num: 26
team_policy eval average step individual rewards of agent2: 0.5367066173961343
team_policy eval average team episode rewards of agent2: 65.0
team_policy eval idv catch total num of agent2: 23
team_policy eval team catch total num: 26
team_policy eval average step individual rewards of agent3: 0.455634496285757
team_policy eval average team episode rewards of agent3: 65.0
team_policy eval idv catch total num of agent3: 20
team_policy eval team catch total num: 26
team_policy eval average step individual rewards of agent4: 0.4848039209213185
team_policy eval average team episode rewards of agent4: 65.0
team_policy eval idv catch total num of agent4: 21
team_policy eval team catch total num: 26
idv_policy eval average step individual rewards of agent0: 0.311502370543851
idv_policy eval average team episode rewards of agent0: 75.0
idv_policy eval idv catch total num of agent0: 14
idv_policy eval team catch total num: 30
idv_policy eval average step individual rewards of agent1: 0.6655302780365793
idv_policy eval average team episode rewards of agent1: 75.0
idv_policy eval idv catch total num of agent1: 28
idv_policy eval team catch total num: 30
idv_policy eval average step individual rewards of agent2: 0.7184234796247663
idv_policy eval average team episode rewards of agent2: 75.0
idv_policy eval idv catch total num of agent2: 30
idv_policy eval team catch total num: 30
idv_policy eval average step individual rewards of agent3: 0.532191060434897
idv_policy eval average team episode rewards of agent3: 75.0
idv_policy eval idv catch total num of agent3: 23
idv_policy eval team catch total num: 30
idv_policy eval average step individual rewards of agent4: 0.5919510650556978
idv_policy eval average team episode rewards of agent4: 75.0
idv_policy eval idv catch total num of agent4: 25
idv_policy eval team catch total num: 30

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 751/10000 episodes, total num timesteps 150400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 752/10000 episodes, total num timesteps 150600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 753/10000 episodes, total num timesteps 150800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 754/10000 episodes, total num timesteps 151000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 755/10000 episodes, total num timesteps 151200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 756/10000 episodes, total num timesteps 151400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 757/10000 episodes, total num timesteps 151600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 758/10000 episodes, total num timesteps 151800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 759/10000 episodes, total num timesteps 152000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 760/10000 episodes, total num timesteps 152200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 761/10000 episodes, total num timesteps 152400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 762/10000 episodes, total num timesteps 152600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 763/10000 episodes, total num timesteps 152800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 764/10000 episodes, total num timesteps 153000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 765/10000 episodes, total num timesteps 153200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 766/10000 episodes, total num timesteps 153400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 767/10000 episodes, total num timesteps 153600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 768/10000 episodes, total num timesteps 153800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 769/10000 episodes, total num timesteps 154000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 770/10000 episodes, total num timesteps 154200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 771/10000 episodes, total num timesteps 154400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 772/10000 episodes, total num timesteps 154600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 773/10000 episodes, total num timesteps 154800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 774/10000 episodes, total num timesteps 155000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 775/10000 episodes, total num timesteps 155200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.6648316303360595
team_policy eval average team episode rewards of agent0: 90.0
team_policy eval idv catch total num of agent0: 28
team_policy eval team catch total num: 36
team_policy eval average step individual rewards of agent1: 0.8385504000105343
team_policy eval average team episode rewards of agent1: 90.0
team_policy eval idv catch total num of agent1: 35
team_policy eval team catch total num: 36
team_policy eval average step individual rewards of agent2: 0.6806926481850161
team_policy eval average team episode rewards of agent2: 90.0
team_policy eval idv catch total num of agent2: 29
team_policy eval team catch total num: 36
team_policy eval average step individual rewards of agent3: 0.4790138546476706
team_policy eval average team episode rewards of agent3: 90.0
team_policy eval idv catch total num of agent3: 21
team_policy eval team catch total num: 36
team_policy eval average step individual rewards of agent4: 0.8117630666808426
team_policy eval average team episode rewards of agent4: 90.0
team_policy eval idv catch total num of agent4: 34
team_policy eval team catch total num: 36
idv_policy eval average step individual rewards of agent0: 0.48314805877153977
idv_policy eval average team episode rewards of agent0: 60.0
idv_policy eval idv catch total num of agent0: 21
idv_policy eval team catch total num: 24
idv_policy eval average step individual rewards of agent1: 0.44544932192799847
idv_policy eval average team episode rewards of agent1: 60.0
idv_policy eval idv catch total num of agent1: 19
idv_policy eval team catch total num: 24
idv_policy eval average step individual rewards of agent2: 0.515213519902542
idv_policy eval average team episode rewards of agent2: 60.0
idv_policy eval idv catch total num of agent2: 22
idv_policy eval team catch total num: 24
idv_policy eval average step individual rewards of agent3: 0.44349367440480897
idv_policy eval average team episode rewards of agent3: 60.0
idv_policy eval idv catch total num of agent3: 19
idv_policy eval team catch total num: 24
idv_policy eval average step individual rewards of agent4: 0.7356443507763177
idv_policy eval average team episode rewards of agent4: 60.0
idv_policy eval idv catch total num of agent4: 31
idv_policy eval team catch total num: 24

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 776/10000 episodes, total num timesteps 155400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 777/10000 episodes, total num timesteps 155600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 778/10000 episodes, total num timesteps 155800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 779/10000 episodes, total num timesteps 156000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 780/10000 episodes, total num timesteps 156200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 781/10000 episodes, total num timesteps 156400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 782/10000 episodes, total num timesteps 156600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 783/10000 episodes, total num timesteps 156800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 784/10000 episodes, total num timesteps 157000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 785/10000 episodes, total num timesteps 157200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 786/10000 episodes, total num timesteps 157400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 787/10000 episodes, total num timesteps 157600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 788/10000 episodes, total num timesteps 157800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 789/10000 episodes, total num timesteps 158000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 790/10000 episodes, total num timesteps 158200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 791/10000 episodes, total num timesteps 158400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 792/10000 episodes, total num timesteps 158600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 793/10000 episodes, total num timesteps 158800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 794/10000 episodes, total num timesteps 159000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 795/10000 episodes, total num timesteps 159200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 796/10000 episodes, total num timesteps 159400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 797/10000 episodes, total num timesteps 159600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 798/10000 episodes, total num timesteps 159800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 799/10000 episodes, total num timesteps 160000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 800/10000 episodes, total num timesteps 160200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.6642686767268335
team_policy eval average team episode rewards of agent0: 77.5
team_policy eval idv catch total num of agent0: 28
team_policy eval team catch total num: 31
team_policy eval average step individual rewards of agent1: 0.4738922881397595
team_policy eval average team episode rewards of agent1: 77.5
team_policy eval idv catch total num of agent1: 21
team_policy eval team catch total num: 31
team_policy eval average step individual rewards of agent2: 0.4085792056687446
team_policy eval average team episode rewards of agent2: 77.5
team_policy eval idv catch total num of agent2: 18
team_policy eval team catch total num: 31
team_policy eval average step individual rewards of agent3: 0.5093755860557965
team_policy eval average team episode rewards of agent3: 77.5
team_policy eval idv catch total num of agent3: 22
team_policy eval team catch total num: 31
team_policy eval average step individual rewards of agent4: 0.7648997826137867
team_policy eval average team episode rewards of agent4: 77.5
team_policy eval idv catch total num of agent4: 32
team_policy eval team catch total num: 31
idv_policy eval average step individual rewards of agent0: 0.4849353184156169
idv_policy eval average team episode rewards of agent0: 75.0
idv_policy eval idv catch total num of agent0: 21
idv_policy eval team catch total num: 30
idv_policy eval average step individual rewards of agent1: 0.6370125379962315
idv_policy eval average team episode rewards of agent1: 75.0
idv_policy eval idv catch total num of agent1: 27
idv_policy eval team catch total num: 30
idv_policy eval average step individual rewards of agent2: 0.6352147811690376
idv_policy eval average team episode rewards of agent2: 75.0
idv_policy eval idv catch total num of agent2: 27
idv_policy eval team catch total num: 30
idv_policy eval average step individual rewards of agent3: 0.6319048735747477
idv_policy eval average team episode rewards of agent3: 75.0
idv_policy eval idv catch total num of agent3: 27
idv_policy eval team catch total num: 30
idv_policy eval average step individual rewards of agent4: 0.4384055014017326
idv_policy eval average team episode rewards of agent4: 75.0
idv_policy eval idv catch total num of agent4: 19
idv_policy eval team catch total num: 30

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 801/10000 episodes, total num timesteps 160400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 802/10000 episodes, total num timesteps 160600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 803/10000 episodes, total num timesteps 160800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 804/10000 episodes, total num timesteps 161000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 805/10000 episodes, total num timesteps 161200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 806/10000 episodes, total num timesteps 161400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 807/10000 episodes, total num timesteps 161600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 808/10000 episodes, total num timesteps 161800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 809/10000 episodes, total num timesteps 162000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 810/10000 episodes, total num timesteps 162200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 811/10000 episodes, total num timesteps 162400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 812/10000 episodes, total num timesteps 162600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 813/10000 episodes, total num timesteps 162800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 814/10000 episodes, total num timesteps 163000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 815/10000 episodes, total num timesteps 163200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 816/10000 episodes, total num timesteps 163400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 817/10000 episodes, total num timesteps 163600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 818/10000 episodes, total num timesteps 163800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 819/10000 episodes, total num timesteps 164000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 820/10000 episodes, total num timesteps 164200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 821/10000 episodes, total num timesteps 164400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 822/10000 episodes, total num timesteps 164600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 823/10000 episodes, total num timesteps 164800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 824/10000 episodes, total num timesteps 165000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 825/10000 episodes, total num timesteps 165200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.433811358122122
team_policy eval average team episode rewards of agent0: 45.0
team_policy eval idv catch total num of agent0: 19
team_policy eval team catch total num: 18
team_policy eval average step individual rewards of agent1: 0.3100329840682201
team_policy eval average team episode rewards of agent1: 45.0
team_policy eval idv catch total num of agent1: 14
team_policy eval team catch total num: 18
team_policy eval average step individual rewards of agent2: 0.3284210555463292
team_policy eval average team episode rewards of agent2: 45.0
team_policy eval idv catch total num of agent2: 15
team_policy eval team catch total num: 18
team_policy eval average step individual rewards of agent3: 0.3573405651998586
team_policy eval average team episode rewards of agent3: 45.0
team_policy eval idv catch total num of agent3: 16
team_policy eval team catch total num: 18
team_policy eval average step individual rewards of agent4: 0.3055919086822799
team_policy eval average team episode rewards of agent4: 45.0
team_policy eval idv catch total num of agent4: 14
team_policy eval team catch total num: 18
idv_policy eval average step individual rewards of agent0: 0.7146019624607396
idv_policy eval average team episode rewards of agent0: 77.5
idv_policy eval idv catch total num of agent0: 30
idv_policy eval team catch total num: 31
idv_policy eval average step individual rewards of agent1: 0.5879319423675716
idv_policy eval average team episode rewards of agent1: 77.5
idv_policy eval idv catch total num of agent1: 25
idv_policy eval team catch total num: 31
idv_policy eval average step individual rewards of agent2: 0.6651787335443466
idv_policy eval average team episode rewards of agent2: 77.5
idv_policy eval idv catch total num of agent2: 28
idv_policy eval team catch total num: 31
idv_policy eval average step individual rewards of agent3: 0.36706250603431356
idv_policy eval average team episode rewards of agent3: 77.5
idv_policy eval idv catch total num of agent3: 16
idv_policy eval team catch total num: 31
idv_policy eval average step individual rewards of agent4: 0.5123079989474015
idv_policy eval average team episode rewards of agent4: 77.5
idv_policy eval idv catch total num of agent4: 22
idv_policy eval team catch total num: 31

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 826/10000 episodes, total num timesteps 165400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 827/10000 episodes, total num timesteps 165600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 828/10000 episodes, total num timesteps 165800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 829/10000 episodes, total num timesteps 166000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 830/10000 episodes, total num timesteps 166200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 831/10000 episodes, total num timesteps 166400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 832/10000 episodes, total num timesteps 166600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 833/10000 episodes, total num timesteps 166800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 834/10000 episodes, total num timesteps 167000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 835/10000 episodes, total num timesteps 167200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 836/10000 episodes, total num timesteps 167400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 837/10000 episodes, total num timesteps 167600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 838/10000 episodes, total num timesteps 167800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 839/10000 episodes, total num timesteps 168000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 840/10000 episodes, total num timesteps 168200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 841/10000 episodes, total num timesteps 168400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 842/10000 episodes, total num timesteps 168600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 843/10000 episodes, total num timesteps 168800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 844/10000 episodes, total num timesteps 169000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 845/10000 episodes, total num timesteps 169200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 846/10000 episodes, total num timesteps 169400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 847/10000 episodes, total num timesteps 169600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 848/10000 episodes, total num timesteps 169800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 849/10000 episodes, total num timesteps 170000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 850/10000 episodes, total num timesteps 170200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.35864171483200863
team_policy eval average team episode rewards of agent0: 52.5
team_policy eval idv catch total num of agent0: 16
team_policy eval team catch total num: 21
team_policy eval average step individual rewards of agent1: 0.2814713945737248
team_policy eval average team episode rewards of agent1: 52.5
team_policy eval idv catch total num of agent1: 13
team_policy eval team catch total num: 21
team_policy eval average step individual rewards of agent2: 0.5918354640336623
team_policy eval average team episode rewards of agent2: 52.5
team_policy eval idv catch total num of agent2: 25
team_policy eval team catch total num: 21
team_policy eval average step individual rewards of agent3: 0.3288756936031305
team_policy eval average team episode rewards of agent3: 52.5
team_policy eval idv catch total num of agent3: 15
team_policy eval team catch total num: 21
team_policy eval average step individual rewards of agent4: 0.6388736407971505
team_policy eval average team episode rewards of agent4: 52.5
team_policy eval idv catch total num of agent4: 27
team_policy eval team catch total num: 21
idv_policy eval average step individual rewards of agent0: 0.6105867934204228
idv_policy eval average team episode rewards of agent0: 65.0
idv_policy eval idv catch total num of agent0: 26
idv_policy eval team catch total num: 26
idv_policy eval average step individual rewards of agent1: 0.43080715128432956
idv_policy eval average team episode rewards of agent1: 65.0
idv_policy eval idv catch total num of agent1: 19
idv_policy eval team catch total num: 26
idv_policy eval average step individual rewards of agent2: 0.39787381867184995
idv_policy eval average team episode rewards of agent2: 65.0
idv_policy eval idv catch total num of agent2: 18
idv_policy eval team catch total num: 26
idv_policy eval average step individual rewards of agent3: 0.7334093051205365
idv_policy eval average team episode rewards of agent3: 65.0
idv_policy eval idv catch total num of agent3: 31
idv_policy eval team catch total num: 26
idv_policy eval average step individual rewards of agent4: 0.2794657812067918
idv_policy eval average team episode rewards of agent4: 65.0
idv_policy eval idv catch total num of agent4: 13
idv_policy eval team catch total num: 26

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 851/10000 episodes, total num timesteps 170400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 852/10000 episodes, total num timesteps 170600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 853/10000 episodes, total num timesteps 170800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 854/10000 episodes, total num timesteps 171000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 855/10000 episodes, total num timesteps 171200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 856/10000 episodes, total num timesteps 171400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 857/10000 episodes, total num timesteps 171600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 858/10000 episodes, total num timesteps 171800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 859/10000 episodes, total num timesteps 172000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 860/10000 episodes, total num timesteps 172200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 861/10000 episodes, total num timesteps 172400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 862/10000 episodes, total num timesteps 172600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 863/10000 episodes, total num timesteps 172800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 864/10000 episodes, total num timesteps 173000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 865/10000 episodes, total num timesteps 173200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 866/10000 episodes, total num timesteps 173400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 867/10000 episodes, total num timesteps 173600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 868/10000 episodes, total num timesteps 173800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 869/10000 episodes, total num timesteps 174000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 870/10000 episodes, total num timesteps 174200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 871/10000 episodes, total num timesteps 174400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 872/10000 episodes, total num timesteps 174600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 873/10000 episodes, total num timesteps 174800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 874/10000 episodes, total num timesteps 175000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 875/10000 episodes, total num timesteps 175200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.6842886549165328
team_policy eval average team episode rewards of agent0: 85.0
team_policy eval idv catch total num of agent0: 29
team_policy eval team catch total num: 34
team_policy eval average step individual rewards of agent1: 0.4190121644272831
team_policy eval average team episode rewards of agent1: 85.0
team_policy eval idv catch total num of agent1: 18
team_policy eval team catch total num: 34
team_policy eval average step individual rewards of agent2: 0.6382346746573165
team_policy eval average team episode rewards of agent2: 85.0
team_policy eval idv catch total num of agent2: 27
team_policy eval team catch total num: 34
team_policy eval average step individual rewards of agent3: 0.7879277120623607
team_policy eval average team episode rewards of agent3: 85.0
team_policy eval idv catch total num of agent3: 33
team_policy eval team catch total num: 34
team_policy eval average step individual rewards of agent4: 0.45981142103582157
team_policy eval average team episode rewards of agent4: 85.0
team_policy eval idv catch total num of agent4: 20
team_policy eval team catch total num: 34
idv_policy eval average step individual rewards of agent0: 0.487692262073565
idv_policy eval average team episode rewards of agent0: 52.5
idv_policy eval idv catch total num of agent0: 21
idv_policy eval team catch total num: 21
idv_policy eval average step individual rewards of agent1: 0.43995075461755156
idv_policy eval average team episode rewards of agent1: 52.5
idv_policy eval idv catch total num of agent1: 19
idv_policy eval team catch total num: 21
idv_policy eval average step individual rewards of agent2: 0.5913216753099154
idv_policy eval average team episode rewards of agent2: 52.5
idv_policy eval idv catch total num of agent2: 25
idv_policy eval team catch total num: 21
idv_policy eval average step individual rewards of agent3: 0.4413326275084659
idv_policy eval average team episode rewards of agent3: 52.5
idv_policy eval idv catch total num of agent3: 19
idv_policy eval team catch total num: 21
idv_policy eval average step individual rewards of agent4: 0.326610657170507
idv_policy eval average team episode rewards of agent4: 52.5
idv_policy eval idv catch total num of agent4: 15
idv_policy eval team catch total num: 21

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 876/10000 episodes, total num timesteps 175400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 877/10000 episodes, total num timesteps 175600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 878/10000 episodes, total num timesteps 175800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 879/10000 episodes, total num timesteps 176000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 880/10000 episodes, total num timesteps 176200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 881/10000 episodes, total num timesteps 176400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 882/10000 episodes, total num timesteps 176600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 883/10000 episodes, total num timesteps 176800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 884/10000 episodes, total num timesteps 177000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 885/10000 episodes, total num timesteps 177200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 886/10000 episodes, total num timesteps 177400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 887/10000 episodes, total num timesteps 177600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 888/10000 episodes, total num timesteps 177800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 889/10000 episodes, total num timesteps 178000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 890/10000 episodes, total num timesteps 178200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 891/10000 episodes, total num timesteps 178400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 892/10000 episodes, total num timesteps 178600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 893/10000 episodes, total num timesteps 178800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 894/10000 episodes, total num timesteps 179000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 895/10000 episodes, total num timesteps 179200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 896/10000 episodes, total num timesteps 179400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 897/10000 episodes, total num timesteps 179600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 898/10000 episodes, total num timesteps 179800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 899/10000 episodes, total num timesteps 180000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 900/10000 episodes, total num timesteps 180200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.310791059507437
team_policy eval average team episode rewards of agent0: 55.0
team_policy eval idv catch total num of agent0: 14
team_policy eval team catch total num: 22
team_policy eval average step individual rewards of agent1: 0.6769064969549804
team_policy eval average team episode rewards of agent1: 55.0
team_policy eval idv catch total num of agent1: 29
team_policy eval team catch total num: 22
team_policy eval average step individual rewards of agent2: 0.48275759855899814
team_policy eval average team episode rewards of agent2: 55.0
team_policy eval idv catch total num of agent2: 21
team_policy eval team catch total num: 22
team_policy eval average step individual rewards of agent3: 0.2878413395042958
team_policy eval average team episode rewards of agent3: 55.0
team_policy eval idv catch total num of agent3: 14
team_policy eval team catch total num: 22
team_policy eval average step individual rewards of agent4: 0.4625818139481717
team_policy eval average team episode rewards of agent4: 55.0
team_policy eval idv catch total num of agent4: 20
team_policy eval team catch total num: 22
idv_policy eval average step individual rewards of agent0: 0.7893600547344828
idv_policy eval average team episode rewards of agent0: 110.0
idv_policy eval idv catch total num of agent0: 33
idv_policy eval team catch total num: 44
idv_policy eval average step individual rewards of agent1: 0.5852868281204698
idv_policy eval average team episode rewards of agent1: 110.0
idv_policy eval idv catch total num of agent1: 25
idv_policy eval team catch total num: 44
idv_policy eval average step individual rewards of agent2: 0.812020733156393
idv_policy eval average team episode rewards of agent2: 110.0
idv_policy eval idv catch total num of agent2: 34
idv_policy eval team catch total num: 44
idv_policy eval average step individual rewards of agent3: 0.965360565433895
idv_policy eval average team episode rewards of agent3: 110.0
idv_policy eval idv catch total num of agent3: 40
idv_policy eval team catch total num: 44
idv_policy eval average step individual rewards of agent4: 0.8943425873807118
idv_policy eval average team episode rewards of agent4: 110.0
idv_policy eval idv catch total num of agent4: 37
idv_policy eval team catch total num: 44

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 901/10000 episodes, total num timesteps 180400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 902/10000 episodes, total num timesteps 180600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 903/10000 episodes, total num timesteps 180800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 904/10000 episodes, total num timesteps 181000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 905/10000 episodes, total num timesteps 181200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 906/10000 episodes, total num timesteps 181400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 907/10000 episodes, total num timesteps 181600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 908/10000 episodes, total num timesteps 181800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 909/10000 episodes, total num timesteps 182000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 910/10000 episodes, total num timesteps 182200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 911/10000 episodes, total num timesteps 182400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 912/10000 episodes, total num timesteps 182600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 913/10000 episodes, total num timesteps 182800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 914/10000 episodes, total num timesteps 183000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 915/10000 episodes, total num timesteps 183200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 916/10000 episodes, total num timesteps 183400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 917/10000 episodes, total num timesteps 183600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 918/10000 episodes, total num timesteps 183800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 919/10000 episodes, total num timesteps 184000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 920/10000 episodes, total num timesteps 184200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 921/10000 episodes, total num timesteps 184400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 922/10000 episodes, total num timesteps 184600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 923/10000 episodes, total num timesteps 184800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 924/10000 episodes, total num timesteps 185000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 925/10000 episodes, total num timesteps 185200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.5414228765833683
team_policy eval average team episode rewards of agent0: 57.5
team_policy eval idv catch total num of agent0: 23
team_policy eval team catch total num: 23
team_policy eval average step individual rewards of agent1: 0.30366113221672303
team_policy eval average team episode rewards of agent1: 57.5
team_policy eval idv catch total num of agent1: 14
team_policy eval team catch total num: 23
team_policy eval average step individual rewards of agent2: 0.3628210662533662
team_policy eval average team episode rewards of agent2: 57.5
team_policy eval idv catch total num of agent2: 16
team_policy eval team catch total num: 23
team_policy eval average step individual rewards of agent3: 0.423582255736615
team_policy eval average team episode rewards of agent3: 57.5
team_policy eval idv catch total num of agent3: 19
team_policy eval team catch total num: 23
team_policy eval average step individual rewards of agent4: 0.7741297754320273
team_policy eval average team episode rewards of agent4: 57.5
team_policy eval idv catch total num of agent4: 32
team_policy eval team catch total num: 23
idv_policy eval average step individual rewards of agent0: 0.6411268526460941
idv_policy eval average team episode rewards of agent0: 60.0
idv_policy eval idv catch total num of agent0: 27
idv_policy eval team catch total num: 24
idv_policy eval average step individual rewards of agent1: 0.4635249866386677
idv_policy eval average team episode rewards of agent1: 60.0
idv_policy eval idv catch total num of agent1: 20
idv_policy eval team catch total num: 24
idv_policy eval average step individual rewards of agent2: 0.35600263796447246
idv_policy eval average team episode rewards of agent2: 60.0
idv_policy eval idv catch total num of agent2: 16
idv_policy eval team catch total num: 24
idv_policy eval average step individual rewards of agent3: 0.20030725920132095
idv_policy eval average team episode rewards of agent3: 60.0
idv_policy eval idv catch total num of agent3: 10
idv_policy eval team catch total num: 24
idv_policy eval average step individual rewards of agent4: 0.6145032983163774
idv_policy eval average team episode rewards of agent4: 60.0
idv_policy eval idv catch total num of agent4: 26
idv_policy eval team catch total num: 24

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 926/10000 episodes, total num timesteps 185400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 927/10000 episodes, total num timesteps 185600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 928/10000 episodes, total num timesteps 185800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 929/10000 episodes, total num timesteps 186000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 930/10000 episodes, total num timesteps 186200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 931/10000 episodes, total num timesteps 186400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 932/10000 episodes, total num timesteps 186600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 933/10000 episodes, total num timesteps 186800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 934/10000 episodes, total num timesteps 187000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 935/10000 episodes, total num timesteps 187200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 936/10000 episodes, total num timesteps 187400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 937/10000 episodes, total num timesteps 187600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 938/10000 episodes, total num timesteps 187800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 939/10000 episodes, total num timesteps 188000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 940/10000 episodes, total num timesteps 188200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 941/10000 episodes, total num timesteps 188400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 942/10000 episodes, total num timesteps 188600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 943/10000 episodes, total num timesteps 188800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 944/10000 episodes, total num timesteps 189000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 945/10000 episodes, total num timesteps 189200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 946/10000 episodes, total num timesteps 189400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 947/10000 episodes, total num timesteps 189600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 948/10000 episodes, total num timesteps 189800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 949/10000 episodes, total num timesteps 190000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 950/10000 episodes, total num timesteps 190200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.41003557381119576
team_policy eval average team episode rewards of agent0: 70.0
team_policy eval idv catch total num of agent0: 18
team_policy eval team catch total num: 28
team_policy eval average step individual rewards of agent1: 0.7356259265685433
team_policy eval average team episode rewards of agent1: 70.0
team_policy eval idv catch total num of agent1: 31
team_policy eval team catch total num: 28
team_policy eval average step individual rewards of agent2: 0.7167495646790872
team_policy eval average team episode rewards of agent2: 70.0
team_policy eval idv catch total num of agent2: 30
team_policy eval team catch total num: 28
team_policy eval average step individual rewards of agent3: 0.3561621458114177
team_policy eval average team episode rewards of agent3: 70.0
team_policy eval idv catch total num of agent3: 16
team_policy eval team catch total num: 28
team_policy eval average step individual rewards of agent4: 0.7363314673683796
team_policy eval average team episode rewards of agent4: 70.0
team_policy eval idv catch total num of agent4: 31
team_policy eval team catch total num: 28
idv_policy eval average step individual rewards of agent0: 0.49328480878120046
idv_policy eval average team episode rewards of agent0: 52.5
idv_policy eval idv catch total num of agent0: 21
idv_policy eval team catch total num: 21
idv_policy eval average step individual rewards of agent1: 0.3847990717905711
idv_policy eval average team episode rewards of agent1: 52.5
idv_policy eval idv catch total num of agent1: 17
idv_policy eval team catch total num: 21
idv_policy eval average step individual rewards of agent2: 0.36089430914101556
idv_policy eval average team episode rewards of agent2: 52.5
idv_policy eval idv catch total num of agent2: 16
idv_policy eval team catch total num: 21
idv_policy eval average step individual rewards of agent3: 0.5571837560738828
idv_policy eval average team episode rewards of agent3: 52.5
idv_policy eval idv catch total num of agent3: 24
idv_policy eval team catch total num: 21
idv_policy eval average step individual rewards of agent4: 0.38549254059191385
idv_policy eval average team episode rewards of agent4: 52.5
idv_policy eval idv catch total num of agent4: 17
idv_policy eval team catch total num: 21

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 951/10000 episodes, total num timesteps 190400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 952/10000 episodes, total num timesteps 190600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 953/10000 episodes, total num timesteps 190800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 954/10000 episodes, total num timesteps 191000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 955/10000 episodes, total num timesteps 191200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 956/10000 episodes, total num timesteps 191400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 957/10000 episodes, total num timesteps 191600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 958/10000 episodes, total num timesteps 191800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 959/10000 episodes, total num timesteps 192000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 960/10000 episodes, total num timesteps 192200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 961/10000 episodes, total num timesteps 192400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 962/10000 episodes, total num timesteps 192600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 963/10000 episodes, total num timesteps 192800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 964/10000 episodes, total num timesteps 193000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 965/10000 episodes, total num timesteps 193200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 966/10000 episodes, total num timesteps 193400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 967/10000 episodes, total num timesteps 193600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 968/10000 episodes, total num timesteps 193800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 969/10000 episodes, total num timesteps 194000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 970/10000 episodes, total num timesteps 194200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 971/10000 episodes, total num timesteps 194400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 972/10000 episodes, total num timesteps 194600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 973/10000 episodes, total num timesteps 194800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 974/10000 episodes, total num timesteps 195000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 975/10000 episodes, total num timesteps 195200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 1.065802562849733
team_policy eval average team episode rewards of agent0: 122.5
team_policy eval idv catch total num of agent0: 44
team_policy eval team catch total num: 49
team_policy eval average step individual rewards of agent1: 0.8680720038673734
team_policy eval average team episode rewards of agent1: 122.5
team_policy eval idv catch total num of agent1: 36
team_policy eval team catch total num: 49
team_policy eval average step individual rewards of agent2: 0.691299647649864
team_policy eval average team episode rewards of agent2: 122.5
team_policy eval idv catch total num of agent2: 29
team_policy eval team catch total num: 49
team_policy eval average step individual rewards of agent3: 0.9392264600569823
team_policy eval average team episode rewards of agent3: 122.5
team_policy eval idv catch total num of agent3: 39
team_policy eval team catch total num: 49
team_policy eval average step individual rewards of agent4: 0.8154978254453573
team_policy eval average team episode rewards of agent4: 122.5
team_policy eval idv catch total num of agent4: 34
team_policy eval team catch total num: 49
idv_policy eval average step individual rewards of agent0: 0.5377533745619982
idv_policy eval average team episode rewards of agent0: 82.5
idv_policy eval idv catch total num of agent0: 23
idv_policy eval team catch total num: 33
idv_policy eval average step individual rewards of agent1: 0.4328870333270154
idv_policy eval average team episode rewards of agent1: 82.5
idv_policy eval idv catch total num of agent1: 19
idv_policy eval team catch total num: 33
idv_policy eval average step individual rewards of agent2: 0.5644167540661527
idv_policy eval average team episode rewards of agent2: 82.5
idv_policy eval idv catch total num of agent2: 24
idv_policy eval team catch total num: 33
idv_policy eval average step individual rewards of agent3: 0.7419896457056904
idv_policy eval average team episode rewards of agent3: 82.5
idv_policy eval idv catch total num of agent3: 31
idv_policy eval team catch total num: 33
idv_policy eval average step individual rewards of agent4: 0.7663443726081133
idv_policy eval average team episode rewards of agent4: 82.5
idv_policy eval idv catch total num of agent4: 32
idv_policy eval team catch total num: 33

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 976/10000 episodes, total num timesteps 195400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 977/10000 episodes, total num timesteps 195600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 978/10000 episodes, total num timesteps 195800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 979/10000 episodes, total num timesteps 196000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 980/10000 episodes, total num timesteps 196200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 981/10000 episodes, total num timesteps 196400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 982/10000 episodes, total num timesteps 196600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 983/10000 episodes, total num timesteps 196800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 984/10000 episodes, total num timesteps 197000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 985/10000 episodes, total num timesteps 197200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 986/10000 episodes, total num timesteps 197400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 987/10000 episodes, total num timesteps 197600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 988/10000 episodes, total num timesteps 197800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 989/10000 episodes, total num timesteps 198000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 990/10000 episodes, total num timesteps 198200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 991/10000 episodes, total num timesteps 198400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 992/10000 episodes, total num timesteps 198600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 993/10000 episodes, total num timesteps 198800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 994/10000 episodes, total num timesteps 199000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 995/10000 episodes, total num timesteps 199200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 996/10000 episodes, total num timesteps 199400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 997/10000 episodes, total num timesteps 199600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 998/10000 episodes, total num timesteps 199800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 999/10000 episodes, total num timesteps 200000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1000/10000 episodes, total num timesteps 200200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.4036978317131424
team_policy eval average team episode rewards of agent0: 67.5
team_policy eval idv catch total num of agent0: 18
team_policy eval team catch total num: 27
team_policy eval average step individual rewards of agent1: 0.5076566787183439
team_policy eval average team episode rewards of agent1: 67.5
team_policy eval idv catch total num of agent1: 22
team_policy eval team catch total num: 27
team_policy eval average step individual rewards of agent2: 0.4121467192978359
team_policy eval average team episode rewards of agent2: 67.5
team_policy eval idv catch total num of agent2: 18
team_policy eval team catch total num: 27
team_policy eval average step individual rewards of agent3: 0.478117449021796
team_policy eval average team episode rewards of agent3: 67.5
team_policy eval idv catch total num of agent3: 21
team_policy eval team catch total num: 27
team_policy eval average step individual rewards of agent4: 0.6635278011536585
team_policy eval average team episode rewards of agent4: 67.5
team_policy eval idv catch total num of agent4: 28
team_policy eval team catch total num: 27
idv_policy eval average step individual rewards of agent0: 0.6107504538909692
idv_policy eval average team episode rewards of agent0: 80.0
idv_policy eval idv catch total num of agent0: 26
idv_policy eval team catch total num: 32
idv_policy eval average step individual rewards of agent1: 0.46024527569528423
idv_policy eval average team episode rewards of agent1: 80.0
idv_policy eval idv catch total num of agent1: 20
idv_policy eval team catch total num: 32
idv_policy eval average step individual rewards of agent2: 0.40670831606917657
idv_policy eval average team episode rewards of agent2: 80.0
idv_policy eval idv catch total num of agent2: 18
idv_policy eval team catch total num: 32
idv_policy eval average step individual rewards of agent3: 0.6395374014036892
idv_policy eval average team episode rewards of agent3: 80.0
idv_policy eval idv catch total num of agent3: 27
idv_policy eval team catch total num: 32
idv_policy eval average step individual rewards of agent4: 0.43090805845697633
idv_policy eval average team episode rewards of agent4: 80.0
idv_policy eval idv catch total num of agent4: 19
idv_policy eval team catch total num: 32

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1001/10000 episodes, total num timesteps 200400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1002/10000 episodes, total num timesteps 200600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1003/10000 episodes, total num timesteps 200800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1004/10000 episodes, total num timesteps 201000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1005/10000 episodes, total num timesteps 201200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1006/10000 episodes, total num timesteps 201400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1007/10000 episodes, total num timesteps 201600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1008/10000 episodes, total num timesteps 201800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1009/10000 episodes, total num timesteps 202000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1010/10000 episodes, total num timesteps 202200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1011/10000 episodes, total num timesteps 202400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1012/10000 episodes, total num timesteps 202600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1013/10000 episodes, total num timesteps 202800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1014/10000 episodes, total num timesteps 203000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1015/10000 episodes, total num timesteps 203200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1016/10000 episodes, total num timesteps 203400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1017/10000 episodes, total num timesteps 203600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1018/10000 episodes, total num timesteps 203800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1019/10000 episodes, total num timesteps 204000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1020/10000 episodes, total num timesteps 204200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1021/10000 episodes, total num timesteps 204400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1022/10000 episodes, total num timesteps 204600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1023/10000 episodes, total num timesteps 204800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1024/10000 episodes, total num timesteps 205000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1025/10000 episodes, total num timesteps 205200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 1.1383365076243177
team_policy eval average team episode rewards of agent0: 145.0
team_policy eval idv catch total num of agent0: 47
team_policy eval team catch total num: 58
team_policy eval average step individual rewards of agent1: 1.0215770221653466
team_policy eval average team episode rewards of agent1: 145.0
team_policy eval idv catch total num of agent1: 42
team_policy eval team catch total num: 58
team_policy eval average step individual rewards of agent2: 0.7633243557794072
team_policy eval average team episode rewards of agent2: 145.0
team_policy eval idv catch total num of agent2: 32
team_policy eval team catch total num: 58
team_policy eval average step individual rewards of agent3: 0.7161710622713607
team_policy eval average team episode rewards of agent3: 145.0
team_policy eval idv catch total num of agent3: 30
team_policy eval team catch total num: 58
team_policy eval average step individual rewards of agent4: 0.9762424309996898
team_policy eval average team episode rewards of agent4: 145.0
team_policy eval idv catch total num of agent4: 40
team_policy eval team catch total num: 58
idv_policy eval average step individual rewards of agent0: 0.27887605182058406
idv_policy eval average team episode rewards of agent0: 37.5
idv_policy eval idv catch total num of agent0: 13
idv_policy eval team catch total num: 15
idv_policy eval average step individual rewards of agent1: 0.21594774151150836
idv_policy eval average team episode rewards of agent1: 37.5
idv_policy eval idv catch total num of agent1: 10
idv_policy eval team catch total num: 15
idv_policy eval average step individual rewards of agent2: 0.4063255877980991
idv_policy eval average team episode rewards of agent2: 37.5
idv_policy eval idv catch total num of agent2: 18
idv_policy eval team catch total num: 15
idv_policy eval average step individual rewards of agent3: 0.2752947119680452
idv_policy eval average team episode rewards of agent3: 37.5
idv_policy eval idv catch total num of agent3: 13
idv_policy eval team catch total num: 15
idv_policy eval average step individual rewards of agent4: 0.13946055414327885
idv_policy eval average team episode rewards of agent4: 37.5
idv_policy eval idv catch total num of agent4: 8
idv_policy eval team catch total num: 15

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1026/10000 episodes, total num timesteps 205400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1027/10000 episodes, total num timesteps 205600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1028/10000 episodes, total num timesteps 205800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1029/10000 episodes, total num timesteps 206000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1030/10000 episodes, total num timesteps 206200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1031/10000 episodes, total num timesteps 206400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1032/10000 episodes, total num timesteps 206600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1033/10000 episodes, total num timesteps 206800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1034/10000 episodes, total num timesteps 207000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1035/10000 episodes, total num timesteps 207200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1036/10000 episodes, total num timesteps 207400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1037/10000 episodes, total num timesteps 207600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1038/10000 episodes, total num timesteps 207800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1039/10000 episodes, total num timesteps 208000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1040/10000 episodes, total num timesteps 208200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1041/10000 episodes, total num timesteps 208400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1042/10000 episodes, total num timesteps 208600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1043/10000 episodes, total num timesteps 208800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1044/10000 episodes, total num timesteps 209000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1045/10000 episodes, total num timesteps 209200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1046/10000 episodes, total num timesteps 209400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1047/10000 episodes, total num timesteps 209600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1048/10000 episodes, total num timesteps 209800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1049/10000 episodes, total num timesteps 210000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1050/10000 episodes, total num timesteps 210200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.48681154993360676
team_policy eval average team episode rewards of agent0: 87.5
team_policy eval idv catch total num of agent0: 21
team_policy eval team catch total num: 35
team_policy eval average step individual rewards of agent1: 0.453746000501837
team_policy eval average team episode rewards of agent1: 87.5
team_policy eval idv catch total num of agent1: 20
team_policy eval team catch total num: 35
team_policy eval average step individual rewards of agent2: 0.7410724019810337
team_policy eval average team episode rewards of agent2: 87.5
team_policy eval idv catch total num of agent2: 31
team_policy eval team catch total num: 35
team_policy eval average step individual rewards of agent3: 0.7880370362217022
team_policy eval average team episode rewards of agent3: 87.5
team_policy eval idv catch total num of agent3: 33
team_policy eval team catch total num: 35
team_policy eval average step individual rewards of agent4: 0.4516048898002627
team_policy eval average team episode rewards of agent4: 87.5
team_policy eval idv catch total num of agent4: 20
team_policy eval team catch total num: 35
idv_policy eval average step individual rewards of agent0: 0.478139266719359
idv_policy eval average team episode rewards of agent0: 37.5
idv_policy eval idv catch total num of agent0: 21
idv_policy eval team catch total num: 15
idv_policy eval average step individual rewards of agent1: 0.2002122015156135
idv_policy eval average team episode rewards of agent1: 37.5
idv_policy eval idv catch total num of agent1: 10
idv_policy eval team catch total num: 15
idv_policy eval average step individual rewards of agent2: 0.1039589276615067
idv_policy eval average team episode rewards of agent2: 37.5
idv_policy eval idv catch total num of agent2: 6
idv_policy eval team catch total num: 15
idv_policy eval average step individual rewards of agent3: 0.5166682186657102
idv_policy eval average team episode rewards of agent3: 37.5
idv_policy eval idv catch total num of agent3: 22
idv_policy eval team catch total num: 15
idv_policy eval average step individual rewards of agent4: 0.1478864730788234
idv_policy eval average team episode rewards of agent4: 37.5
idv_policy eval idv catch total num of agent4: 8
idv_policy eval team catch total num: 15

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1051/10000 episodes, total num timesteps 210400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1052/10000 episodes, total num timesteps 210600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1053/10000 episodes, total num timesteps 210800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1054/10000 episodes, total num timesteps 211000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1055/10000 episodes, total num timesteps 211200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1056/10000 episodes, total num timesteps 211400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1057/10000 episodes, total num timesteps 211600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1058/10000 episodes, total num timesteps 211800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1059/10000 episodes, total num timesteps 212000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1060/10000 episodes, total num timesteps 212200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1061/10000 episodes, total num timesteps 212400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1062/10000 episodes, total num timesteps 212600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1063/10000 episodes, total num timesteps 212800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1064/10000 episodes, total num timesteps 213000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1065/10000 episodes, total num timesteps 213200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1066/10000 episodes, total num timesteps 213400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1067/10000 episodes, total num timesteps 213600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1068/10000 episodes, total num timesteps 213800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1069/10000 episodes, total num timesteps 214000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1070/10000 episodes, total num timesteps 214200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1071/10000 episodes, total num timesteps 214400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1072/10000 episodes, total num timesteps 214600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1073/10000 episodes, total num timesteps 214800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1074/10000 episodes, total num timesteps 215000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1075/10000 episodes, total num timesteps 215200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.3617296881891782
team_policy eval average team episode rewards of agent0: 57.5
team_policy eval idv catch total num of agent0: 16
team_policy eval team catch total num: 23
team_policy eval average step individual rewards of agent1: 0.6403990178509978
team_policy eval average team episode rewards of agent1: 57.5
team_policy eval idv catch total num of agent1: 27
team_policy eval team catch total num: 23
team_policy eval average step individual rewards of agent2: 0.3552788534178818
team_policy eval average team episode rewards of agent2: 57.5
team_policy eval idv catch total num of agent2: 16
team_policy eval team catch total num: 23
team_policy eval average step individual rewards of agent3: 0.5659568117022883
team_policy eval average team episode rewards of agent3: 57.5
team_policy eval idv catch total num of agent3: 24
team_policy eval team catch total num: 23
team_policy eval average step individual rewards of agent4: 0.3247201055957094
team_policy eval average team episode rewards of agent4: 57.5
team_policy eval idv catch total num of agent4: 15
team_policy eval team catch total num: 23
idv_policy eval average step individual rewards of agent0: 0.9212777530675463
idv_policy eval average team episode rewards of agent0: 112.5
idv_policy eval idv catch total num of agent0: 38
idv_policy eval team catch total num: 45
idv_policy eval average step individual rewards of agent1: 0.7900853942181898
idv_policy eval average team episode rewards of agent1: 112.5
idv_policy eval idv catch total num of agent1: 33
idv_policy eval team catch total num: 45
idv_policy eval average step individual rewards of agent2: 0.6373784375380465
idv_policy eval average team episode rewards of agent2: 112.5
idv_policy eval idv catch total num of agent2: 27
idv_policy eval team catch total num: 45
idv_policy eval average step individual rewards of agent3: 1.0668988854395267
idv_policy eval average team episode rewards of agent3: 112.5
idv_policy eval idv catch total num of agent3: 44
idv_policy eval team catch total num: 45
idv_policy eval average step individual rewards of agent4: 0.7343929679074452
idv_policy eval average team episode rewards of agent4: 112.5
idv_policy eval idv catch total num of agent4: 31
idv_policy eval team catch total num: 45

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1076/10000 episodes, total num timesteps 215400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1077/10000 episodes, total num timesteps 215600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1078/10000 episodes, total num timesteps 215800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1079/10000 episodes, total num timesteps 216000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1080/10000 episodes, total num timesteps 216200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1081/10000 episodes, total num timesteps 216400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1082/10000 episodes, total num timesteps 216600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1083/10000 episodes, total num timesteps 216800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1084/10000 episodes, total num timesteps 217000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1085/10000 episodes, total num timesteps 217200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1086/10000 episodes, total num timesteps 217400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1087/10000 episodes, total num timesteps 217600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1088/10000 episodes, total num timesteps 217800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1089/10000 episodes, total num timesteps 218000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1090/10000 episodes, total num timesteps 218200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1091/10000 episodes, total num timesteps 218400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1092/10000 episodes, total num timesteps 218600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1093/10000 episodes, total num timesteps 218800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1094/10000 episodes, total num timesteps 219000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1095/10000 episodes, total num timesteps 219200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1096/10000 episodes, total num timesteps 219400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1097/10000 episodes, total num timesteps 219600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1098/10000 episodes, total num timesteps 219800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1099/10000 episodes, total num timesteps 220000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1100/10000 episodes, total num timesteps 220200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.5874599868854518
team_policy eval average team episode rewards of agent0: 82.5
team_policy eval idv catch total num of agent0: 25
team_policy eval team catch total num: 33
team_policy eval average step individual rewards of agent1: 0.5176926675464226
team_policy eval average team episode rewards of agent1: 82.5
team_policy eval idv catch total num of agent1: 22
team_policy eval team catch total num: 33
team_policy eval average step individual rewards of agent2: 0.5831059317388211
team_policy eval average team episode rewards of agent2: 82.5
team_policy eval idv catch total num of agent2: 25
team_policy eval team catch total num: 33
team_policy eval average step individual rewards of agent3: 0.5410283511791563
team_policy eval average team episode rewards of agent3: 82.5
team_policy eval idv catch total num of agent3: 23
team_policy eval team catch total num: 33
team_policy eval average step individual rewards of agent4: 1.0657543862186576
team_policy eval average team episode rewards of agent4: 82.5
team_policy eval idv catch total num of agent4: 44
team_policy eval team catch total num: 33
idv_policy eval average step individual rewards of agent0: 0.8682010575479066
idv_policy eval average team episode rewards of agent0: 75.0
idv_policy eval idv catch total num of agent0: 36
idv_policy eval team catch total num: 30
idv_policy eval average step individual rewards of agent1: 0.5369357916450088
idv_policy eval average team episode rewards of agent1: 75.0
idv_policy eval idv catch total num of agent1: 23
idv_policy eval team catch total num: 30
idv_policy eval average step individual rewards of agent2: 0.5343771506883281
idv_policy eval average team episode rewards of agent2: 75.0
idv_policy eval idv catch total num of agent2: 23
idv_policy eval team catch total num: 30
idv_policy eval average step individual rewards of agent3: 0.5154932351040444
idv_policy eval average team episode rewards of agent3: 75.0
idv_policy eval idv catch total num of agent3: 22
idv_policy eval team catch total num: 30
idv_policy eval average step individual rewards of agent4: 0.2528845716517163
idv_policy eval average team episode rewards of agent4: 75.0
idv_policy eval idv catch total num of agent4: 12
idv_policy eval team catch total num: 30

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1101/10000 episodes, total num timesteps 220400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1102/10000 episodes, total num timesteps 220600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1103/10000 episodes, total num timesteps 220800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1104/10000 episodes, total num timesteps 221000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1105/10000 episodes, total num timesteps 221200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1106/10000 episodes, total num timesteps 221400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1107/10000 episodes, total num timesteps 221600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1108/10000 episodes, total num timesteps 221800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1109/10000 episodes, total num timesteps 222000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1110/10000 episodes, total num timesteps 222200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1111/10000 episodes, total num timesteps 222400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1112/10000 episodes, total num timesteps 222600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1113/10000 episodes, total num timesteps 222800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1114/10000 episodes, total num timesteps 223000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1115/10000 episodes, total num timesteps 223200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1116/10000 episodes, total num timesteps 223400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1117/10000 episodes, total num timesteps 223600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1118/10000 episodes, total num timesteps 223800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1119/10000 episodes, total num timesteps 224000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1120/10000 episodes, total num timesteps 224200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1121/10000 episodes, total num timesteps 224400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1122/10000 episodes, total num timesteps 224600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1123/10000 episodes, total num timesteps 224800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1124/10000 episodes, total num timesteps 225000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1125/10000 episodes, total num timesteps 225200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.5354667905761263
team_policy eval average team episode rewards of agent0: 65.0
team_policy eval idv catch total num of agent0: 23
team_policy eval team catch total num: 26
team_policy eval average step individual rewards of agent1: 0.38343867529737297
team_policy eval average team episode rewards of agent1: 65.0
team_policy eval idv catch total num of agent1: 17
team_policy eval team catch total num: 26
team_policy eval average step individual rewards of agent2: 0.736597121002664
team_policy eval average team episode rewards of agent2: 65.0
team_policy eval idv catch total num of agent2: 31
team_policy eval team catch total num: 26
team_policy eval average step individual rewards of agent3: 0.42814390830740906
team_policy eval average team episode rewards of agent3: 65.0
team_policy eval idv catch total num of agent3: 19
team_policy eval team catch total num: 26
team_policy eval average step individual rewards of agent4: 0.487375310457325
team_policy eval average team episode rewards of agent4: 65.0
team_policy eval idv catch total num of agent4: 21
team_policy eval team catch total num: 26
idv_policy eval average step individual rewards of agent0: 0.5909142777011854
idv_policy eval average team episode rewards of agent0: 90.0
idv_policy eval idv catch total num of agent0: 25
idv_policy eval team catch total num: 36
idv_policy eval average step individual rewards of agent1: 0.3862807999066549
idv_policy eval average team episode rewards of agent1: 90.0
idv_policy eval idv catch total num of agent1: 17
idv_policy eval team catch total num: 36
idv_policy eval average step individual rewards of agent2: 0.8202690480475305
idv_policy eval average team episode rewards of agent2: 90.0
idv_policy eval idv catch total num of agent2: 34
idv_policy eval team catch total num: 36
idv_policy eval average step individual rewards of agent3: 0.4381087627578834
idv_policy eval average team episode rewards of agent3: 90.0
idv_policy eval idv catch total num of agent3: 19
idv_policy eval team catch total num: 36
idv_policy eval average step individual rewards of agent4: 0.6658380122979679
idv_policy eval average team episode rewards of agent4: 90.0
idv_policy eval idv catch total num of agent4: 28
idv_policy eval team catch total num: 36

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1126/10000 episodes, total num timesteps 225400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1127/10000 episodes, total num timesteps 225600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1128/10000 episodes, total num timesteps 225800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1129/10000 episodes, total num timesteps 226000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1130/10000 episodes, total num timesteps 226200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1131/10000 episodes, total num timesteps 226400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1132/10000 episodes, total num timesteps 226600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1133/10000 episodes, total num timesteps 226800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1134/10000 episodes, total num timesteps 227000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1135/10000 episodes, total num timesteps 227200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1136/10000 episodes, total num timesteps 227400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1137/10000 episodes, total num timesteps 227600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1138/10000 episodes, total num timesteps 227800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1139/10000 episodes, total num timesteps 228000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1140/10000 episodes, total num timesteps 228200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1141/10000 episodes, total num timesteps 228400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1142/10000 episodes, total num timesteps 228600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1143/10000 episodes, total num timesteps 228800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1144/10000 episodes, total num timesteps 229000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1145/10000 episodes, total num timesteps 229200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1146/10000 episodes, total num timesteps 229400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1147/10000 episodes, total num timesteps 229600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1148/10000 episodes, total num timesteps 229800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1149/10000 episodes, total num timesteps 230000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1150/10000 episodes, total num timesteps 230200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.6367472578995604
team_policy eval average team episode rewards of agent0: 75.0
team_policy eval idv catch total num of agent0: 27
team_policy eval team catch total num: 30
team_policy eval average step individual rewards of agent1: 0.22775023677909906
team_policy eval average team episode rewards of agent1: 75.0
team_policy eval idv catch total num of agent1: 11
team_policy eval team catch total num: 30
team_policy eval average step individual rewards of agent2: 0.3863707098247737
team_policy eval average team episode rewards of agent2: 75.0
team_policy eval idv catch total num of agent2: 17
team_policy eval team catch total num: 30
team_policy eval average step individual rewards of agent3: 0.45994158524826084
team_policy eval average team episode rewards of agent3: 75.0
team_policy eval idv catch total num of agent3: 20
team_policy eval team catch total num: 30
team_policy eval average step individual rewards of agent4: 0.7886459357763346
team_policy eval average team episode rewards of agent4: 75.0
team_policy eval idv catch total num of agent4: 33
team_policy eval team catch total num: 30
idv_policy eval average step individual rewards of agent0: 0.3657896588695908
idv_policy eval average team episode rewards of agent0: 65.0
idv_policy eval idv catch total num of agent0: 16
idv_policy eval team catch total num: 26
idv_policy eval average step individual rewards of agent1: 0.7348673108987572
idv_policy eval average team episode rewards of agent1: 65.0
idv_policy eval idv catch total num of agent1: 31
idv_policy eval team catch total num: 26
idv_policy eval average step individual rewards of agent2: 0.48807842296843856
idv_policy eval average team episode rewards of agent2: 65.0
idv_policy eval idv catch total num of agent2: 21
idv_policy eval team catch total num: 26
idv_policy eval average step individual rewards of agent3: 0.42843740542235503
idv_policy eval average team episode rewards of agent3: 65.0
idv_policy eval idv catch total num of agent3: 19
idv_policy eval team catch total num: 26
idv_policy eval average step individual rewards of agent4: 0.6338722299786606
idv_policy eval average team episode rewards of agent4: 65.0
idv_policy eval idv catch total num of agent4: 27
idv_policy eval team catch total num: 26

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1151/10000 episodes, total num timesteps 230400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1152/10000 episodes, total num timesteps 230600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1153/10000 episodes, total num timesteps 230800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1154/10000 episodes, total num timesteps 231000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1155/10000 episodes, total num timesteps 231200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1156/10000 episodes, total num timesteps 231400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1157/10000 episodes, total num timesteps 231600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1158/10000 episodes, total num timesteps 231800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1159/10000 episodes, total num timesteps 232000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1160/10000 episodes, total num timesteps 232200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1161/10000 episodes, total num timesteps 232400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1162/10000 episodes, total num timesteps 232600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1163/10000 episodes, total num timesteps 232800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1164/10000 episodes, total num timesteps 233000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1165/10000 episodes, total num timesteps 233200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1166/10000 episodes, total num timesteps 233400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1167/10000 episodes, total num timesteps 233600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1168/10000 episodes, total num timesteps 233800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1169/10000 episodes, total num timesteps 234000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1170/10000 episodes, total num timesteps 234200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1171/10000 episodes, total num timesteps 234400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1172/10000 episodes, total num timesteps 234600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1173/10000 episodes, total num timesteps 234800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1174/10000 episodes, total num timesteps 235000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1175/10000 episodes, total num timesteps 235200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.7898100284412907
team_policy eval average team episode rewards of agent0: 102.5
team_policy eval idv catch total num of agent0: 33
team_policy eval team catch total num: 41
team_policy eval average step individual rewards of agent1: 0.41516572078849945
team_policy eval average team episode rewards of agent1: 102.5
team_policy eval idv catch total num of agent1: 18
team_policy eval team catch total num: 41
team_policy eval average step individual rewards of agent2: 0.820350010149911
team_policy eval average team episode rewards of agent2: 102.5
team_policy eval idv catch total num of agent2: 34
team_policy eval team catch total num: 41
team_policy eval average step individual rewards of agent3: 0.8217872482333642
team_policy eval average team episode rewards of agent3: 102.5
team_policy eval idv catch total num of agent3: 34
team_policy eval team catch total num: 41
team_policy eval average step individual rewards of agent4: 0.5082918736918703
team_policy eval average team episode rewards of agent4: 102.5
team_policy eval idv catch total num of agent4: 22
team_policy eval team catch total num: 41
idv_policy eval average step individual rewards of agent0: 0.5326992358767093
idv_policy eval average team episode rewards of agent0: 60.0
idv_policy eval idv catch total num of agent0: 23
idv_policy eval team catch total num: 24
idv_policy eval average step individual rewards of agent1: 0.412909423385519
idv_policy eval average team episode rewards of agent1: 60.0
idv_policy eval idv catch total num of agent1: 18
idv_policy eval team catch total num: 24
idv_policy eval average step individual rewards of agent2: 0.7215677619916306
idv_policy eval average team episode rewards of agent2: 60.0
idv_policy eval idv catch total num of agent2: 30
idv_policy eval team catch total num: 24
idv_policy eval average step individual rewards of agent3: 0.5078020667063794
idv_policy eval average team episode rewards of agent3: 60.0
idv_policy eval idv catch total num of agent3: 22
idv_policy eval team catch total num: 24
idv_policy eval average step individual rewards of agent4: 0.35019881918032786
idv_policy eval average team episode rewards of agent4: 60.0
idv_policy eval idv catch total num of agent4: 16
idv_policy eval team catch total num: 24

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1176/10000 episodes, total num timesteps 235400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1177/10000 episodes, total num timesteps 235600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1178/10000 episodes, total num timesteps 235800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1179/10000 episodes, total num timesteps 236000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1180/10000 episodes, total num timesteps 236200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1181/10000 episodes, total num timesteps 236400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1182/10000 episodes, total num timesteps 236600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1183/10000 episodes, total num timesteps 236800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1184/10000 episodes, total num timesteps 237000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1185/10000 episodes, total num timesteps 237200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1186/10000 episodes, total num timesteps 237400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1187/10000 episodes, total num timesteps 237600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1188/10000 episodes, total num timesteps 237800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1189/10000 episodes, total num timesteps 238000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1190/10000 episodes, total num timesteps 238200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1191/10000 episodes, total num timesteps 238400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1192/10000 episodes, total num timesteps 238600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1193/10000 episodes, total num timesteps 238800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1194/10000 episodes, total num timesteps 239000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1195/10000 episodes, total num timesteps 239200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1196/10000 episodes, total num timesteps 239400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1197/10000 episodes, total num timesteps 239600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1198/10000 episodes, total num timesteps 239800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1199/10000 episodes, total num timesteps 240000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1200/10000 episodes, total num timesteps 240200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.6269291315546327
team_policy eval average team episode rewards of agent0: 97.5
team_policy eval idv catch total num of agent0: 27
team_policy eval team catch total num: 39
team_policy eval average step individual rewards of agent1: 0.6080322297003358
team_policy eval average team episode rewards of agent1: 97.5
team_policy eval idv catch total num of agent1: 26
team_policy eval team catch total num: 39
team_policy eval average step individual rewards of agent2: 0.6382866547353515
team_policy eval average team episode rewards of agent2: 97.5
team_policy eval idv catch total num of agent2: 27
team_policy eval team catch total num: 39
team_policy eval average step individual rewards of agent3: 0.7663408704746503
team_policy eval average team episode rewards of agent3: 97.5
team_policy eval idv catch total num of agent3: 32
team_policy eval team catch total num: 39
team_policy eval average step individual rewards of agent4: 0.6848526726576872
team_policy eval average team episode rewards of agent4: 97.5
team_policy eval idv catch total num of agent4: 29
team_policy eval team catch total num: 39
idv_policy eval average step individual rewards of agent0: 0.38266837634890905
idv_policy eval average team episode rewards of agent0: 77.5
idv_policy eval idv catch total num of agent0: 17
idv_policy eval team catch total num: 31
idv_policy eval average step individual rewards of agent1: 0.5299519427263136
idv_policy eval average team episode rewards of agent1: 77.5
idv_policy eval idv catch total num of agent1: 23
idv_policy eval team catch total num: 31
idv_policy eval average step individual rewards of agent2: 0.5033619098040552
idv_policy eval average team episode rewards of agent2: 77.5
idv_policy eval idv catch total num of agent2: 22
idv_policy eval team catch total num: 31
idv_policy eval average step individual rewards of agent3: 0.588232414668556
idv_policy eval average team episode rewards of agent3: 77.5
idv_policy eval idv catch total num of agent3: 25
idv_policy eval team catch total num: 31
idv_policy eval average step individual rewards of agent4: 0.7432142272482831
idv_policy eval average team episode rewards of agent4: 77.5
idv_policy eval idv catch total num of agent4: 31
idv_policy eval team catch total num: 31

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1201/10000 episodes, total num timesteps 240400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1202/10000 episodes, total num timesteps 240600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1203/10000 episodes, total num timesteps 240800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1204/10000 episodes, total num timesteps 241000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1205/10000 episodes, total num timesteps 241200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1206/10000 episodes, total num timesteps 241400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1207/10000 episodes, total num timesteps 241600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1208/10000 episodes, total num timesteps 241800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1209/10000 episodes, total num timesteps 242000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1210/10000 episodes, total num timesteps 242200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1211/10000 episodes, total num timesteps 242400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1212/10000 episodes, total num timesteps 242600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1213/10000 episodes, total num timesteps 242800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1214/10000 episodes, total num timesteps 243000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1215/10000 episodes, total num timesteps 243200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1216/10000 episodes, total num timesteps 243400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1217/10000 episodes, total num timesteps 243600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1218/10000 episodes, total num timesteps 243800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1219/10000 episodes, total num timesteps 244000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1220/10000 episodes, total num timesteps 244200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1221/10000 episodes, total num timesteps 244400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1222/10000 episodes, total num timesteps 244600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1223/10000 episodes, total num timesteps 244800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1224/10000 episodes, total num timesteps 245000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1225/10000 episodes, total num timesteps 245200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.48268959847113224
team_policy eval average team episode rewards of agent0: 87.5
team_policy eval idv catch total num of agent0: 21
team_policy eval team catch total num: 35
team_policy eval average step individual rewards of agent1: 0.5156842738191356
team_policy eval average team episode rewards of agent1: 87.5
team_policy eval idv catch total num of agent1: 22
team_policy eval team catch total num: 35
team_policy eval average step individual rewards of agent2: 0.5696305756507202
team_policy eval average team episode rewards of agent2: 87.5
team_policy eval idv catch total num of agent2: 24
team_policy eval team catch total num: 35
team_policy eval average step individual rewards of agent3: 0.5369059071936056
team_policy eval average team episode rewards of agent3: 87.5
team_policy eval idv catch total num of agent3: 23
team_policy eval team catch total num: 35
team_policy eval average step individual rewards of agent4: 0.7935405352552368
team_policy eval average team episode rewards of agent4: 87.5
team_policy eval idv catch total num of agent4: 33
team_policy eval team catch total num: 35
idv_policy eval average step individual rewards of agent0: 0.25900931031450797
idv_policy eval average team episode rewards of agent0: 42.5
idv_policy eval idv catch total num of agent0: 12
idv_policy eval team catch total num: 17
idv_policy eval average step individual rewards of agent1: 0.1553189351837955
idv_policy eval average team episode rewards of agent1: 42.5
idv_policy eval idv catch total num of agent1: 8
idv_policy eval team catch total num: 17
idv_policy eval average step individual rewards of agent2: 0.5676602438278765
idv_policy eval average team episode rewards of agent2: 42.5
idv_policy eval idv catch total num of agent2: 24
idv_policy eval team catch total num: 17
idv_policy eval average step individual rewards of agent3: 0.3179367162212579
idv_policy eval average team episode rewards of agent3: 42.5
idv_policy eval idv catch total num of agent3: 14
idv_policy eval team catch total num: 17
idv_policy eval average step individual rewards of agent4: 0.13412051420358545
idv_policy eval average team episode rewards of agent4: 42.5
idv_policy eval idv catch total num of agent4: 7
idv_policy eval team catch total num: 17

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1226/10000 episodes, total num timesteps 245400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1227/10000 episodes, total num timesteps 245600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1228/10000 episodes, total num timesteps 245800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1229/10000 episodes, total num timesteps 246000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1230/10000 episodes, total num timesteps 246200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1231/10000 episodes, total num timesteps 246400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1232/10000 episodes, total num timesteps 246600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1233/10000 episodes, total num timesteps 246800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1234/10000 episodes, total num timesteps 247000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1235/10000 episodes, total num timesteps 247200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1236/10000 episodes, total num timesteps 247400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1237/10000 episodes, total num timesteps 247600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1238/10000 episodes, total num timesteps 247800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1239/10000 episodes, total num timesteps 248000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1240/10000 episodes, total num timesteps 248200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1241/10000 episodes, total num timesteps 248400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1242/10000 episodes, total num timesteps 248600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1243/10000 episodes, total num timesteps 248800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1244/10000 episodes, total num timesteps 249000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1245/10000 episodes, total num timesteps 249200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1246/10000 episodes, total num timesteps 249400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1247/10000 episodes, total num timesteps 249600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1248/10000 episodes, total num timesteps 249800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1249/10000 episodes, total num timesteps 250000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1250/10000 episodes, total num timesteps 250200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.8729738261637519
team_policy eval average team episode rewards of agent0: 102.5
team_policy eval idv catch total num of agent0: 36
team_policy eval team catch total num: 41
team_policy eval average step individual rewards of agent1: 0.8599059736870697
team_policy eval average team episode rewards of agent1: 102.5
team_policy eval idv catch total num of agent1: 36
team_policy eval team catch total num: 41
team_policy eval average step individual rewards of agent2: 0.6570033805850996
team_policy eval average team episode rewards of agent2: 102.5
team_policy eval idv catch total num of agent2: 28
team_policy eval team catch total num: 41
team_policy eval average step individual rewards of agent3: 0.6562704060342038
team_policy eval average team episode rewards of agent3: 102.5
team_policy eval idv catch total num of agent3: 28
team_policy eval team catch total num: 41
team_policy eval average step individual rewards of agent4: 1.0201255374477913
team_policy eval average team episode rewards of agent4: 102.5
team_policy eval idv catch total num of agent4: 42
team_policy eval team catch total num: 41
idv_policy eval average step individual rewards of agent0: 1.0240067252230676
idv_policy eval average team episode rewards of agent0: 127.5
idv_policy eval idv catch total num of agent0: 42
idv_policy eval team catch total num: 51
idv_policy eval average step individual rewards of agent1: 0.9154898637417318
idv_policy eval average team episode rewards of agent1: 127.5
idv_policy eval idv catch total num of agent1: 38
idv_policy eval team catch total num: 51
idv_policy eval average step individual rewards of agent2: 0.8964061913398109
idv_policy eval average team episode rewards of agent2: 127.5
idv_policy eval idv catch total num of agent2: 37
idv_policy eval team catch total num: 51
idv_policy eval average step individual rewards of agent3: 0.9616810908316029
idv_policy eval average team episode rewards of agent3: 127.5
idv_policy eval idv catch total num of agent3: 40
idv_policy eval team catch total num: 51
idv_policy eval average step individual rewards of agent4: 0.7405179708977411
idv_policy eval average team episode rewards of agent4: 127.5
idv_policy eval idv catch total num of agent4: 31
idv_policy eval team catch total num: 51

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1251/10000 episodes, total num timesteps 250400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1252/10000 episodes, total num timesteps 250600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1253/10000 episodes, total num timesteps 250800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1254/10000 episodes, total num timesteps 251000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1255/10000 episodes, total num timesteps 251200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1256/10000 episodes, total num timesteps 251400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1257/10000 episodes, total num timesteps 251600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1258/10000 episodes, total num timesteps 251800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1259/10000 episodes, total num timesteps 252000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1260/10000 episodes, total num timesteps 252200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1261/10000 episodes, total num timesteps 252400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1262/10000 episodes, total num timesteps 252600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1263/10000 episodes, total num timesteps 252800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1264/10000 episodes, total num timesteps 253000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1265/10000 episodes, total num timesteps 253200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1266/10000 episodes, total num timesteps 253400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1267/10000 episodes, total num timesteps 253600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1268/10000 episodes, total num timesteps 253800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1269/10000 episodes, total num timesteps 254000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1270/10000 episodes, total num timesteps 254200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1271/10000 episodes, total num timesteps 254400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1272/10000 episodes, total num timesteps 254600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1273/10000 episodes, total num timesteps 254800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1274/10000 episodes, total num timesteps 255000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1275/10000 episodes, total num timesteps 255200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.6115667789791496
team_policy eval average team episode rewards of agent0: 57.5
team_policy eval idv catch total num of agent0: 26
team_policy eval team catch total num: 23
team_policy eval average step individual rewards of agent1: 0.4896206959216145
team_policy eval average team episode rewards of agent1: 57.5
team_policy eval idv catch total num of agent1: 21
team_policy eval team catch total num: 23
team_policy eval average step individual rewards of agent2: 0.33039071418138916
team_policy eval average team episode rewards of agent2: 57.5
team_policy eval idv catch total num of agent2: 15
team_policy eval team catch total num: 23
team_policy eval average step individual rewards of agent3: 0.6755966710494602
team_policy eval average team episode rewards of agent3: 57.5
team_policy eval idv catch total num of agent3: 29
team_policy eval team catch total num: 23
team_policy eval average step individual rewards of agent4: 0.22892847239033587
team_policy eval average team episode rewards of agent4: 57.5
team_policy eval idv catch total num of agent4: 11
team_policy eval team catch total num: 23
idv_policy eval average step individual rewards of agent0: 0.5891926584627192
idv_policy eval average team episode rewards of agent0: 85.0
idv_policy eval idv catch total num of agent0: 25
idv_policy eval team catch total num: 34
idv_policy eval average step individual rewards of agent1: 0.5073997161556432
idv_policy eval average team episode rewards of agent1: 85.0
idv_policy eval idv catch total num of agent1: 22
idv_policy eval team catch total num: 34
idv_policy eval average step individual rewards of agent2: 0.3802748386469697
idv_policy eval average team episode rewards of agent2: 85.0
idv_policy eval idv catch total num of agent2: 17
idv_policy eval team catch total num: 34
idv_policy eval average step individual rewards of agent3: 0.5428961268038169
idv_policy eval average team episode rewards of agent3: 85.0
idv_policy eval idv catch total num of agent3: 23
idv_policy eval team catch total num: 34
idv_policy eval average step individual rewards of agent4: 0.5880538321355876
idv_policy eval average team episode rewards of agent4: 85.0
idv_policy eval idv catch total num of agent4: 25
idv_policy eval team catch total num: 34

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1276/10000 episodes, total num timesteps 255400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1277/10000 episodes, total num timesteps 255600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1278/10000 episodes, total num timesteps 255800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1279/10000 episodes, total num timesteps 256000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1280/10000 episodes, total num timesteps 256200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1281/10000 episodes, total num timesteps 256400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1282/10000 episodes, total num timesteps 256600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1283/10000 episodes, total num timesteps 256800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1284/10000 episodes, total num timesteps 257000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1285/10000 episodes, total num timesteps 257200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1286/10000 episodes, total num timesteps 257400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kl_s2r2_v1 updates 1287/10000 episodes, total num timesteps 257600/2000000, FPS 178.

