wandb: Currently logged in as: 804703098. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.16.5
wandb: Run data is saved locally in /home/user/zhangyang/PycharmProjects/Nips2024-ITPC-v2/Nips2024-ITPC-v2/onpolicy/scripts/results/MPE/simple_tag_tr/rmappotrsyn/exp_train_continue_tag_base_kltcp_s2r2_v1/wandb/run-20240402_144955-5mugd24s
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run MPE_1
wandb: ⭐️ View project at https://wandb.ai/804703098/Continue_Tag_Base_v1
wandb: 🚀 View run at https://wandb.ai/804703098/Continue_Tag_Base_v1/runs/5mugd24s/workspace
choose to use gpu...
idv policy and team policy use the same initial params!

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 0/10000 episodes, total num timesteps 200/2000000, FPS 139.

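The progress lines above follow a fixed format (`updates X/Y episodes, total num timesteps A/B, FPS F.`). A minimal sketch for pulling those five numbers out of each line with a regex, assuming the log has been saved to a text file; the function name `parse_progress` is illustrative:

```python
import re

# Matches the trailing portion of lines like:
#  Scenario simple_tag_tr Algo rmappotrsyn Exp ... updates 0/10000 episodes,
#  total num timesteps 200/2000000, FPS 139.
PROGRESS_RE = re.compile(
    r"updates (\d+)/(\d+) episodes, "
    r"total num timesteps (\d+)/(\d+), FPS (\d+)\."
)

def parse_progress(line):
    """Return (update, total_updates, timestep, total_timesteps, fps), or None
    if the line is not a progress line."""
    m = PROGRESS_RE.search(line)
    if m is None:
        return None
    return tuple(int(g) for g in m.groups())

line = (" Scenario simple_tag_tr Algo rmappotrsyn Exp exp_v1 "
        "updates 0/10000 episodes, total num timesteps 200/2000000, FPS 139.")
print(parse_progress(line))  # (0, 10000, 200, 2000000, 139)
```

Filtering a whole log file through this function yields a timestep-vs-FPS series suitable for plotting throughput over the run.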
team_policy eval average step individual rewards of agent0: 0.038377836753397725
team_policy eval average team episode rewards of agent0: 0.0
team_policy eval idv catch total num of agent0: 4
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent1: 0.10960849088008637
team_policy eval average team episode rewards of agent1: 0.0
team_policy eval idv catch total num of agent1: 8
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent2: 0.012806603839515929
team_policy eval average team episode rewards of agent2: 0.0
team_policy eval idv catch total num of agent2: 4
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent3: -0.029917883747848695
team_policy eval average team episode rewards of agent3: 0.0
team_policy eval idv catch total num of agent3: 2
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent4: 0.025933985314769235
team_policy eval average team episode rewards of agent4: 0.0
team_policy eval idv catch total num of agent4: 4
team_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent0: -0.11332629381743445
idv_policy eval average team episode rewards of agent0: 0.0
idv_policy eval idv catch total num of agent0: 0
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent1: -0.081715955864065
idv_policy eval average team episode rewards of agent1: 0.0
idv_policy eval idv catch total num of agent1: 0
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent2: 0.018157102651989002
idv_policy eval average team episode rewards of agent2: 0.0
idv_policy eval idv catch total num of agent2: 5
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent3: -0.05633408994170927
idv_policy eval average team episode rewards of agent3: 0.0
idv_policy eval idv catch total num of agent3: 1
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent4: -0.048461036855224994
idv_policy eval average team episode rewards of agent4: 0.0
idv_policy eval idv catch total num of agent4: 1
idv_policy eval team catch total num: 0
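Each evaluation dump interleaves per-agent metrics for the two policies (`team_policy` and `idv_policy`). A minimal sketch that collects the per-agent step rewards and individual catch counts into nested dicts; the field names `step_reward` and `idv_catch` are made up for illustration:

```python
import re
from collections import defaultdict

REWARD_RE = re.compile(
    r"(team_policy|idv_policy) eval average step individual rewards "
    r"of agent(\d+): (-?\d+\.?\d*)"
)
CATCH_RE = re.compile(
    r"(team_policy|idv_policy) eval idv catch total num of agent(\d+): (\d+)"
)

def parse_eval(lines):
    """Return {policy: {agent_id: {'step_reward': float, 'idv_catch': int}}}."""
    out = defaultdict(lambda: defaultdict(dict))
    for line in lines:
        m = REWARD_RE.search(line)
        if m:
            out[m.group(1)][int(m.group(2))]["step_reward"] = float(m.group(3))
            continue
        m = CATCH_RE.search(line)
        if m:
            out[m.group(1)][int(m.group(2))]["idv_catch"] = int(m.group(3))
    return out

sample = [
    "team_policy eval average step individual rewards of agent0: 0.0383",
    "team_policy eval idv catch total num of agent0: 4",
]
stats = parse_eval(sample)
print(stats["team_policy"][0])  # {'step_reward': 0.0383, 'idv_catch': 4}
```

Comparing `stats["team_policy"]` against `stats["idv_policy"]` across successive eval blocks is one way to track how the two policies diverge as training progresses.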

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1/10000 episodes, total num timesteps 400/2000000, FPS 158.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 2/10000 episodes, total num timesteps 600/2000000, FPS 165.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 3/10000 episodes, total num timesteps 800/2000000, FPS 167.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 4/10000 episodes, total num timesteps 1000/2000000, FPS 171.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 5/10000 episodes, total num timesteps 1200/2000000, FPS 173.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 6/10000 episodes, total num timesteps 1400/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 7/10000 episodes, total num timesteps 1600/2000000, FPS 173.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 8/10000 episodes, total num timesteps 1800/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 9/10000 episodes, total num timesteps 2000/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 10/10000 episodes, total num timesteps 2200/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 11/10000 episodes, total num timesteps 2400/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 12/10000 episodes, total num timesteps 2600/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 13/10000 episodes, total num timesteps 2800/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 14/10000 episodes, total num timesteps 3000/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 15/10000 episodes, total num timesteps 3200/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 16/10000 episodes, total num timesteps 3400/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 17/10000 episodes, total num timesteps 3600/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 18/10000 episodes, total num timesteps 3800/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 19/10000 episodes, total num timesteps 4000/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 20/10000 episodes, total num timesteps 4200/2000000, FPS 173.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 21/10000 episodes, total num timesteps 4400/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 22/10000 episodes, total num timesteps 4600/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 23/10000 episodes, total num timesteps 4800/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 24/10000 episodes, total num timesteps 5000/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 25/10000 episodes, total num timesteps 5200/2000000, FPS 175.

team_policy eval average step individual rewards of agent0: -0.10891211825605548
team_policy eval average team episode rewards of agent0: 0.0
team_policy eval idv catch total num of agent0: 0
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent1: -0.03067182618772465
team_policy eval average team episode rewards of agent1: 0.0
team_policy eval idv catch total num of agent1: 2
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent2: -0.06936832847742033
team_policy eval average team episode rewards of agent2: 0.0
team_policy eval idv catch total num of agent2: 1
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent3: -0.01961131915810772
team_policy eval average team episode rewards of agent3: 0.0
team_policy eval idv catch total num of agent3: 3
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent4: -0.09417821260826947
team_policy eval average team episode rewards of agent4: 0.0
team_policy eval idv catch total num of agent4: 0
team_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent0: -0.14912658577881646
idv_policy eval average team episode rewards of agent0: 0.0
idv_policy eval idv catch total num of agent0: 0
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent1: -0.04134918557935771
idv_policy eval average team episode rewards of agent1: 0.0
idv_policy eval idv catch total num of agent1: 4
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent2: -0.13971515057074954
idv_policy eval average team episode rewards of agent2: 0.0
idv_policy eval idv catch total num of agent2: 0
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent3: -0.09061427687648449
idv_policy eval average team episode rewards of agent3: 0.0
idv_policy eval idv catch total num of agent3: 1
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent4: -0.12372472052128711
idv_policy eval average team episode rewards of agent4: 0.0
idv_policy eval idv catch total num of agent4: 0
idv_policy eval team catch total num: 0

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 26/10000 episodes, total num timesteps 5400/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 27/10000 episodes, total num timesteps 5600/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 28/10000 episodes, total num timesteps 5800/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 29/10000 episodes, total num timesteps 6000/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 30/10000 episodes, total num timesteps 6200/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 31/10000 episodes, total num timesteps 6400/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 32/10000 episodes, total num timesteps 6600/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 33/10000 episodes, total num timesteps 6800/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 34/10000 episodes, total num timesteps 7000/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 35/10000 episodes, total num timesteps 7200/2000000, FPS 174.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 36/10000 episodes, total num timesteps 7400/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 37/10000 episodes, total num timesteps 7600/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 38/10000 episodes, total num timesteps 7800/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 39/10000 episodes, total num timesteps 8000/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 40/10000 episodes, total num timesteps 8200/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 41/10000 episodes, total num timesteps 8400/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 42/10000 episodes, total num timesteps 8600/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 43/10000 episodes, total num timesteps 8800/2000000, FPS 176.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 44/10000 episodes, total num timesteps 9000/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 45/10000 episodes, total num timesteps 9200/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 46/10000 episodes, total num timesteps 9400/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 47/10000 episodes, total num timesteps 9600/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 48/10000 episodes, total num timesteps 9800/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 49/10000 episodes, total num timesteps 10000/2000000, FPS 175.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 50/10000 episodes, total num timesteps 10200/2000000, FPS 176.

team_policy eval average step individual rewards of agent0: -0.10908934939015183
team_policy eval average team episode rewards of agent0: 0.0
team_policy eval idv catch total num of agent0: 0
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent1: -0.13992628591351597
team_policy eval average team episode rewards of agent1: 0.0
team_policy eval idv catch total num of agent1: 0
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent2: -0.126335269078398
team_policy eval average team episode rewards of agent2: 0.0
team_policy eval idv catch total num of agent2: 0
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent3: -0.12507749272373844
team_policy eval average team episode rewards of agent3: 0.0
team_policy eval idv catch total num of agent3: 0
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent4: -0.05785800751742246
team_policy eval average team episode rewards of agent4: 0.0
team_policy eval idv catch total num of agent4: 3
team_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent0: -0.06305001188450202
idv_policy eval average team episode rewards of agent0: 0.0
idv_policy eval idv catch total num of agent0: 3
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent1: -0.09640515463458495
idv_policy eval average team episode rewards of agent1: 0.0
idv_policy eval idv catch total num of agent1: 1
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent2: -0.15891350853393513
idv_policy eval average team episode rewards of agent2: 0.0
idv_policy eval idv catch total num of agent2: 0
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent3: -0.13578566739277317
idv_policy eval average team episode rewards of agent3: 0.0
idv_policy eval idv catch total num of agent3: 0
idv_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent4: -0.12983749592750285
idv_policy eval average team episode rewards of agent4: 0.0
idv_policy eval idv catch total num of agent4: 1
idv_policy eval team catch total num: 0

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 51/10000 episodes, total num timesteps 10400/2000000, FPS 176.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 52/10000 episodes, total num timesteps 10600/2000000, FPS 176.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 53/10000 episodes, total num timesteps 10800/2000000, FPS 176.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 54/10000 episodes, total num timesteps 11000/2000000, FPS 176.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 55/10000 episodes, total num timesteps 11200/2000000, FPS 176.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 56/10000 episodes, total num timesteps 11400/2000000, FPS 176.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 57/10000 episodes, total num timesteps 11600/2000000, FPS 176.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 58/10000 episodes, total num timesteps 11800/2000000, FPS 176.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 59/10000 episodes, total num timesteps 12000/2000000, FPS 176.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 60/10000 episodes, total num timesteps 12200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 61/10000 episodes, total num timesteps 12400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 62/10000 episodes, total num timesteps 12600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 63/10000 episodes, total num timesteps 12800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 64/10000 episodes, total num timesteps 13000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 65/10000 episodes, total num timesteps 13200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 66/10000 episodes, total num timesteps 13400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 67/10000 episodes, total num timesteps 13600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 68/10000 episodes, total num timesteps 13800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 69/10000 episodes, total num timesteps 14000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 70/10000 episodes, total num timesteps 14200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 71/10000 episodes, total num timesteps 14400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 72/10000 episodes, total num timesteps 14600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 73/10000 episodes, total num timesteps 14800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 74/10000 episodes, total num timesteps 15000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 75/10000 episodes, total num timesteps 15200/2000000, FPS 177.

team_policy eval average step individual rewards of agent0: -0.06272872590450493
team_policy eval average team episode rewards of agent0: 0.0
team_policy eval idv catch total num of agent0: 3
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent1: -0.12347659454149763
team_policy eval average team episode rewards of agent1: 0.0
team_policy eval idv catch total num of agent1: 0
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent2: -0.08299099178467788
team_policy eval average team episode rewards of agent2: 0.0
team_policy eval idv catch total num of agent2: 2
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent3: -0.09116436134863333
team_policy eval average team episode rewards of agent3: 0.0
team_policy eval idv catch total num of agent3: 1
team_policy eval team catch total num: 0
team_policy eval average step individual rewards of agent4: -0.03581903199390923
team_policy eval average team episode rewards of agent4: 0.0
team_policy eval idv catch total num of agent4: 3
team_policy eval team catch total num: 0
idv_policy eval average step individual rewards of agent0: -0.1187500614384317
idv_policy eval average team episode rewards of agent0: 2.5
idv_policy eval idv catch total num of agent0: 0
idv_policy eval team catch total num: 1
idv_policy eval average step individual rewards of agent1: 0.09251982789218552
idv_policy eval average team episode rewards of agent1: 2.5
idv_policy eval idv catch total num of agent1: 8
idv_policy eval team catch total num: 1
idv_policy eval average step individual rewards of agent2: -0.05801297634469802
idv_policy eval average team episode rewards of agent2: 2.5
idv_policy eval idv catch total num of agent2: 2
idv_policy eval team catch total num: 1
idv_policy eval average step individual rewards of agent3: -0.06664974730289507
idv_policy eval average team episode rewards of agent3: 2.5
idv_policy eval idv catch total num of agent3: 2
idv_policy eval team catch total num: 1
idv_policy eval average step individual rewards of agent4: -0.019383448710535135
idv_policy eval average team episode rewards of agent4: 2.5
idv_policy eval idv catch total num of agent4: 4
idv_policy eval team catch total num: 1

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 76/10000 episodes, total num timesteps 15400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 77/10000 episodes, total num timesteps 15600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 78/10000 episodes, total num timesteps 15800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 79/10000 episodes, total num timesteps 16000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 80/10000 episodes, total num timesteps 16200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 81/10000 episodes, total num timesteps 16400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 82/10000 episodes, total num timesteps 16600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 83/10000 episodes, total num timesteps 16800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 84/10000 episodes, total num timesteps 17000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 85/10000 episodes, total num timesteps 17200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 86/10000 episodes, total num timesteps 17400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 87/10000 episodes, total num timesteps 17600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 88/10000 episodes, total num timesteps 17800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 89/10000 episodes, total num timesteps 18000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 90/10000 episodes, total num timesteps 18200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 91/10000 episodes, total num timesteps 18400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 92/10000 episodes, total num timesteps 18600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 93/10000 episodes, total num timesteps 18800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 94/10000 episodes, total num timesteps 19000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 95/10000 episodes, total num timesteps 19200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 96/10000 episodes, total num timesteps 19400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 97/10000 episodes, total num timesteps 19600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 98/10000 episodes, total num timesteps 19800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 99/10000 episodes, total num timesteps 20000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 100/10000 episodes, total num timesteps 20200/2000000, FPS 177.

team_policy eval average step individual rewards of agent0: 0.05136515821454953
team_policy eval average team episode rewards of agent0: 7.5
team_policy eval idv catch total num of agent0: 5
team_policy eval team catch total num: 3
team_policy eval average step individual rewards of agent1: 0.0008537256897546497
team_policy eval average team episode rewards of agent1: 7.5
team_policy eval idv catch total num of agent1: 3
team_policy eval team catch total num: 3
team_policy eval average step individual rewards of agent2: 0.0835553876491753
team_policy eval average team episode rewards of agent2: 7.5
team_policy eval idv catch total num of agent2: 6
team_policy eval team catch total num: 3
team_policy eval average step individual rewards of agent3: 0.031349927120146746
team_policy eval average team episode rewards of agent3: 7.5
team_policy eval idv catch total num of agent3: 4
team_policy eval team catch total num: 3
team_policy eval average step individual rewards of agent4: 0.151473081732647
team_policy eval average team episode rewards of agent4: 7.5
team_policy eval idv catch total num of agent4: 9
team_policy eval team catch total num: 3
idv_policy eval average step individual rewards of agent0: -0.04671241981007689
idv_policy eval average team episode rewards of agent0: 7.5
idv_policy eval idv catch total num of agent0: 1
idv_policy eval team catch total num: 3
idv_policy eval average step individual rewards of agent1: -0.049339543787264475
idv_policy eval average team episode rewards of agent1: 7.5
idv_policy eval idv catch total num of agent1: 1
idv_policy eval team catch total num: 3
idv_policy eval average step individual rewards of agent2: 0.017178656485969453
idv_policy eval average team episode rewards of agent2: 7.5
idv_policy eval idv catch total num of agent2: 4
idv_policy eval team catch total num: 3
idv_policy eval average step individual rewards of agent3: -0.07445301854773227
idv_policy eval average team episode rewards of agent3: 7.5
idv_policy eval idv catch total num of agent3: 0
idv_policy eval team catch total num: 3
idv_policy eval average step individual rewards of agent4: 0.04740634136122807
idv_policy eval average team episode rewards of agent4: 7.5
idv_policy eval idv catch total num of agent4: 5
idv_policy eval team catch total num: 3

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 101/10000 episodes, total num timesteps 20400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 102/10000 episodes, total num timesteps 20600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 103/10000 episodes, total num timesteps 20800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 104/10000 episodes, total num timesteps 21000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 105/10000 episodes, total num timesteps 21200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 106/10000 episodes, total num timesteps 21400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 107/10000 episodes, total num timesteps 21600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 108/10000 episodes, total num timesteps 21800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 109/10000 episodes, total num timesteps 22000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 110/10000 episodes, total num timesteps 22200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 111/10000 episodes, total num timesteps 22400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 112/10000 episodes, total num timesteps 22600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 113/10000 episodes, total num timesteps 22800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 114/10000 episodes, total num timesteps 23000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 115/10000 episodes, total num timesteps 23200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 116/10000 episodes, total num timesteps 23400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 117/10000 episodes, total num timesteps 23600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 118/10000 episodes, total num timesteps 23800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 119/10000 episodes, total num timesteps 24000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 120/10000 episodes, total num timesteps 24200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 121/10000 episodes, total num timesteps 24400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 122/10000 episodes, total num timesteps 24600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 123/10000 episodes, total num timesteps 24800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 124/10000 episodes, total num timesteps 25000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 125/10000 episodes, total num timesteps 25200/2000000, FPS 177.

team_policy eval average step individual rewards of agent0: 0.16058097179513792
team_policy eval average team episode rewards of agent0: 20.0
team_policy eval idv catch total num of agent0: 9
team_policy eval team catch total num: 8
team_policy eval average step individual rewards of agent1: 0.16082957390692004
team_policy eval average team episode rewards of agent1: 20.0
team_policy eval idv catch total num of agent1: 9
team_policy eval team catch total num: 8
team_policy eval average step individual rewards of agent2: 0.15981069561211814
team_policy eval average team episode rewards of agent2: 20.0
team_policy eval idv catch total num of agent2: 9
team_policy eval team catch total num: 8
team_policy eval average step individual rewards of agent3: 0.05889940681067696
team_policy eval average team episode rewards of agent3: 20.0
team_policy eval idv catch total num of agent3: 5
team_policy eval team catch total num: 8
team_policy eval average step individual rewards of agent4: -0.015377718200532926
team_policy eval average team episode rewards of agent4: 20.0
team_policy eval idv catch total num of agent4: 2
team_policy eval team catch total num: 8
idv_policy eval average step individual rewards of agent0: -0.07743596809394518
idv_policy eval average team episode rewards of agent0: 5.0
idv_policy eval idv catch total num of agent0: 0
idv_policy eval team catch total num: 2
idv_policy eval average step individual rewards of agent1: -0.02525528762316894
idv_policy eval average team episode rewards of agent1: 5.0
idv_policy eval idv catch total num of agent1: 2
idv_policy eval team catch total num: 2
idv_policy eval average step individual rewards of agent2: -0.0017707668046934666
idv_policy eval average team episode rewards of agent2: 5.0
idv_policy eval idv catch total num of agent2: 3
idv_policy eval team catch total num: 2
idv_policy eval average step individual rewards of agent3: -0.022324143992966717
idv_policy eval average team episode rewards of agent3: 5.0
idv_policy eval idv catch total num of agent3: 2
idv_policy eval team catch total num: 2
idv_policy eval average step individual rewards of agent4: -0.07628422053822889
idv_policy eval average team episode rewards of agent4: 5.0
idv_policy eval idv catch total num of agent4: 0
idv_policy eval team catch total num: 2
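The per-agent eval lines above follow a fixed wording, so they can be scraped back into structured form. A minimal sketch of such a parser (the function and regex names are illustrative, not part of the training code; lines like "team catch total num" that carry no agent index are deliberately skipped here):

```python
import re

# Matches lines such as:
#   team_policy eval idv catch total num of agent0: 9
# Agent-less lines ("... team catch total num: 8") are not captured.
LINE_RE = re.compile(
    r"(?P<policy>team_policy|idv_policy) eval "
    r"(?P<metric>average step individual rewards|"
    r"average team episode rewards|idv catch total num)"
    r" of agent(?P<agent>\d+): (?P<value>-?[\d.]+)"
)

def parse_eval_block(text):
    """Collect metrics into {policy: {agent_id: {metric: value}}}."""
    out = {}
    for m in LINE_RE.finditer(text):
        agent = int(m.group("agent"))
        out.setdefault(m.group("policy"), {}) \
           .setdefault(agent, {})[m.group("metric")] = float(m.group("value"))
    return out
```

For example, feeding it the agent0 lines from the block above yields `{"team_policy": {0: {...}}}` with the reward and catch-count values as floats.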

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 126/10000 episodes, total num timesteps 25400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 127/10000 episodes, total num timesteps 25600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 128/10000 episodes, total num timesteps 25800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 129/10000 episodes, total num timesteps 26000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 130/10000 episodes, total num timesteps 26200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 131/10000 episodes, total num timesteps 26400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 132/10000 episodes, total num timesteps 26600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 133/10000 episodes, total num timesteps 26800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 134/10000 episodes, total num timesteps 27000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 135/10000 episodes, total num timesteps 27200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 136/10000 episodes, total num timesteps 27400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 137/10000 episodes, total num timesteps 27600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 138/10000 episodes, total num timesteps 27800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 139/10000 episodes, total num timesteps 28000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 140/10000 episodes, total num timesteps 28200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 141/10000 episodes, total num timesteps 28400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 142/10000 episodes, total num timesteps 28600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 143/10000 episodes, total num timesteps 28800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 144/10000 episodes, total num timesteps 29000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 145/10000 episodes, total num timesteps 29200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 146/10000 episodes, total num timesteps 29400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 147/10000 episodes, total num timesteps 29600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 148/10000 episodes, total num timesteps 29800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 149/10000 episodes, total num timesteps 30000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 150/10000 episodes, total num timesteps 30200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: -0.05138791644412997
team_policy eval average team episode rewards of agent0: 5.0
team_policy eval idv catch total num of agent0: 1
team_policy eval team catch total num: 2
team_policy eval average step individual rewards of agent1: -0.0010055397801599698
team_policy eval average team episode rewards of agent1: 5.0
team_policy eval idv catch total num of agent1: 3
team_policy eval team catch total num: 2
team_policy eval average step individual rewards of agent2: 0.005756501773430711
team_policy eval average team episode rewards of agent2: 5.0
team_policy eval idv catch total num of agent2: 3
team_policy eval team catch total num: 2
team_policy eval average step individual rewards of agent3: 0.029835354287657108
team_policy eval average team episode rewards of agent3: 5.0
team_policy eval idv catch total num of agent3: 4
team_policy eval team catch total num: 2
team_policy eval average step individual rewards of agent4: -0.04522734097627698
team_policy eval average team episode rewards of agent4: 5.0
team_policy eval idv catch total num of agent4: 1
team_policy eval team catch total num: 2
idv_policy eval average step individual rewards of agent0: -0.03467679102908576
idv_policy eval average team episode rewards of agent0: 5.0
idv_policy eval idv catch total num of agent0: 1
idv_policy eval team catch total num: 2
idv_policy eval average step individual rewards of agent1: 0.007925388295850579
idv_policy eval average team episode rewards of agent1: 5.0
idv_policy eval idv catch total num of agent1: 3
idv_policy eval team catch total num: 2
idv_policy eval average step individual rewards of agent2: -0.06933635259987661
idv_policy eval average team episode rewards of agent2: 5.0
idv_policy eval idv catch total num of agent2: 0
idv_policy eval team catch total num: 2
idv_policy eval average step individual rewards of agent3: 0.05658050423324846
idv_policy eval average team episode rewards of agent3: 5.0
idv_policy eval idv catch total num of agent3: 5
idv_policy eval team catch total num: 2
idv_policy eval average step individual rewards of agent4: 0.091750922535679
idv_policy eval average team episode rewards of agent4: 5.0
idv_policy eval idv catch total num of agent4: 6
idv_policy eval team catch total num: 2

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 151/10000 episodes, total num timesteps 30400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 152/10000 episodes, total num timesteps 30600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 153/10000 episodes, total num timesteps 30800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 154/10000 episodes, total num timesteps 31000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 155/10000 episodes, total num timesteps 31200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 156/10000 episodes, total num timesteps 31400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 157/10000 episodes, total num timesteps 31600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 158/10000 episodes, total num timesteps 31800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 159/10000 episodes, total num timesteps 32000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 160/10000 episodes, total num timesteps 32200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 161/10000 episodes, total num timesteps 32400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 162/10000 episodes, total num timesteps 32600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 163/10000 episodes, total num timesteps 32800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 164/10000 episodes, total num timesteps 33000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 165/10000 episodes, total num timesteps 33200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 166/10000 episodes, total num timesteps 33400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 167/10000 episodes, total num timesteps 33600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 168/10000 episodes, total num timesteps 33800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 169/10000 episodes, total num timesteps 34000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 170/10000 episodes, total num timesteps 34200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 171/10000 episodes, total num timesteps 34400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 172/10000 episodes, total num timesteps 34600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 173/10000 episodes, total num timesteps 34800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 174/10000 episodes, total num timesteps 35000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 175/10000 episodes, total num timesteps 35200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.16556195981247682
team_policy eval average team episode rewards of agent0: 10.0
team_policy eval idv catch total num of agent0: 9
team_policy eval team catch total num: 4
team_policy eval average step individual rewards of agent1: 0.06189660559916026
team_policy eval average team episode rewards of agent1: 10.0
team_policy eval idv catch total num of agent1: 5
team_policy eval team catch total num: 4
team_policy eval average step individual rewards of agent2: 0.08489784502209806
team_policy eval average team episode rewards of agent2: 10.0
team_policy eval idv catch total num of agent2: 6
team_policy eval team catch total num: 4
team_policy eval average step individual rewards of agent3: 0.06430515459149592
team_policy eval average team episode rewards of agent3: 10.0
team_policy eval idv catch total num of agent3: 5
team_policy eval team catch total num: 4
team_policy eval average step individual rewards of agent4: 0.11343870079354705
team_policy eval average team episode rewards of agent4: 10.0
team_policy eval idv catch total num of agent4: 7
team_policy eval team catch total num: 4
idv_policy eval average step individual rewards of agent0: 0.16525337995096606
idv_policy eval average team episode rewards of agent0: 30.0
idv_policy eval idv catch total num of agent0: 9
idv_policy eval team catch total num: 12
idv_policy eval average step individual rewards of agent1: 0.14390232114354373
idv_policy eval average team episode rewards of agent1: 30.0
idv_policy eval idv catch total num of agent1: 8
idv_policy eval team catch total num: 12
idv_policy eval average step individual rewards of agent2: 0.24074511848237531
idv_policy eval average team episode rewards of agent2: 30.0
idv_policy eval idv catch total num of agent2: 12
idv_policy eval team catch total num: 12
idv_policy eval average step individual rewards of agent3: 0.08819491666440848
idv_policy eval average team episode rewards of agent3: 30.0
idv_policy eval idv catch total num of agent3: 6
idv_policy eval team catch total num: 12
idv_policy eval average step individual rewards of agent4: 0.3458684238772011
idv_policy eval average team episode rewards of agent4: 30.0
idv_policy eval idv catch total num of agent4: 16
idv_policy eval team catch total num: 12
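One pattern worth noting in these eval blocks: the "average team episode rewards" value always comes out to 2.5 times the "team catch total num" (20.0/8, 5.0/2, 10.0/4, 30.0/12). This is inferred purely from the printed numbers, not from the environment code, but it is easy to sanity-check:

```python
# (team episode reward, team catch total) pairs copied from the eval
# blocks in this log; the ratio appears to be a fixed 2.5 per team catch.
pairs = [(20.0, 8), (5.0, 2), (10.0, 4), (30.0, 12)]
assert all(reward == 2.5 * catches for reward, catches in pairs)
```

If that relationship holds by construction, the team episode reward line is redundant with the team catch counter and either one suffices when comparing team_policy against idv_policy.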

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 176/10000 episodes, total num timesteps 35400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 177/10000 episodes, total num timesteps 35600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 178/10000 episodes, total num timesteps 35800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 179/10000 episodes, total num timesteps 36000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 180/10000 episodes, total num timesteps 36200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 181/10000 episodes, total num timesteps 36400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 182/10000 episodes, total num timesteps 36600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 183/10000 episodes, total num timesteps 36800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 184/10000 episodes, total num timesteps 37000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 185/10000 episodes, total num timesteps 37200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 186/10000 episodes, total num timesteps 37400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 187/10000 episodes, total num timesteps 37600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 188/10000 episodes, total num timesteps 37800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 189/10000 episodes, total num timesteps 38000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 190/10000 episodes, total num timesteps 38200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 191/10000 episodes, total num timesteps 38400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 192/10000 episodes, total num timesteps 38600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 193/10000 episodes, total num timesteps 38800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 194/10000 episodes, total num timesteps 39000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 195/10000 episodes, total num timesteps 39200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 196/10000 episodes, total num timesteps 39400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 197/10000 episodes, total num timesteps 39600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 198/10000 episodes, total num timesteps 39800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 199/10000 episodes, total num timesteps 40000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 200/10000 episodes, total num timesteps 40200/2000000, FPS 177.

team_policy eval average step individual rewards of agent0: 0.01237288661346483
team_policy eval average team episode rewards of agent0: 20.0
team_policy eval idv catch total num of agent0: 3
team_policy eval team catch total num: 8
team_policy eval average step individual rewards of agent1: 0.1364696897467111
team_policy eval average team episode rewards of agent1: 20.0
team_policy eval idv catch total num of agent1: 8
team_policy eval team catch total num: 8
team_policy eval average step individual rewards of agent2: 0.15903000571345338
team_policy eval average team episode rewards of agent2: 20.0
team_policy eval idv catch total num of agent2: 9
team_policy eval team catch total num: 8
team_policy eval average step individual rewards of agent3: 0.036506476585305696
team_policy eval average team episode rewards of agent3: 20.0
team_policy eval idv catch total num of agent3: 4
team_policy eval team catch total num: 8
team_policy eval average step individual rewards of agent4: 0.06798258906541609
team_policy eval average team episode rewards of agent4: 20.0
team_policy eval idv catch total num of agent4: 6
team_policy eval team catch total num: 8
idv_policy eval average step individual rewards of agent0: 0.23741081645285916
idv_policy eval average team episode rewards of agent0: 10.0
idv_policy eval idv catch total num of agent0: 12
idv_policy eval team catch total num: 4
idv_policy eval average step individual rewards of agent1: -0.047552490195323364
idv_policy eval average team episode rewards of agent1: 10.0
idv_policy eval idv catch total num of agent1: 1
idv_policy eval team catch total num: 4
idv_policy eval average step individual rewards of agent2: -0.022758627848855476
idv_policy eval average team episode rewards of agent2: 10.0
idv_policy eval idv catch total num of agent2: 2
idv_policy eval team catch total num: 4
idv_policy eval average step individual rewards of agent3: -0.03816069140879362
idv_policy eval average team episode rewards of agent3: 10.0
idv_policy eval idv catch total num of agent3: 1
idv_policy eval team catch total num: 4
idv_policy eval average step individual rewards of agent4: 0.15232372492872867
idv_policy eval average team episode rewards of agent4: 10.0
idv_policy eval idv catch total num of agent4: 9
idv_policy eval team catch total num: 4

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 201/10000 episodes, total num timesteps 40400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 202/10000 episodes, total num timesteps 40600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 203/10000 episodes, total num timesteps 40800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 204/10000 episodes, total num timesteps 41000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 205/10000 episodes, total num timesteps 41200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 206/10000 episodes, total num timesteps 41400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 207/10000 episodes, total num timesteps 41600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 208/10000 episodes, total num timesteps 41800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 209/10000 episodes, total num timesteps 42000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 210/10000 episodes, total num timesteps 42200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 211/10000 episodes, total num timesteps 42400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 212/10000 episodes, total num timesteps 42600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 213/10000 episodes, total num timesteps 42800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 214/10000 episodes, total num timesteps 43000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 215/10000 episodes, total num timesteps 43200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 216/10000 episodes, total num timesteps 43400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 217/10000 episodes, total num timesteps 43600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 218/10000 episodes, total num timesteps 43800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 219/10000 episodes, total num timesteps 44000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 220/10000 episodes, total num timesteps 44200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 221/10000 episodes, total num timesteps 44400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 222/10000 episodes, total num timesteps 44600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 223/10000 episodes, total num timesteps 44800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 224/10000 episodes, total num timesteps 45000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 225/10000 episodes, total num timesteps 45200/2000000, FPS 177.

team_policy eval average step individual rewards of agent0: 0.06630821978198405
team_policy eval average team episode rewards of agent0: 5.0
team_policy eval idv catch total num of agent0: 5
team_policy eval team catch total num: 2
team_policy eval average step individual rewards of agent1: -0.06718327311134062
team_policy eval average team episode rewards of agent1: 5.0
team_policy eval idv catch total num of agent1: 0
team_policy eval team catch total num: 2
team_policy eval average step individual rewards of agent2: 0.06477094862158647
team_policy eval average team episode rewards of agent2: 5.0
team_policy eval idv catch total num of agent2: 5
team_policy eval team catch total num: 2
team_policy eval average step individual rewards of agent3: 0.09062304366145416
team_policy eval average team episode rewards of agent3: 5.0
team_policy eval idv catch total num of agent3: 6
team_policy eval team catch total num: 2
team_policy eval average step individual rewards of agent4: -0.008484674367469166
team_policy eval average team episode rewards of agent4: 5.0
team_policy eval idv catch total num of agent4: 2
team_policy eval team catch total num: 2
idv_policy eval average step individual rewards of agent0: 0.06409691664563404
idv_policy eval average team episode rewards of agent0: 12.5
idv_policy eval idv catch total num of agent0: 5
idv_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent1: 0.0640326364024641
idv_policy eval average team episode rewards of agent1: 12.5
idv_policy eval idv catch total num of agent1: 5
idv_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent2: -0.009331160918804616
idv_policy eval average team episode rewards of agent2: 12.5
idv_policy eval idv catch total num of agent2: 2
idv_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent3: 0.04389246790107403
idv_policy eval average team episode rewards of agent3: 12.5
idv_policy eval idv catch total num of agent3: 4
idv_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent4: 0.11744750539942249
idv_policy eval average team episode rewards of agent4: 12.5
idv_policy eval idv catch total num of agent4: 7
idv_policy eval team catch total num: 5
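The FPS figure in the progress lines also gives a rough wall-clock estimate for the rest of the run. A back-of-envelope sketch using the numbers reported around update 225 (45,200 of 2,000,000 timesteps at ~177 FPS):

```python
# Remaining wall-clock time, assuming throughput stays near 177 FPS.
total, done, fps = 2_000_000, 45_200, 177
eta_hours = (total - done) / fps / 3600
assert 3.0 < eta_hours < 3.1  # roughly three more hours at this rate
```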

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 226/10000 episodes, total num timesteps 45400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 227/10000 episodes, total num timesteps 45600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 228/10000 episodes, total num timesteps 45800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 229/10000 episodes, total num timesteps 46000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 230/10000 episodes, total num timesteps 46200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 231/10000 episodes, total num timesteps 46400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 232/10000 episodes, total num timesteps 46600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 233/10000 episodes, total num timesteps 46800/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 234/10000 episodes, total num timesteps 47000/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 235/10000 episodes, total num timesteps 47200/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 236/10000 episodes, total num timesteps 47400/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 237/10000 episodes, total num timesteps 47600/2000000, FPS 177.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 238/10000 episodes, total num timesteps 47800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 239/10000 episodes, total num timesteps 48000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 240/10000 episodes, total num timesteps 48200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 241/10000 episodes, total num timesteps 48400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 242/10000 episodes, total num timesteps 48600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 243/10000 episodes, total num timesteps 48800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 244/10000 episodes, total num timesteps 49000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 245/10000 episodes, total num timesteps 49200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 246/10000 episodes, total num timesteps 49400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 247/10000 episodes, total num timesteps 49600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 248/10000 episodes, total num timesteps 49800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 249/10000 episodes, total num timesteps 50000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 250/10000 episodes, total num timesteps 50200/2000000, FPS 178.

team_policy eval (after update 250, 50200 timesteps):
  agent0: avg step idv reward 0.06572789199568006, idv catch num 5
  agent1: avg step idv reward 0.1103667852307386, idv catch num 7
  agent2: avg step idv reward 0.011150855579565046, idv catch num 3
  agent3: avg step idv reward -0.03982290972268357, idv catch num 1
  agent4: avg step idv reward -0.037130287752137635, idv catch num 1
  avg team episode reward (all agents): 12.5, team catch total num: 5
idv_policy eval (after update 250, 50200 timesteps):
  agent0: avg step idv reward 0.10597323370862455, idv catch num 7
  agent1: avg step idv reward 0.21452378402892713, idv catch num 11
  agent2: avg step idv reward 0.006928243075810241, idv catch num 3
  agent3: avg step idv reward 0.13615718029910887, idv catch num 8
  agent4: avg step idv reward 0.024461224257636176, idv catch num 4
  avg team episode reward (all agents): 20.0, team catch total num: 8
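In every eval summary in this log, the average team episode reward equals 2.5 times the team catch total (12.5 for 5 catches, 20.0 for 8, and so on). This ratio is an observation from the logged numbers, not a documented property of the environment's reward function; a minimal sketch checking it over the (reward, catches) pairs printed in this section:

```python
# (avg team episode reward, team catch total num) pairs copied from the
# eval summaries in this log, in order of appearance. The 2.5-per-catch
# ratio is inferred from these numbers only.
pairs = [
    (12.5, 5), (20.0, 8),   # eval after update 250 (team_policy, idv_policy)
    (22.5, 9), (17.5, 7),   # after update 275
    (2.5, 1), (27.5, 11),   # after update 300
    (20.0, 8), (0.0, 0),    # after update 325
    (2.5, 1), (27.5, 11),   # after update 350
    (12.5, 5), (10.0, 4),   # after update 375
]

# every logged pair satisfies reward == 2.5 * catches
assert all(reward == 2.5 * catches for reward, catches in pairs)
print(f"all {len(pairs)} logged pairs satisfy reward == 2.5 * catches")
```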

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 251-275/10000 episodes, total num timesteps 50400-55200/2000000 (200 per update), FPS 178.

team_policy eval (after update 275, 55200 timesteps):
  agent0: avg step idv reward 0.14737119091883077, idv catch num 8
  agent1: avg step idv reward 0.11981804099565778, idv catch num 7
  agent2: avg step idv reward 0.16997167611956374, idv catch num 9
  agent3: avg step idv reward 0.2961748700032883, idv catch num 14
  agent4: avg step idv reward 0.21913437990393134, idv catch num 11
  avg team episode reward (all agents): 22.5, team catch total num: 9
idv_policy eval (after update 275, 55200 timesteps):
  agent0: avg step idv reward 0.06757651701125084, idv catch num 5
  agent1: avg step idv reward 0.2199904108464549, idv catch num 11
  agent2: avg step idv reward 0.09454993806708423, idv catch num 6
  agent3: avg step idv reward 0.04526094997565938, idv catch num 4
  agent4: avg step idv reward 0.09307489805969968, idv catch num 6
  avg team episode reward (all agents): 17.5, team catch total num: 7

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 276-300/10000 episodes, total num timesteps 55400-60200/2000000 (200 per update), FPS 178.

team_policy eval (after update 300, 60200 timesteps):
  agent0: avg step idv reward -0.06868965340590537, idv catch num 0
  agent1: avg step idv reward 0.1131484098862214, idv catch num 7
  agent2: avg step idv reward 0.03592741860572755, idv catch num 4
  agent3: avg step idv reward 0.03553286824847187, idv catch num 4
  agent4: avg step idv reward -0.04030911800004938, idv catch num 1
  avg team episode reward (all agents): 2.5, team catch total num: 1
idv_policy eval (after update 300, 60200 timesteps):
  agent0: avg step idv reward 0.21391906998173837, idv catch num 11
  agent1: avg step idv reward 0.21799651730105993, idv catch num 11
  agent2: avg step idv reward 0.13890305482437917, idv catch num 8
  agent3: avg step idv reward 0.29028363970103527, idv catch num 14
  agent4: avg step idv reward 0.06121884988646641, idv catch num 5
  avg team episode reward (all agents): 27.5, team catch total num: 11

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 301-325/10000 episodes, total num timesteps 60400-65200/2000000 (200 per update), FPS 178.

team_policy eval (after update 325, 65200 timesteps):
  agent0: avg step idv reward -0.017692465943462958, idv catch num 2
  agent1: avg step idv reward 0.11486820170042072, idv catch num 7
  agent2: avg step idv reward 0.03604329984715818, idv catch num 4
  agent3: avg step idv reward 0.13330185332475827, idv catch num 8
  agent4: avg step idv reward 0.18758280878558245, idv catch num 10
  avg team episode reward (all agents): 20.0, team catch total num: 8
idv_policy eval (after update 325, 65200 timesteps):
  agent0: avg step idv reward 0.06855734931020585, idv catch num 5
  agent1: avg step idv reward 0.09556051535216956, idv catch num 6
  agent2: avg step idv reward -0.004517480748195257, idv catch num 2
  agent3: avg step idv reward -0.033188665783861634, idv catch num 1
  agent4: avg step idv reward 0.04679493833282963, idv catch num 4
  avg team episode reward (all agents): 0.0, team catch total num: 0

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 326-350/10000 episodes, total num timesteps 65400-70200/2000000 (200 per update), FPS 178 (177 from update 342).

team_policy eval (after update 350, 70200 timesteps):
  agent0: avg step idv reward -0.047125961382483206, idv catch num 1
  agent1: avg step idv reward -0.018091305502753056, idv catch num 2
  agent2: avg step idv reward -0.022501165909451944, idv catch num 2
  agent3: avg step idv reward -0.04987438117751109, idv catch num 1
  agent4: avg step idv reward -0.019333068868765402, idv catch num 2
  avg team episode reward (all agents): 2.5, team catch total num: 1
idv_policy eval (after update 350, 70200 timesteps):
  agent0: avg step idv reward 0.12419755386748062, idv catch num 8
  agent1: avg step idv reward 0.10526457948603025, idv catch num 7
  agent2: avg step idv reward 0.15478160710883315, idv catch num 9
  agent3: avg step idv reward 0.11479253394291003, idv catch num 7
  agent4: avg step idv reward 0.16274372076592086, idv catch num 9
  avg team episode reward (all agents): 27.5, team catch total num: 11
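Each progress line in this log advances the timestep counter by exactly 200 per update, and the counter already reads 200 at update 0, so the 10000-update budget and the 2,000,000-timestep budget are the same budget. A small sketch of that bookkeeping, assuming the fixed 200-step batch the deltas in this log imply:

```python
STEPS_PER_UPDATE = 200       # delta between consecutive progress lines
TOTAL_UPDATES = 10_000
TOTAL_TIMESTEPS = 2_000_000

# the two budgets printed in every progress line are consistent
assert STEPS_PER_UPDATE * TOTAL_UPDATES == TOTAL_TIMESTEPS

def timesteps_after(update: int) -> int:
    # the log reports timesteps *after* an update finishes, so update 0
    # already shows 200 timesteps: timesteps = 200 * (update + 1)
    return STEPS_PER_UPDATE * (update + 1)

# spot-checks against lines in this log
assert timesteps_after(250) == 50_200
assert timesteps_after(350) == 70_200
```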

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 351-375/10000 episodes, total num timesteps 70400-75200/2000000 (200 per update), FPS 177-178.

team_policy eval average step individual rewards of agent0: 0.08839900794904496
team_policy eval average team episode rewards of agent0: 12.5
team_policy eval idv catch total num of agent0: 6
team_policy eval team catch total num: 5
team_policy eval average step individual rewards of agent1: 0.08372046891743751
team_policy eval average team episode rewards of agent1: 12.5
team_policy eval idv catch total num of agent1: 6
team_policy eval team catch total num: 5
team_policy eval average step individual rewards of agent2: 0.03682469570520959
team_policy eval average team episode rewards of agent2: 12.5
team_policy eval idv catch total num of agent2: 4
team_policy eval team catch total num: 5
team_policy eval average step individual rewards of agent3: 0.031956277206813356
team_policy eval average team episode rewards of agent3: 12.5
team_policy eval idv catch total num of agent3: 4
team_policy eval team catch total num: 5
team_policy eval average step individual rewards of agent4: 0.01315461123685156
team_policy eval average team episode rewards of agent4: 12.5
team_policy eval idv catch total num of agent4: 3
team_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent0: 0.1180713137518571
idv_policy eval average team episode rewards of agent0: 10.0
idv_policy eval idv catch total num of agent0: 7
idv_policy eval team catch total num: 4
idv_policy eval average step individual rewards of agent1: 0.05979247478645516
idv_policy eval average team episode rewards of agent1: 10.0
idv_policy eval idv catch total num of agent1: 5
idv_policy eval team catch total num: 4
idv_policy eval average step individual rewards of agent2: 0.0679190915222553
idv_policy eval average team episode rewards of agent2: 10.0
idv_policy eval idv catch total num of agent2: 5
idv_policy eval team catch total num: 4
idv_policy eval average step individual rewards of agent3: 0.05823961784548002
idv_policy eval average team episode rewards of agent3: 10.0
idv_policy eval idv catch total num of agent3: 5
idv_policy eval team catch total num: 4
idv_policy eval average step individual rewards of agent4: 0.06733257777575233
idv_policy eval average team episode rewards of agent4: 10.0
idv_policy eval idv catch total num of agent4: 5
idv_policy eval team catch total num: 4
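The eval blocks above interleave four line types per agent (average step individual reward, average team episode reward, individual catch count, and a team catch count repeated after each agent) for both `team_policy` and `idv_policy`. A sketch for collecting one block into per-policy, per-agent records, assuming the line formats stay exactly as printed here (the metric keys `step_reward`, `team_episode_reward`, `idv_catch`, `team_catch` are our own labels):

```python
import re

# One regex with alternatives for the four observed eval line shapes.
EVAL_RE = re.compile(
    r"(?P<policy>team_policy|idv_policy) eval "
    r"(?:average step individual rewards of agent(?P<a1>\d+): (?P<step>-?[\d.]+)"
    r"|average team episode rewards of agent(?P<a2>\d+): (?P<episode>-?[\d.]+)"
    r"|idv catch total num of agent(?P<a3>\d+): (?P<idv>\d+)"
    r"|team catch total num: (?P<team>\d+))"
)

def parse_eval(lines):
    """Collect eval lines into {policy: {agent_id: {metric: value}}}."""
    out = {}
    last = None  # (policy, agent) of the most recent agent-specific line
    for line in lines:
        m = EVAL_RE.match(line.strip())
        if not m:
            continue
        g = m.groupdict()
        agent = g["a1"] or g["a2"] or g["a3"]
        if agent is not None:
            rec = out.setdefault(g["policy"], {}).setdefault(int(agent), {})
            if g["step"] is not None:
                rec["step_reward"] = float(g["step"])
            elif g["episode"] is not None:
                rec["team_episode_reward"] = float(g["episode"])
            else:
                rec["idv_catch"] = int(g["idv"])
            last = (g["policy"], int(agent))
        elif last is not None:
            # "team catch total num" follows each agent's lines; attach it there.
            out[last[0]][last[1]]["team_catch"] = int(g["team"])
    return out
```

Because the team catch count is printed once per agent, the sketch attaches it to the agent whose lines immediately precede it; within one block those values are identical across agents, so reading any one of them gives the team total.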

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 376/10000 episodes, total num timesteps 75400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 377/10000 episodes, total num timesteps 75600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 378/10000 episodes, total num timesteps 75800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 379/10000 episodes, total num timesteps 76000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 380/10000 episodes, total num timesteps 76200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 381/10000 episodes, total num timesteps 76400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 382/10000 episodes, total num timesteps 76600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 383/10000 episodes, total num timesteps 76800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 384/10000 episodes, total num timesteps 77000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 385/10000 episodes, total num timesteps 77200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 386/10000 episodes, total num timesteps 77400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 387/10000 episodes, total num timesteps 77600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 388/10000 episodes, total num timesteps 77800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 389/10000 episodes, total num timesteps 78000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 390/10000 episodes, total num timesteps 78200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 391/10000 episodes, total num timesteps 78400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 392/10000 episodes, total num timesteps 78600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 393/10000 episodes, total num timesteps 78800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 394/10000 episodes, total num timesteps 79000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 395/10000 episodes, total num timesteps 79200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 396/10000 episodes, total num timesteps 79400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 397/10000 episodes, total num timesteps 79600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 398/10000 episodes, total num timesteps 79800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 399/10000 episodes, total num timesteps 80000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 400/10000 episodes, total num timesteps 80200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.06167702988941715
team_policy eval average team episode rewards of agent0: 20.0
team_policy eval idv catch total num of agent0: 5
team_policy eval team catch total num: 8
team_policy eval average step individual rewards of agent1: 0.06319033684551086
team_policy eval average team episode rewards of agent1: 20.0
team_policy eval idv catch total num of agent1: 5
team_policy eval team catch total num: 8
team_policy eval average step individual rewards of agent2: 0.29492671627142714
team_policy eval average team episode rewards of agent2: 20.0
team_policy eval idv catch total num of agent2: 14
team_policy eval team catch total num: 8
team_policy eval average step individual rewards of agent3: 0.007960717739094453
team_policy eval average team episode rewards of agent3: 20.0
team_policy eval idv catch total num of agent3: 3
team_policy eval team catch total num: 8
team_policy eval average step individual rewards of agent4: 0.0838956360941524
team_policy eval average team episode rewards of agent4: 20.0
team_policy eval idv catch total num of agent4: 6
team_policy eval team catch total num: 8
idv_policy eval average step individual rewards of agent0: 0.25918039000221155
idv_policy eval average team episode rewards of agent0: 30.0
idv_policy eval idv catch total num of agent0: 13
idv_policy eval team catch total num: 12
idv_policy eval average step individual rewards of agent1: 0.13326817865053622
idv_policy eval average team episode rewards of agent1: 30.0
idv_policy eval idv catch total num of agent1: 8
idv_policy eval team catch total num: 12
idv_policy eval average step individual rewards of agent2: 0.1014408349684479
idv_policy eval average team episode rewards of agent2: 30.0
idv_policy eval idv catch total num of agent2: 7
idv_policy eval team catch total num: 12
idv_policy eval average step individual rewards of agent3: 0.10878911729977159
idv_policy eval average team episode rewards of agent3: 30.0
idv_policy eval idv catch total num of agent3: 7
idv_policy eval team catch total num: 12
idv_policy eval average step individual rewards of agent4: 0.31034485734329925
idv_policy eval average team episode rewards of agent4: 30.0
idv_policy eval idv catch total num of agent4: 15
idv_policy eval team catch total num: 12

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 401/10000 episodes, total num timesteps 80400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 402/10000 episodes, total num timesteps 80600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 403/10000 episodes, total num timesteps 80800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 404/10000 episodes, total num timesteps 81000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 405/10000 episodes, total num timesteps 81200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 406/10000 episodes, total num timesteps 81400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 407/10000 episodes, total num timesteps 81600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 408/10000 episodes, total num timesteps 81800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 409/10000 episodes, total num timesteps 82000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 410/10000 episodes, total num timesteps 82200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 411/10000 episodes, total num timesteps 82400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 412/10000 episodes, total num timesteps 82600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 413/10000 episodes, total num timesteps 82800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 414/10000 episodes, total num timesteps 83000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 415/10000 episodes, total num timesteps 83200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 416/10000 episodes, total num timesteps 83400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 417/10000 episodes, total num timesteps 83600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 418/10000 episodes, total num timesteps 83800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 419/10000 episodes, total num timesteps 84000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 420/10000 episodes, total num timesteps 84200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 421/10000 episodes, total num timesteps 84400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 422/10000 episodes, total num timesteps 84600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 423/10000 episodes, total num timesteps 84800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 424/10000 episodes, total num timesteps 85000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 425/10000 episodes, total num timesteps 85200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.06730696634120113
team_policy eval average team episode rewards of agent0: 30.0
team_policy eval idv catch total num of agent0: 5
team_policy eval team catch total num: 12
team_policy eval average step individual rewards of agent1: 0.30196586273606435
team_policy eval average team episode rewards of agent1: 30.0
team_policy eval idv catch total num of agent1: 14
team_policy eval team catch total num: 12
team_policy eval average step individual rewards of agent2: 0.29582546171690977
team_policy eval average team episode rewards of agent2: 30.0
team_policy eval idv catch total num of agent2: 14
team_policy eval team catch total num: 12
team_policy eval average step individual rewards of agent3: 0.21422104901992448
team_policy eval average team episode rewards of agent3: 30.0
team_policy eval idv catch total num of agent3: 11
team_policy eval team catch total num: 12
team_policy eval average step individual rewards of agent4: 0.09116929549157377
team_policy eval average team episode rewards of agent4: 30.0
team_policy eval idv catch total num of agent4: 6
team_policy eval team catch total num: 12
idv_policy eval average step individual rewards of agent0: 0.05542993915176563
idv_policy eval average team episode rewards of agent0: 27.5
idv_policy eval idv catch total num of agent0: 5
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent1: 0.13797339243979023
idv_policy eval average team episode rewards of agent1: 27.5
idv_policy eval idv catch total num of agent1: 8
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent2: 0.1129468586667101
idv_policy eval average team episode rewards of agent2: 27.5
idv_policy eval idv catch total num of agent2: 7
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent3: 0.1899335280871448
idv_policy eval average team episode rewards of agent3: 27.5
idv_policy eval idv catch total num of agent3: 10
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent4: 0.1566306811679877
idv_policy eval average team episode rewards of agent4: 27.5
idv_policy eval idv catch total num of agent4: 9
idv_policy eval team catch total num: 11

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 426/10000 episodes, total num timesteps 85400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 427/10000 episodes, total num timesteps 85600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 428/10000 episodes, total num timesteps 85800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 429/10000 episodes, total num timesteps 86000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 430/10000 episodes, total num timesteps 86200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 431/10000 episodes, total num timesteps 86400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 432/10000 episodes, total num timesteps 86600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 433/10000 episodes, total num timesteps 86800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 434/10000 episodes, total num timesteps 87000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 435/10000 episodes, total num timesteps 87200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 436/10000 episodes, total num timesteps 87400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 437/10000 episodes, total num timesteps 87600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 438/10000 episodes, total num timesteps 87800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 439/10000 episodes, total num timesteps 88000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 440/10000 episodes, total num timesteps 88200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 441/10000 episodes, total num timesteps 88400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 442/10000 episodes, total num timesteps 88600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 443/10000 episodes, total num timesteps 88800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 444/10000 episodes, total num timesteps 89000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 445/10000 episodes, total num timesteps 89200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 446/10000 episodes, total num timesteps 89400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 447/10000 episodes, total num timesteps 89600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 448/10000 episodes, total num timesteps 89800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 449/10000 episodes, total num timesteps 90000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 450/10000 episodes, total num timesteps 90200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.08447056058203878
team_policy eval average team episode rewards of agent0: 15.0
team_policy eval idv catch total num of agent0: 6
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent1: 0.06099056670155734
team_policy eval average team episode rewards of agent1: 15.0
team_policy eval idv catch total num of agent1: 5
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent2: 0.08631304566175381
team_policy eval average team episode rewards of agent2: 15.0
team_policy eval idv catch total num of agent2: 6
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent3: 0.06620876510441194
team_policy eval average team episode rewards of agent3: 15.0
team_policy eval idv catch total num of agent3: 5
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent4: 0.11546064869272922
team_policy eval average team episode rewards of agent4: 15.0
team_policy eval idv catch total num of agent4: 7
team_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent0: 0.2326603473203658
idv_policy eval average team episode rewards of agent0: 25.0
idv_policy eval idv catch total num of agent0: 12
idv_policy eval team catch total num: 10
idv_policy eval average step individual rewards of agent1: 0.23984828030020855
idv_policy eval average team episode rewards of agent1: 25.0
idv_policy eval idv catch total num of agent1: 12
idv_policy eval team catch total num: 10
idv_policy eval average step individual rewards of agent2: 0.19123176029692027
idv_policy eval average team episode rewards of agent2: 25.0
idv_policy eval idv catch total num of agent2: 10
idv_policy eval team catch total num: 10
idv_policy eval average step individual rewards of agent3: 0.24193911455987727
idv_policy eval average team episode rewards of agent3: 25.0
idv_policy eval idv catch total num of agent3: 12
idv_policy eval team catch total num: 10
idv_policy eval average step individual rewards of agent4: 0.14354470169820843
idv_policy eval average team episode rewards of agent4: 25.0
idv_policy eval idv catch total num of agent4: 8
idv_policy eval team catch total num: 10

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 451/10000 episodes, total num timesteps 90400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 452/10000 episodes, total num timesteps 90600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 453/10000 episodes, total num timesteps 90800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 454/10000 episodes, total num timesteps 91000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 455/10000 episodes, total num timesteps 91200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 456/10000 episodes, total num timesteps 91400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 457/10000 episodes, total num timesteps 91600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 458/10000 episodes, total num timesteps 91800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 459/10000 episodes, total num timesteps 92000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 460/10000 episodes, total num timesteps 92200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 461/10000 episodes, total num timesteps 92400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 462/10000 episodes, total num timesteps 92600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 463/10000 episodes, total num timesteps 92800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 464/10000 episodes, total num timesteps 93000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 465/10000 episodes, total num timesteps 93200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 466/10000 episodes, total num timesteps 93400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 467/10000 episodes, total num timesteps 93600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 468/10000 episodes, total num timesteps 93800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 469/10000 episodes, total num timesteps 94000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 470/10000 episodes, total num timesteps 94200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 471/10000 episodes, total num timesteps 94400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 472/10000 episodes, total num timesteps 94600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 473/10000 episodes, total num timesteps 94800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 474/10000 episodes, total num timesteps 95000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 475/10000 episodes, total num timesteps 95200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.04249842634563622
team_policy eval average team episode rewards of agent0: 10.0
team_policy eval idv catch total num of agent0: 4
team_policy eval team catch total num: 4
team_policy eval average step individual rewards of agent1: 0.0693042136097856
team_policy eval average team episode rewards of agent1: 10.0
team_policy eval idv catch total num of agent1: 5
team_policy eval team catch total num: 4
team_policy eval average step individual rewards of agent2: 0.015788749747889137
team_policy eval average team episode rewards of agent2: 10.0
team_policy eval idv catch total num of agent2: 3
team_policy eval team catch total num: 4
team_policy eval average step individual rewards of agent3: 0.04367726380569424
team_policy eval average team episode rewards of agent3: 10.0
team_policy eval idv catch total num of agent3: 4
team_policy eval team catch total num: 4
team_policy eval average step individual rewards of agent4: 0.04074922381660469
team_policy eval average team episode rewards of agent4: 10.0
team_policy eval idv catch total num of agent4: 4
team_policy eval team catch total num: 4
idv_policy eval average step individual rewards of agent0: 0.27300211559371396
idv_policy eval average team episode rewards of agent0: 32.5
idv_policy eval idv catch total num of agent0: 13
idv_policy eval team catch total num: 13
idv_policy eval average step individual rewards of agent1: 0.1225512684951168
idv_policy eval average team episode rewards of agent1: 32.5
idv_policy eval idv catch total num of agent1: 7
idv_policy eval team catch total num: 13
idv_policy eval average step individual rewards of agent2: 0.19856675734008597
idv_policy eval average team episode rewards of agent2: 32.5
idv_policy eval idv catch total num of agent2: 10
idv_policy eval team catch total num: 13
idv_policy eval average step individual rewards of agent3: 0.14617045590821856
idv_policy eval average team episode rewards of agent3: 32.5
idv_policy eval idv catch total num of agent3: 8
idv_policy eval team catch total num: 13
idv_policy eval average step individual rewards of agent4: 0.25041640956445804
idv_policy eval average team episode rewards of agent4: 32.5
idv_policy eval idv catch total num of agent4: 12
idv_policy eval team catch total num: 13

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 476/10000 episodes, total num timesteps 95400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 477/10000 episodes, total num timesteps 95600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 478/10000 episodes, total num timesteps 95800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 479/10000 episodes, total num timesteps 96000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 480/10000 episodes, total num timesteps 96200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 481/10000 episodes, total num timesteps 96400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 482/10000 episodes, total num timesteps 96600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 483/10000 episodes, total num timesteps 96800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 484/10000 episodes, total num timesteps 97000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 485/10000 episodes, total num timesteps 97200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 486/10000 episodes, total num timesteps 97400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 487/10000 episodes, total num timesteps 97600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 488/10000 episodes, total num timesteps 97800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 489/10000 episodes, total num timesteps 98000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 490/10000 episodes, total num timesteps 98200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 491/10000 episodes, total num timesteps 98400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 492/10000 episodes, total num timesteps 98600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 493/10000 episodes, total num timesteps 98800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 494/10000 episodes, total num timesteps 99000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 495/10000 episodes, total num timesteps 99200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 496/10000 episodes, total num timesteps 99400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 497/10000 episodes, total num timesteps 99600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 498/10000 episodes, total num timesteps 99800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 499/10000 episodes, total num timesteps 100000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 500/10000 episodes, total num timesteps 100200/2000000, FPS 178.

team_policy eval (average team episode rewards: 25.0, team catch total num: 10)
  agent    average step individual rewards   idv catch total num
  agent0   0.2615988604826784                13
  agent1   0.12170405776828513                7
  agent2   0.09146983699659401                6
  agent3   0.21952680340044825               11
  agent4   0.12078022409482778                7
idv_policy eval (average team episode rewards: 5.0, team catch total num: 2)
  agent    average step individual rewards   idv catch total num
  agent0   0.04051480333041527                4
  agent1   -0.009147928331167654              2
  agent2   0.009105010865648024               3
  agent3   0.035056674789220296               4
  agent4   0.0125518640863649                 3
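Eval dumps like the one above are awkward to compare by eye across updates. A minimal parsing sketch for offline analysis — the regex and dict layout are assumptions, not part of the training code; the agent-less "team catch total num" lines are deliberately skipped:

```python
import re
from collections import defaultdict

def parse_eval_block(lines):
    """Collect per-agent eval metrics keyed by (policy, agent).

    Hypothetical helper: field names are taken verbatim from the log;
    lines without an 'of agentN' suffix (team-level totals) are ignored.
    """
    metrics = defaultdict(dict)
    pattern = re.compile(
        r"(team_policy|idv_policy) eval (.+?) of (agent\d+): (-?[\d.]+)")
    for line in lines:
        m = pattern.match(line.strip())
        if m:
            policy, field, agent, value = m.groups()
            metrics[(policy, agent)][field] = float(value)
    return dict(metrics)

block = [
    "team_policy eval average step individual rewards of agent0: 0.2615988604826784",
    "team_policy eval average team episode rewards of agent0: 25.0",
    "team_policy eval idv catch total num of agent0: 13",
    "team_policy eval team catch total num: 10",
]
parsed = parse_eval_block(block)
print(parsed[("team_policy", "agent0")]["idv catch total num"])  # 13.0
```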

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 501/10000 episodes, total num timesteps 100400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 502/10000 episodes, total num timesteps 100600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 503/10000 episodes, total num timesteps 100800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 504/10000 episodes, total num timesteps 101000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 505/10000 episodes, total num timesteps 101200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 506/10000 episodes, total num timesteps 101400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 507/10000 episodes, total num timesteps 101600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 508/10000 episodes, total num timesteps 101800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 509/10000 episodes, total num timesteps 102000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 510/10000 episodes, total num timesteps 102200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 511/10000 episodes, total num timesteps 102400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 512/10000 episodes, total num timesteps 102600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 513/10000 episodes, total num timesteps 102800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 514/10000 episodes, total num timesteps 103000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 515/10000 episodes, total num timesteps 103200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 516/10000 episodes, total num timesteps 103400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 517/10000 episodes, total num timesteps 103600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 518/10000 episodes, total num timesteps 103800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 519/10000 episodes, total num timesteps 104000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 520/10000 episodes, total num timesteps 104200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 521/10000 episodes, total num timesteps 104400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 522/10000 episodes, total num timesteps 104600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 523/10000 episodes, total num timesteps 104800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 524/10000 episodes, total num timesteps 105000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 525/10000 episodes, total num timesteps 105200/2000000, FPS 178.

team_policy eval (average team episode rewards: 5.0, team catch total num: 2)
  agent    average step individual rewards   idv catch total num
  agent0   -0.03195082400873624               1
  agent1   0.01957762528323687                3
  agent2   -0.03728084900316839               1
  agent3   -0.03536427409580197               1
  agent4   0.09388456549269432                6
idv_policy eval (average team episode rewards: 10.0, team catch total num: 4)
  agent    average step individual rewards   idv catch total num
  agent0   0.044069298251999296               5
  agent1   -0.005620188666349675              3
  agent2   -0.004998499501731577              3
  agent3   0.07307523101526328                6
  agent4   0.020605598075420666               4

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 526/10000 episodes, total num timesteps 105400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 527/10000 episodes, total num timesteps 105600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 528/10000 episodes, total num timesteps 105800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 529/10000 episodes, total num timesteps 106000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 530/10000 episodes, total num timesteps 106200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 531/10000 episodes, total num timesteps 106400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 532/10000 episodes, total num timesteps 106600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 533/10000 episodes, total num timesteps 106800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 534/10000 episodes, total num timesteps 107000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 535/10000 episodes, total num timesteps 107200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 536/10000 episodes, total num timesteps 107400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 537/10000 episodes, total num timesteps 107600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 538/10000 episodes, total num timesteps 107800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 539/10000 episodes, total num timesteps 108000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 540/10000 episodes, total num timesteps 108200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 541/10000 episodes, total num timesteps 108400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 542/10000 episodes, total num timesteps 108600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 543/10000 episodes, total num timesteps 108800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 544/10000 episodes, total num timesteps 109000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 545/10000 episodes, total num timesteps 109200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 546/10000 episodes, total num timesteps 109400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 547/10000 episodes, total num timesteps 109600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 548/10000 episodes, total num timesteps 109800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 549/10000 episodes, total num timesteps 110000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 550/10000 episodes, total num timesteps 110200/2000000, FPS 178.

team_policy eval (average team episode rewards: 12.5, team catch total num: 5)
  agent    average step individual rewards   idv catch total num
  agent0   0.011693202110144223               3
  agent1   0.054988513300846344               5
  agent2   0.08445833883477961                6
  agent3   0.11966857193189183                7
  agent4   0.1360579164475846                 8
idv_policy eval (average team episode rewards: 10.0, team catch total num: 4)
  agent    average step individual rewards   idv catch total num
  agent0   -0.001974505739901522              2
  agent1   0.06826501339223286                5
  agent2   0.09594874654557867                6
  agent3   0.06921743679740051                5
  agent4   0.050535289674071976               4

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 551/10000 episodes, total num timesteps 110400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 552/10000 episodes, total num timesteps 110600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 553/10000 episodes, total num timesteps 110800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 554/10000 episodes, total num timesteps 111000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 555/10000 episodes, total num timesteps 111200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 556/10000 episodes, total num timesteps 111400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 557/10000 episodes, total num timesteps 111600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 558/10000 episodes, total num timesteps 111800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 559/10000 episodes, total num timesteps 112000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 560/10000 episodes, total num timesteps 112200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 561/10000 episodes, total num timesteps 112400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 562/10000 episodes, total num timesteps 112600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 563/10000 episodes, total num timesteps 112800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 564/10000 episodes, total num timesteps 113000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 565/10000 episodes, total num timesteps 113200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 566/10000 episodes, total num timesteps 113400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 567/10000 episodes, total num timesteps 113600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 568/10000 episodes, total num timesteps 113800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 569/10000 episodes, total num timesteps 114000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 570/10000 episodes, total num timesteps 114200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 571/10000 episodes, total num timesteps 114400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 572/10000 episodes, total num timesteps 114600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 573/10000 episodes, total num timesteps 114800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 574/10000 episodes, total num timesteps 115000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 575/10000 episodes, total num timesteps 115200/2000000, FPS 178.

team_policy eval (average team episode rewards: 22.5, team catch total num: 9)
  agent    average step individual rewards   idv catch total num
  agent0   0.1277774287218477                 8
  agent1   0.15773254873030912                9
  agent2   0.1274089469211634                 8
  agent3   0.20331124975529685               11
  agent4   0.20799022905639625               11
idv_policy eval (average team episode rewards: 15.0, team catch total num: 6)
  agent    average step individual rewards   idv catch total num
  agent0   0.05697496570019882                5
  agent1   0.03131604816334785                4
  agent2   0.06032115300392542                5
  agent3   0.08399981629792784                6
  agent4   0.157165738031993                  9

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 576/10000 episodes, total num timesteps 115400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 577/10000 episodes, total num timesteps 115600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 578/10000 episodes, total num timesteps 115800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 579/10000 episodes, total num timesteps 116000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 580/10000 episodes, total num timesteps 116200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 581/10000 episodes, total num timesteps 116400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 582/10000 episodes, total num timesteps 116600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 583/10000 episodes, total num timesteps 116800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 584/10000 episodes, total num timesteps 117000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 585/10000 episodes, total num timesteps 117200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 586/10000 episodes, total num timesteps 117400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 587/10000 episodes, total num timesteps 117600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 588/10000 episodes, total num timesteps 117800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 589/10000 episodes, total num timesteps 118000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 590/10000 episodes, total num timesteps 118200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 591/10000 episodes, total num timesteps 118400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 592/10000 episodes, total num timesteps 118600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 593/10000 episodes, total num timesteps 118800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 594/10000 episodes, total num timesteps 119000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 595/10000 episodes, total num timesteps 119200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 596/10000 episodes, total num timesteps 119400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 597/10000 episodes, total num timesteps 119600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 598/10000 episodes, total num timesteps 119800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 599/10000 episodes, total num timesteps 120000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 600/10000 episodes, total num timesteps 120200/2000000, FPS 178.

team_policy eval (average team episode rewards: 10.0, team catch total num: 4)
  agent    average step individual rewards   idv catch total num
  agent0   0.10388313600104017                7
  agent1   0.00782715769084822                3
  agent2   0.1907511611439979                10
  agent3   0.06448097057372439                5
  agent4   0.06065265077322028                5
idv_policy eval (average team episode rewards: 20.0, team catch total num: 8)
  agent    average step individual rewards   idv catch total num
  agent0   0.13465197041772187                8
  agent1   0.06768767746395482                5
  agent2   0.03949435112260013                4
  agent3   0.03695989723229713                4
  agent4   0.11989930348993201                7

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 601/10000 episodes, total num timesteps 120400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 602/10000 episodes, total num timesteps 120600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 603/10000 episodes, total num timesteps 120800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 604/10000 episodes, total num timesteps 121000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 605/10000 episodes, total num timesteps 121200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 606/10000 episodes, total num timesteps 121400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 607/10000 episodes, total num timesteps 121600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 608/10000 episodes, total num timesteps 121800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 609/10000 episodes, total num timesteps 122000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 610/10000 episodes, total num timesteps 122200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 611/10000 episodes, total num timesteps 122400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 612/10000 episodes, total num timesteps 122600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 613/10000 episodes, total num timesteps 122800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 614/10000 episodes, total num timesteps 123000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 615/10000 episodes, total num timesteps 123200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 616/10000 episodes, total num timesteps 123400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 617/10000 episodes, total num timesteps 123600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 618/10000 episodes, total num timesteps 123800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 619/10000 episodes, total num timesteps 124000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 620/10000 episodes, total num timesteps 124200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 621/10000 episodes, total num timesteps 124400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 622/10000 episodes, total num timesteps 124600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 623/10000 episodes, total num timesteps 124800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 624/10000 episodes, total num timesteps 125000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 625/10000 episodes, total num timesteps 125200/2000000, FPS 178.

team_policy eval (team catch total num: 12)
  agent  avg step individual reward  avg team episode reward  idv catch total num
  0      0.32544488774738256         30.0                     15
  1      0.17069497001377149         30.0                     9
  2      0.04690109850696636         30.0                     4
  3      0.2503983769829654          30.0                     12
  4      0.3224571413736817          30.0                     15
idv_policy eval (team catch total num: 10)
  agent  avg step individual reward  avg team episode reward  idv catch total num
  0      0.11718942055310493         25.0                     7
  1      0.08778270820177408         25.0                     6
  2      0.2193195279116183          25.0                     11
  3      0.18925096942254324         25.0                     10
  4      0.110727783336094           25.0                     7
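The per-agent eval lines this logger prints (e.g. `team_policy eval idv catch total num of agent0: 15`) follow a fixed pattern, so they can be collected into per-agent records for offline analysis. A minimal sketch, assuming exactly that line format; `parse_eval_lines` is a hypothetical helper, not part of the training code (team-level `team catch total num` lines carry no agent id and would need a second pattern):

```python
import re

# Matches per-agent eval lines such as:
#   team_policy eval average step individual rewards of agent0: 0.325
LINE_RE = re.compile(
    r"(?P<policy>team_policy|idv_policy) eval "
    r"(?P<metric>average step individual rewards|"
    r"average team episode rewards|idv catch total num) "
    r"of agent(?P<agent>\d+): (?P<value>-?[\d.]+)"
)

def parse_eval_lines(lines):
    """Collect per-agent eval metrics into {policy: {agent_id: {metric: value}}}."""
    out = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skips team-level totals and progress lines
        agents = out.setdefault(m["policy"], {})
        agents.setdefault(int(m["agent"]), {})[m["metric"]] = float(m["value"])
    return out

sample = [
    "team_policy eval average step individual rewards of agent0: 0.325",
    "team_policy eval idv catch total num of agent0: 15",
    "team_policy eval team catch total num: 12",  # no agent id -> skipped here
]
parsed = parse_eval_lines(sample)
```

With records in this shape, the per-checkpoint values can be diffed between the two policies or dumped to CSV instead of being read off the raw log.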

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 626/10000 episodes, total num timesteps 125400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 627/10000 episodes, total num timesteps 125600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 628/10000 episodes, total num timesteps 125800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 629/10000 episodes, total num timesteps 126000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 630/10000 episodes, total num timesteps 126200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 631/10000 episodes, total num timesteps 126400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 632/10000 episodes, total num timesteps 126600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 633/10000 episodes, total num timesteps 126800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 634/10000 episodes, total num timesteps 127000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 635/10000 episodes, total num timesteps 127200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 636/10000 episodes, total num timesteps 127400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 637/10000 episodes, total num timesteps 127600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 638/10000 episodes, total num timesteps 127800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 639/10000 episodes, total num timesteps 128000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 640/10000 episodes, total num timesteps 128200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 641/10000 episodes, total num timesteps 128400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 642/10000 episodes, total num timesteps 128600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 643/10000 episodes, total num timesteps 128800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 644/10000 episodes, total num timesteps 129000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 645/10000 episodes, total num timesteps 129200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 646/10000 episodes, total num timesteps 129400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 647/10000 episodes, total num timesteps 129600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 648/10000 episodes, total num timesteps 129800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 649/10000 episodes, total num timesteps 130000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 650/10000 episodes, total num timesteps 130200/2000000, FPS 178.

team_policy eval (team catch total num: 2)
  agent  avg step individual reward  avg team episode reward  idv catch total num
  0      -0.07195726377283097        5.0                      0
  1      -0.06959365505498784        5.0                      0
  2      0.006820940133968327        5.0                      3
  3      0.008102072559060081        5.0                      3
  4      0.11406542030704175         5.0                      7
idv_policy eval (team catch total num: 2)
  agent  avg step individual reward  avg team episode reward  idv catch total num
  0      0.03397890448098824         5.0                      4
  1      0.11570033675865918         5.0                      7
  2      -0.020352497833736695       5.0                      2
  3      -0.01317606399635714        5.0                      2
  4      0.03702650276637789         5.0                      4

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 651/10000 episodes, total num timesteps 130400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 652/10000 episodes, total num timesteps 130600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 653/10000 episodes, total num timesteps 130800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 654/10000 episodes, total num timesteps 131000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 655/10000 episodes, total num timesteps 131200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 656/10000 episodes, total num timesteps 131400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 657/10000 episodes, total num timesteps 131600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 658/10000 episodes, total num timesteps 131800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 659/10000 episodes, total num timesteps 132000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 660/10000 episodes, total num timesteps 132200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 661/10000 episodes, total num timesteps 132400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 662/10000 episodes, total num timesteps 132600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 663/10000 episodes, total num timesteps 132800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 664/10000 episodes, total num timesteps 133000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 665/10000 episodes, total num timesteps 133200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 666/10000 episodes, total num timesteps 133400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 667/10000 episodes, total num timesteps 133600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 668/10000 episodes, total num timesteps 133800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 669/10000 episodes, total num timesteps 134000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 670/10000 episodes, total num timesteps 134200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 671/10000 episodes, total num timesteps 134400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 672/10000 episodes, total num timesteps 134600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 673/10000 episodes, total num timesteps 134800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 674/10000 episodes, total num timesteps 135000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 675/10000 episodes, total num timesteps 135200/2000000, FPS 178.
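Each progress line carries enough information for a rough wall-clock ETA: at update 675 the run has completed 135200 of 2000000 timesteps at 178 FPS. A minimal sketch (the `eta_seconds` helper is hypothetical, not part of the training script):

```python
def eta_seconds(current_steps: int, total_steps: int, fps: float) -> float:
    """Remaining wall-clock seconds implied by a single progress line."""
    return (total_steps - current_steps) / fps

# Numbers taken from the update-675 progress line above.
remaining = eta_seconds(135_200, 2_000_000, 178)  # roughly 2.9 hours
```

This assumes FPS stays flat, which the log here supports (it has held at 178 for many consecutive updates).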

team_policy eval (team catch total num: 6)
  agent  avg step individual reward  avg team episode reward  idv catch total num
  0      0.0465281089497706          15.0                     4
  1      0.14796893463655747         15.0                     8
  2      0.06319046052825578         15.0                     5
  3      0.009729658194726066        15.0                     3
  4      0.172226025727476           15.0                     9
idv_policy eval (team catch total num: 5)
  agent  avg step individual reward  avg team episode reward  idv catch total num
  0      -0.000968378543360755       12.5                     3
  1      0.0006435839470003168       12.5                     3
  2      0.0025701228751875173       12.5                     3
  3      0.07906216825302387         12.5                     6
  4      0.025906973303936452        12.5                     4

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 676/10000 episodes, total num timesteps 135400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 677/10000 episodes, total num timesteps 135600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 678/10000 episodes, total num timesteps 135800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 679/10000 episodes, total num timesteps 136000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 680/10000 episodes, total num timesteps 136200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 681/10000 episodes, total num timesteps 136400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 682/10000 episodes, total num timesteps 136600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 683/10000 episodes, total num timesteps 136800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 684/10000 episodes, total num timesteps 137000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 685/10000 episodes, total num timesteps 137200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 686/10000 episodes, total num timesteps 137400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 687/10000 episodes, total num timesteps 137600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 688/10000 episodes, total num timesteps 137800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 689/10000 episodes, total num timesteps 138000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 690/10000 episodes, total num timesteps 138200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 691/10000 episodes, total num timesteps 138400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 692/10000 episodes, total num timesteps 138600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 693/10000 episodes, total num timesteps 138800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 694/10000 episodes, total num timesteps 139000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 695/10000 episodes, total num timesteps 139200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 696/10000 episodes, total num timesteps 139400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 697/10000 episodes, total num timesteps 139600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 698/10000 episodes, total num timesteps 139800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 699/10000 episodes, total num timesteps 140000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 700/10000 episodes, total num timesteps 140200/2000000, FPS 178.

team_policy eval (team catch total num: 11)
  agent  avg step individual reward  avg team episode reward  idv catch total num
  0      0.20423743280493958         27.5                     10
  1      0.14297399686450718         27.5                     8
  2      0.09467368305192356         27.5                     6
  3      0.322678900622223           27.5                     15
  4      0.12117652032040445         27.5                     7
idv_policy eval (team catch total num: 5)
  agent  avg step individual reward  avg team episode reward  idv catch total num
  0      0.16193171808383172         12.5                     9
  1      0.1880303703140858          12.5                     10
  2      -0.03959892622987429        12.5                     1
  3      0.11478086848242466         12.5                     7
  4      -0.007699757709182564       12.5                     2

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 701/10000 episodes, total num timesteps 140400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 702/10000 episodes, total num timesteps 140600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 703/10000 episodes, total num timesteps 140800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 704/10000 episodes, total num timesteps 141000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 705/10000 episodes, total num timesteps 141200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 706/10000 episodes, total num timesteps 141400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 707/10000 episodes, total num timesteps 141600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 708/10000 episodes, total num timesteps 141800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 709/10000 episodes, total num timesteps 142000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 710/10000 episodes, total num timesteps 142200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 711/10000 episodes, total num timesteps 142400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 712/10000 episodes, total num timesteps 142600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 713/10000 episodes, total num timesteps 142800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 714/10000 episodes, total num timesteps 143000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 715/10000 episodes, total num timesteps 143200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 716/10000 episodes, total num timesteps 143400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 717/10000 episodes, total num timesteps 143600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 718/10000 episodes, total num timesteps 143800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 719/10000 episodes, total num timesteps 144000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 720/10000 episodes, total num timesteps 144200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 721/10000 episodes, total num timesteps 144400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 722/10000 episodes, total num timesteps 144600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 723/10000 episodes, total num timesteps 144800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 724/10000 episodes, total num timesteps 145000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 725/10000 episodes, total num timesteps 145200/2000000, FPS 178.

team_policy eval (team catch total num: 3)
  agent  avg step individual reward  avg team episode reward  idv catch total num
  0      -0.004713999532080546       7.5                      3
  1      -0.03245686220301238        7.5                      2
  2      -0.02970203516108296        7.5                      2
  3      -0.05426410291847246        7.5                      1
  4      -0.034210255387070954       7.5                      2
idv_policy eval (team catch total num: 1)
  agent  avg step individual reward  avg team episode reward  idv catch total num
  0      0.012155425009096965        2.5                      3
  1      0.16437972574462645         2.5                      9
  2      -0.06526474023859535        2.5                      0
  3      -0.03754478060313518        2.5                      1
  4      -0.03812736282423392        2.5                      1

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 726/10000 episodes, total num timesteps 145400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 727/10000 episodes, total num timesteps 145600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 728/10000 episodes, total num timesteps 145800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 729/10000 episodes, total num timesteps 146000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 730/10000 episodes, total num timesteps 146200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 731/10000 episodes, total num timesteps 146400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 732/10000 episodes, total num timesteps 146600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 733/10000 episodes, total num timesteps 146800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 734/10000 episodes, total num timesteps 147000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 735/10000 episodes, total num timesteps 147200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 736/10000 episodes, total num timesteps 147400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 737/10000 episodes, total num timesteps 147600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 738/10000 episodes, total num timesteps 147800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 739/10000 episodes, total num timesteps 148000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 740/10000 episodes, total num timesteps 148200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 741/10000 episodes, total num timesteps 148400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 742/10000 episodes, total num timesteps 148600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 743/10000 episodes, total num timesteps 148800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 744/10000 episodes, total num timesteps 149000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 745/10000 episodes, total num timesteps 149200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 746/10000 episodes, total num timesteps 149400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 747/10000 episodes, total num timesteps 149600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 748/10000 episodes, total num timesteps 149800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 749/10000 episodes, total num timesteps 150000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 750/10000 episodes, total num timesteps 150200/2000000, FPS 178.

team_policy eval (average team episode rewards: 22.5, team catch total num: 9)
  agent0: average step individual rewards 0.29243298206391766, idv catch total num 14
  agent1: average step individual rewards 0.11385318789888074, idv catch total num 7
  agent2: average step individual rewards 0.1477179439580362, idv catch total num 8
  agent3: average step individual rewards 0.011142907225903498, idv catch total num 3
  agent4: average step individual rewards 0.1155028629867417, idv catch total num 7
idv_policy eval (average team episode rewards: 32.5, team catch total num: 13)
  agent0: average step individual rewards 0.22685136554424268, idv catch total num 11
  agent1: average step individual rewards 0.1511773786688606, idv catch total num 8
  agent2: average step individual rewards 0.3039080283940947, idv catch total num 14
  agent3: average step individual rewards 0.21893368188471052, idv catch total num 11
  agent4: average step individual rewards 0.1258766381227489, idv catch total num 7
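The per-agent eval numbers above can be collapsed into a single team-level summary. A hypothetical post-processing sketch (not part of the training script), using the update-750 team_policy values copied from this log:

```python
# Per-agent average step individual rewards from the update-750 team_policy
# eval block above (values copied verbatim from the log).
team_step_rewards = [
    0.29243298206391766,   # agent0
    0.11385318789888074,   # agent1
    0.1477179439580362,    # agent2
    0.011142907225903498,  # agent3
    0.1155028629867417,    # agent4
]
team_catch_total = 9
team_episode_reward = 22.5

# Team-level mean of the per-agent step rewards.
mean_step_reward = sum(team_step_rewards) / len(team_step_rewards)
print(f"mean step reward: {mean_step_reward:.4f}, "
      f"team catches: {team_catch_total}, "
      f"episode reward: {team_episode_reward}")
```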

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 751-775/10000 episodes, total num timesteps 150400-155200/2000000, FPS 178.

team_policy eval (average team episode rewards: 35.0, team catch total num: 14)
  agent0: average step individual rewards 0.17099330385811093, idv catch total num 9
  agent1: average step individual rewards 0.29928451825593055, idv catch total num 14
  agent2: average step individual rewards 0.36874873197063635, idv catch total num 17
  agent3: average step individual rewards 0.06775757599807833, idv catch total num 5
  agent4: average step individual rewards 0.2736433697373218, idv catch total num 13
idv_policy eval (average team episode rewards: 20.0, team catch total num: 8)
  agent0: average step individual rewards 0.14123245664960185, idv catch total num 8
  agent1: average step individual rewards 0.19997980452743264, idv catch total num 10
  agent2: average step individual rewards 0.04270908470657449, idv catch total num 4
  agent3: average step individual rewards 0.11905255893208988, idv catch total num 7
  agent4: average step individual rewards 0.19298504809765574, idv catch total num 10

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 776-800/10000 episodes, total num timesteps 155400-160200/2000000, FPS 178.

team_policy eval (average team episode rewards: 27.5, team catch total num: 11)
  agent0: average step individual rewards 0.17059457356785365, idv catch total num 9
  agent1: average step individual rewards 0.2094069549411861, idv catch total num 11
  agent2: average step individual rewards 0.22094549464833768, idv catch total num 11
  agent3: average step individual rewards 0.1899806505760691, idv catch total num 10
  agent4: average step individual rewards 0.037539966112466905, idv catch total num 4
idv_policy eval (average team episode rewards: 20.0, team catch total num: 8)
  agent0: average step individual rewards 0.23961846925736607, idv catch total num 12
  agent1: average step individual rewards 0.21579187242074624, idv catch total num 11
  agent2: average step individual rewards 0.08720644781858704, idv catch total num 6
  agent3: average step individual rewards 0.11125690871941296, idv catch total num 7
  agent4: average step individual rewards 0.08681172949414633, idv catch total num 6

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 801-825/10000 episodes, total num timesteps 160400-165200/2000000, FPS 178.

team_policy eval (average team episode rewards: 0.0, team catch total num: 0)
  agent0: average step individual rewards -0.010777149841498397, idv catch total num 2
  agent1: average step individual rewards 0.043846592347093025, idv catch total num 4
  agent2: average step individual rewards -0.012946426369157446, idv catch total num 2
  agent3: average step individual rewards -0.006600820375429413, idv catch total num 2
  agent4: average step individual rewards 0.08659028031893796, idv catch total num 6
idv_policy eval (average team episode rewards: 25.0, team catch total num: 10)
  agent0: average step individual rewards 0.1658846942161873, idv catch total num 9
  agent1: average step individual rewards 0.16667221442356428, idv catch total num 9
  agent2: average step individual rewards 0.1406672948352531, idv catch total num 8
  agent3: average step individual rewards 0.2915095057557874, idv catch total num 14
  agent4: average step individual rewards 0.06475094867659971, idv catch total num 5
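Another pattern inferred from this log (an observation about the printed numbers, not a documented property of the reward function): in every eval block, the average team episode reward equals 2.5 times the team catch total, suggesting each team catch is worth 2.5 reward units at this evaluation length. A quick consistency check over the (team catches, episode reward) pairs that appear in this log:

```python
# (team catch total num, average team episode rewards) pairs copied verbatim
# from the eval blocks in this log. The 2.5x relation is inferred from the
# data, not taken from the training code.
pairs = [(9, 22.5), (13, 32.5), (14, 35.0), (8, 20.0), (11, 27.5),
         (0, 0.0), (10, 25.0), (4, 10.0), (5, 12.5), (6, 15.0)]
for catches, reward in pairs:
    assert reward == 2.5 * catches
print("every eval block satisfies: episode reward = 2.5 * team catches")
```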

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 826-850/10000 episodes, total num timesteps 165400-170200/2000000, FPS 178.

team_policy eval (average team episode rewards: 10.0, team catch total num: 4)
  agent0: average step individual rewards 0.015807688516789462, idv catch total num 3
  agent1: average step individual rewards 0.14270101936433574, idv catch total num 8
  agent2: average step individual rewards 0.04615425720496129, idv catch total num 4
  agent3: average step individual rewards -0.06017151287805158, idv catch total num 0
  agent4: average step individual rewards 0.1210634293492846, idv catch total num 7
idv_policy eval (average team episode rewards: 10.0, team catch total num: 4)
  agent0: average step individual rewards 0.036262471815294496, idv catch total num 4
  agent1: average step individual rewards 0.03628851109395698, idv catch total num 4
  agent2: average step individual rewards 0.13069215948693697, idv catch total num 8
  agent3: average step individual rewards 0.03536672644402139, idv catch total num 4
  agent4: average step individual rewards -0.01627066547675886, idv catch total num 2

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 851-875/10000 episodes, total num timesteps 170400-175200/2000000, FPS 178.

team_policy eval — avg team episode reward 12.5, team catch total 5
  agent  avg step individual reward  idv catch total
  0       0.1954725740014004         10
  1       0.1203165091901187          7
  2       0.09468311604651394         6
  3       0.1688130159737252          9
  4       0.1193436445734812          7
idv_policy eval — avg team episode reward 15.0, team catch total 6
  agent  avg step individual reward  idv catch total
  0       0.057431036113994784        5
  1       0.1353504879281681          8
  2       0.0590577245522948          5
  3       0.13848208731735473         8
  4       0.028916131121640223        4
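The per-agent step rewards in an eval block can be aggregated into a single team-level figure. A minimal sketch using the team_policy values from the update-875 block; the plain mean is my choice of aggregate, not something the logger reports:

```python
# team_policy "average step individual rewards" from the update-875 eval block.
step_rewards = [
    0.1954725740014004,   # agent0
    0.1203165091901187,   # agent1
    0.09468311604651394,  # agent2
    0.1688130159737252,   # agent3
    0.1193436445734812,   # agent4
]

mean_reward = sum(step_rewards) / len(step_rewards)
print(f"mean step reward across agents: {mean_reward:.4f}")  # ≈ 0.1397
```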

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 876/10000 episodes, total num timesteps 175400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 877/10000 episodes, total num timesteps 175600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 878/10000 episodes, total num timesteps 175800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 879/10000 episodes, total num timesteps 176000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 880/10000 episodes, total num timesteps 176200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 881/10000 episodes, total num timesteps 176400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 882/10000 episodes, total num timesteps 176600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 883/10000 episodes, total num timesteps 176800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 884/10000 episodes, total num timesteps 177000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 885/10000 episodes, total num timesteps 177200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 886/10000 episodes, total num timesteps 177400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 887/10000 episodes, total num timesteps 177600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 888/10000 episodes, total num timesteps 177800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 889/10000 episodes, total num timesteps 178000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 890/10000 episodes, total num timesteps 178200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 891/10000 episodes, total num timesteps 178400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 892/10000 episodes, total num timesteps 178600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 893/10000 episodes, total num timesteps 178800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 894/10000 episodes, total num timesteps 179000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 895/10000 episodes, total num timesteps 179200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 896/10000 episodes, total num timesteps 179400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 897/10000 episodes, total num timesteps 179600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 898/10000 episodes, total num timesteps 179800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 899/10000 episodes, total num timesteps 180000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 900/10000 episodes, total num timesteps 180200/2000000, FPS 178.

team_policy eval — avg team episode reward 25.0, team catch total 10
  agent  avg step individual reward  idv catch total
  0       0.09477225680214234         6
  1       0.19010004827747423        10
  2       0.2660371441886853         13
  3       0.0717520296298787          6
  4       0.14168322180792828         8
idv_policy eval — avg team episode reward 20.0, team catch total 8
  agent  avg step individual reward  idv catch total
  0       0.09461984498866823         6
  1       0.018866761130125257        3
  2       0.145478217488574           8
  3       0.2224724269053639         11
  4       0.1220166535797215          7
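The logger emits eval metrics one per line, which is awkward to plot directly; parsing them into records helps. A minimal sketch against that one-metric-per-line format (sample lines copied from the update-900 block; the dict field names are my own):

```python
import re

# Matches lines like:
# "team_policy eval average step individual rewards of agent1: 0.19010004827747423"
PATTERN = re.compile(
    r"(?P<policy>team_policy|idv_policy) eval average step individual "
    r"rewards of agent(?P<agent>\d+): (?P<reward>-?\d+\.\d+)"
)

def parse_step_rewards(lines):
    """Extract per-agent step-reward records from raw log lines."""
    records = []
    for line in lines:
        m = PATTERN.match(line)
        if m:
            records.append({
                "policy": m.group("policy"),
                "agent": int(m.group("agent")),
                "reward": float(m.group("reward")),
            })
    return records

sample = [
    "team_policy eval average step individual rewards of agent0: 0.09477225680214234",
    "team_policy eval average team episode rewards of agent0: 25.0",
    "idv_policy eval average step individual rewards of agent3: 0.2224724269053639",
]
print(parse_step_rewards(sample))  # two matching records; the team-reward line is skipped
```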

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 901/10000 episodes, total num timesteps 180400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 902/10000 episodes, total num timesteps 180600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 903/10000 episodes, total num timesteps 180800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 904/10000 episodes, total num timesteps 181000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 905/10000 episodes, total num timesteps 181200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 906/10000 episodes, total num timesteps 181400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 907/10000 episodes, total num timesteps 181600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 908/10000 episodes, total num timesteps 181800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 909/10000 episodes, total num timesteps 182000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 910/10000 episodes, total num timesteps 182200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 911/10000 episodes, total num timesteps 182400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 912/10000 episodes, total num timesteps 182600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 913/10000 episodes, total num timesteps 182800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 914/10000 episodes, total num timesteps 183000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 915/10000 episodes, total num timesteps 183200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 916/10000 episodes, total num timesteps 183400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 917/10000 episodes, total num timesteps 183600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 918/10000 episodes, total num timesteps 183800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 919/10000 episodes, total num timesteps 184000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 920/10000 episodes, total num timesteps 184200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 921/10000 episodes, total num timesteps 184400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 922/10000 episodes, total num timesteps 184600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 923/10000 episodes, total num timesteps 184800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 924/10000 episodes, total num timesteps 185000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 925/10000 episodes, total num timesteps 185200/2000000, FPS 178.

team_policy eval — avg team episode reward 10.0, team catch total 4
  agent  avg step individual reward  idv catch total
  0       0.09440538517690822         6
  1       0.039487320235309833        4
  2       0.11705600856335457         7
  3       0.00959553815356625         3
  4       0.22649318540365326        11
idv_policy eval — avg team episode reward 12.5, team catch total 5
  agent  avg step individual reward  idv catch total
  0       0.09482183851499347         6
  1       0.11999780255994566         7
  2       0.03993402772885908         4
  3       0.010178391218455962        3
  4      -0.011155011683286871        2

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 926/10000 episodes, total num timesteps 185400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 927/10000 episodes, total num timesteps 185600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 928/10000 episodes, total num timesteps 185800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 929/10000 episodes, total num timesteps 186000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 930/10000 episodes, total num timesteps 186200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 931/10000 episodes, total num timesteps 186400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 932/10000 episodes, total num timesteps 186600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 933/10000 episodes, total num timesteps 186800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 934/10000 episodes, total num timesteps 187000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 935/10000 episodes, total num timesteps 187200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 936/10000 episodes, total num timesteps 187400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 937/10000 episodes, total num timesteps 187600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 938/10000 episodes, total num timesteps 187800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 939/10000 episodes, total num timesteps 188000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 940/10000 episodes, total num timesteps 188200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 941/10000 episodes, total num timesteps 188400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 942/10000 episodes, total num timesteps 188600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 943/10000 episodes, total num timesteps 188800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 944/10000 episodes, total num timesteps 189000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 945/10000 episodes, total num timesteps 189200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 946/10000 episodes, total num timesteps 189400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 947/10000 episodes, total num timesteps 189600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 948/10000 episodes, total num timesteps 189800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 949/10000 episodes, total num timesteps 190000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 950/10000 episodes, total num timesteps 190200/2000000, FPS 178.

team_policy eval — avg team episode reward 35.0, team catch total 14
  agent  avg step individual reward  idv catch total
  0       0.2699168178300587         13
  1       0.039928860773808424        4
  2       0.19026011421665334        10
  3       0.2954615138414589         14
  4       0.34084554195477074        16
idv_policy eval — avg team episode reward 22.5, team catch total 9
  agent  avg step individual reward  idv catch total
  0       0.29049059538196437        14
  1       0.01412765216064413         3
  2       0.24420102716918568        12
  3       0.010685478250435762        3
  4       0.015232321776682172        3

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 951/10000 episodes, total num timesteps 190400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 952/10000 episodes, total num timesteps 190600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 953/10000 episodes, total num timesteps 190800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 954/10000 episodes, total num timesteps 191000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 955/10000 episodes, total num timesteps 191200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 956/10000 episodes, total num timesteps 191400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 957/10000 episodes, total num timesteps 191600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 958/10000 episodes, total num timesteps 191800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 959/10000 episodes, total num timesteps 192000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 960/10000 episodes, total num timesteps 192200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 961/10000 episodes, total num timesteps 192400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 962/10000 episodes, total num timesteps 192600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 963/10000 episodes, total num timesteps 192800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 964/10000 episodes, total num timesteps 193000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 965/10000 episodes, total num timesteps 193200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 966/10000 episodes, total num timesteps 193400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 967/10000 episodes, total num timesteps 193600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 968/10000 episodes, total num timesteps 193800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 969/10000 episodes, total num timesteps 194000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 970/10000 episodes, total num timesteps 194200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 971/10000 episodes, total num timesteps 194400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 972/10000 episodes, total num timesteps 194600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 973/10000 episodes, total num timesteps 194800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 974/10000 episodes, total num timesteps 195000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 975/10000 episodes, total num timesteps 195200/2000000, FPS 178.

team_policy eval — avg team episode reward 12.5, team catch total 5
  agent  avg step individual reward  idv catch total
  0       0.09303260428601931         6
  1       0.19322472719631997        10
  2       0.039556776133738786        4
  3       0.092475539592492           6
  4       0.09614636676775795         6
idv_policy eval — avg team episode reward 10.0, team catch total 4
  agent  avg step individual reward  idv catch total
  0       0.06016179498015707         5
  1       0.033503321124295826        4
  2       0.007055698082637165        3
  3       0.10802300248720602         7
  4       0.03253113993741862         4

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 976/10000 episodes, total num timesteps 195400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 977/10000 episodes, total num timesteps 195600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 978/10000 episodes, total num timesteps 195800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 979/10000 episodes, total num timesteps 196000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 980/10000 episodes, total num timesteps 196200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 981/10000 episodes, total num timesteps 196400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 982/10000 episodes, total num timesteps 196600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 983/10000 episodes, total num timesteps 196800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 984/10000 episodes, total num timesteps 197000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 985/10000 episodes, total num timesteps 197200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 986/10000 episodes, total num timesteps 197400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 987/10000 episodes, total num timesteps 197600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 988/10000 episodes, total num timesteps 197800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 989/10000 episodes, total num timesteps 198000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 990/10000 episodes, total num timesteps 198200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 991/10000 episodes, total num timesteps 198400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 992/10000 episodes, total num timesteps 198600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 993/10000 episodes, total num timesteps 198800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 994/10000 episodes, total num timesteps 199000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 995/10000 episodes, total num timesteps 199200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 996/10000 episodes, total num timesteps 199400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 997/10000 episodes, total num timesteps 199600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 998/10000 episodes, total num timesteps 199800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 999/10000 episodes, total num timesteps 200000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1000/10000 episodes, total num timesteps 200200/2000000, FPS 178.

team_policy eval — avg team episode reward 20.0, team catch total 8
  agent  avg step individual reward  idv catch total
  0      -0.01979924567872806         2
  1       0.09246865391051486         6
  2       0.14124228251980697         8
  3       0.0358060209580613          4
  4       0.1906127055124886         10
idv_policy eval — avg team episode reward 17.5, team catch total 7
  agent  avg step individual reward  idv catch total
  0       0.0710512981779723          5
  1       0.04147004812620389         4
  2       0.017402509253527425        3
  3       0.12191823370608493         7
  4       0.08752181381951468         6

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1001/10000 episodes, total num timesteps 200400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1002/10000 episodes, total num timesteps 200600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1003/10000 episodes, total num timesteps 200800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1004/10000 episodes, total num timesteps 201000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1005/10000 episodes, total num timesteps 201200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1006/10000 episodes, total num timesteps 201400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1007/10000 episodes, total num timesteps 201600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1008/10000 episodes, total num timesteps 201800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1009/10000 episodes, total num timesteps 202000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1010/10000 episodes, total num timesteps 202200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1011/10000 episodes, total num timesteps 202400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1012/10000 episodes, total num timesteps 202600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1013/10000 episodes, total num timesteps 202800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1014/10000 episodes, total num timesteps 203000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1015/10000 episodes, total num timesteps 203200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1016/10000 episodes, total num timesteps 203400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1017/10000 episodes, total num timesteps 203600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1018/10000 episodes, total num timesteps 203800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1019/10000 episodes, total num timesteps 204000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1020/10000 episodes, total num timesteps 204200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1021/10000 episodes, total num timesteps 204400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1022/10000 episodes, total num timesteps 204600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1023/10000 episodes, total num timesteps 204800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1024/10000 episodes, total num timesteps 205000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1025/10000 episodes, total num timesteps 205200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.09395331564927858
team_policy eval average team episode rewards of agent0: 32.5
team_policy eval idv catch total num of agent0: 6
team_policy eval team catch total num: 13
team_policy eval average step individual rewards of agent1: 0.27646269025464215
team_policy eval average team episode rewards of agent1: 32.5
team_policy eval idv catch total num of agent1: 13
team_policy eval team catch total num: 13
team_policy eval average step individual rewards of agent2: 0.19907448453509488
team_policy eval average team episode rewards of agent2: 32.5
team_policy eval idv catch total num of agent2: 10
team_policy eval team catch total num: 13
team_policy eval average step individual rewards of agent3: 0.09645806026180857
team_policy eval average team episode rewards of agent3: 32.5
team_policy eval idv catch total num of agent3: 6
team_policy eval team catch total num: 13
team_policy eval average step individual rewards of agent4: 0.3284540285373626
team_policy eval average team episode rewards of agent4: 32.5
team_policy eval idv catch total num of agent4: 15
team_policy eval team catch total num: 13
idv_policy eval average step individual rewards of agent0: 0.14173433776814431
idv_policy eval average team episode rewards of agent0: 15.0
idv_policy eval idv catch total num of agent0: 8
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent1: 0.11468452391540537
idv_policy eval average team episode rewards of agent1: 15.0
idv_policy eval idv catch total num of agent1: 7
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent2: 0.06370571425994308
idv_policy eval average team episode rewards of agent2: 15.0
idv_policy eval idv catch total num of agent2: 5
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent3: 0.031040806966765172
idv_policy eval average team episode rewards of agent3: 15.0
idv_policy eval idv catch total num of agent3: 4
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent4: -0.04954350908711901
idv_policy eval average team episode rewards of agent4: 15.0
idv_policy eval idv catch total num of agent4: 1
idv_policy eval team catch total num: 6

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1026/10000 episodes, total num timesteps 205400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1027/10000 episodes, total num timesteps 205600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1028/10000 episodes, total num timesteps 205800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1029/10000 episodes, total num timesteps 206000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1030/10000 episodes, total num timesteps 206200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1031/10000 episodes, total num timesteps 206400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1032/10000 episodes, total num timesteps 206600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1033/10000 episodes, total num timesteps 206800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1034/10000 episodes, total num timesteps 207000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1035/10000 episodes, total num timesteps 207200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1036/10000 episodes, total num timesteps 207400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1037/10000 episodes, total num timesteps 207600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1038/10000 episodes, total num timesteps 207800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1039/10000 episodes, total num timesteps 208000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1040/10000 episodes, total num timesteps 208200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1041/10000 episodes, total num timesteps 208400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1042/10000 episodes, total num timesteps 208600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1043/10000 episodes, total num timesteps 208800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1044/10000 episodes, total num timesteps 209000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1045/10000 episodes, total num timesteps 209200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1046/10000 episodes, total num timesteps 209400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1047/10000 episodes, total num timesteps 209600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1048/10000 episodes, total num timesteps 209800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1049/10000 episodes, total num timesteps 210000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1050/10000 episodes, total num timesteps 210200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.08738086628761113
team_policy eval average team episode rewards of agent0: 15.0
team_policy eval idv catch total num of agent0: 6
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent1: -0.014622244677176528
team_policy eval average team episode rewards of agent1: 15.0
team_policy eval idv catch total num of agent1: 2
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent2: 0.13240388844608902
team_policy eval average team episode rewards of agent2: 15.0
team_policy eval idv catch total num of agent2: 8
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent3: 0.13749846014146053
team_policy eval average team episode rewards of agent3: 15.0
team_policy eval idv catch total num of agent3: 8
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent4: 0.13890185008980846
team_policy eval average team episode rewards of agent4: 15.0
team_policy eval idv catch total num of agent4: 8
team_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent0: 0.18555744216875467
idv_policy eval average team episode rewards of agent0: 5.0
idv_policy eval idv catch total num of agent0: 10
idv_policy eval team catch total num: 2
idv_policy eval average step individual rewards of agent1: 0.033465439221750694
idv_policy eval average team episode rewards of agent1: 5.0
idv_policy eval idv catch total num of agent1: 4
idv_policy eval team catch total num: 2
idv_policy eval average step individual rewards of agent2: 0.06318349753209762
idv_policy eval average team episode rewards of agent2: 5.0
idv_policy eval idv catch total num of agent2: 5
idv_policy eval team catch total num: 2
idv_policy eval average step individual rewards of agent3: -0.011303182355521862
idv_policy eval average team episode rewards of agent3: 5.0
idv_policy eval idv catch total num of agent3: 2
idv_policy eval team catch total num: 2
idv_policy eval average step individual rewards of agent4: -0.02429224945852978
idv_policy eval average team episode rewards of agent4: 5.0
idv_policy eval idv catch total num of agent4: 2
idv_policy eval team catch total num: 2

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1051/10000 episodes, total num timesteps 210400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1052/10000 episodes, total num timesteps 210600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1053/10000 episodes, total num timesteps 210800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1054/10000 episodes, total num timesteps 211000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1055/10000 episodes, total num timesteps 211200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1056/10000 episodes, total num timesteps 211400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1057/10000 episodes, total num timesteps 211600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1058/10000 episodes, total num timesteps 211800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1059/10000 episodes, total num timesteps 212000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1060/10000 episodes, total num timesteps 212200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1061/10000 episodes, total num timesteps 212400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1062/10000 episodes, total num timesteps 212600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1063/10000 episodes, total num timesteps 212800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1064/10000 episodes, total num timesteps 213000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1065/10000 episodes, total num timesteps 213200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1066/10000 episodes, total num timesteps 213400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1067/10000 episodes, total num timesteps 213600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1068/10000 episodes, total num timesteps 213800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1069/10000 episodes, total num timesteps 214000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1070/10000 episodes, total num timesteps 214200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1071/10000 episodes, total num timesteps 214400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1072/10000 episodes, total num timesteps 214600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1073/10000 episodes, total num timesteps 214800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1074/10000 episodes, total num timesteps 215000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1075/10000 episodes, total num timesteps 215200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.09783711270914354
team_policy eval average team episode rewards of agent0: 12.5
team_policy eval idv catch total num of agent0: 6
team_policy eval team catch total num: 5
team_policy eval average step individual rewards of agent1: 0.10236490804430166
team_policy eval average team episode rewards of agent1: 12.5
team_policy eval idv catch total num of agent1: 6
team_policy eval team catch total num: 5
team_policy eval average step individual rewards of agent2: 0.06730020253174523
team_policy eval average team episode rewards of agent2: 12.5
team_policy eval idv catch total num of agent2: 5
team_policy eval team catch total num: 5
team_policy eval average step individual rewards of agent3: 0.0995827826694326
team_policy eval average team episode rewards of agent3: 12.5
team_policy eval idv catch total num of agent3: 6
team_policy eval team catch total num: 5
team_policy eval average step individual rewards of agent4: -0.03129380657544534
team_policy eval average team episode rewards of agent4: 12.5
team_policy eval idv catch total num of agent4: 1
team_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent0: 0.02218466939611323
idv_policy eval average team episode rewards of agent0: 15.0
idv_policy eval idv catch total num of agent0: 3
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent1: 0.039503435449731825
idv_policy eval average team episode rewards of agent1: 15.0
idv_policy eval idv catch total num of agent1: 4
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent2: 0.11529806414414118
idv_policy eval average team episode rewards of agent2: 15.0
idv_policy eval idv catch total num of agent2: 7
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent3: 0.11490779338638052
idv_policy eval average team episode rewards of agent3: 15.0
idv_policy eval idv catch total num of agent3: 7
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent4: 0.08543239961245039
idv_policy eval average team episode rewards of agent4: 15.0
idv_policy eval idv catch total num of agent4: 6
idv_policy eval team catch total num: 6

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1076/10000 episodes, total num timesteps 215400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1077/10000 episodes, total num timesteps 215600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1078/10000 episodes, total num timesteps 215800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1079/10000 episodes, total num timesteps 216000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1080/10000 episodes, total num timesteps 216200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1081/10000 episodes, total num timesteps 216400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1082/10000 episodes, total num timesteps 216600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1083/10000 episodes, total num timesteps 216800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1084/10000 episodes, total num timesteps 217000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1085/10000 episodes, total num timesteps 217200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1086/10000 episodes, total num timesteps 217400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1087/10000 episodes, total num timesteps 217600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1088/10000 episodes, total num timesteps 217800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1089/10000 episodes, total num timesteps 218000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1090/10000 episodes, total num timesteps 218200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1091/10000 episodes, total num timesteps 218400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1092/10000 episodes, total num timesteps 218600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1093/10000 episodes, total num timesteps 218800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1094/10000 episodes, total num timesteps 219000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1095/10000 episodes, total num timesteps 219200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1096/10000 episodes, total num timesteps 219400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1097/10000 episodes, total num timesteps 219600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1098/10000 episodes, total num timesteps 219800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1099/10000 episodes, total num timesteps 220000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1100/10000 episodes, total num timesteps 220200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.08572526808956626
team_policy eval average team episode rewards of agent0: 12.5
team_policy eval idv catch total num of agent0: 6
team_policy eval team catch total num: 5
team_policy eval average step individual rewards of agent1: -0.038642883765010946
team_policy eval average team episode rewards of agent1: 12.5
team_policy eval idv catch total num of agent1: 1
team_policy eval team catch total num: 5
team_policy eval average step individual rewards of agent2: 0.007516657007255721
team_policy eval average team episode rewards of agent2: 12.5
team_policy eval idv catch total num of agent2: 3
team_policy eval team catch total num: 5
team_policy eval average step individual rewards of agent3: 0.03448088458614885
team_policy eval average team episode rewards of agent3: 12.5
team_policy eval idv catch total num of agent3: 4
team_policy eval team catch total num: 5
team_policy eval average step individual rewards of agent4: 0.08617998760974874
team_policy eval average team episode rewards of agent4: 12.5
team_policy eval idv catch total num of agent4: 6
team_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent0: 0.09918245896231422
idv_policy eval average team episode rewards of agent0: 12.5
idv_policy eval idv catch total num of agent0: 6
idv_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent1: 0.043846405228601774
idv_policy eval average team episode rewards of agent1: 12.5
idv_policy eval idv catch total num of agent1: 4
idv_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent2: 0.02056733664081251
idv_policy eval average team episode rewards of agent2: 12.5
idv_policy eval idv catch total num of agent2: 3
idv_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent3: 0.1718193048726379
idv_policy eval average team episode rewards of agent3: 12.5
idv_policy eval idv catch total num of agent3: 9
idv_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent4: 0.011870118042419567
idv_policy eval average team episode rewards of agent4: 12.5
idv_policy eval idv catch total num of agent4: 3
idv_policy eval team catch total num: 5

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1101/10000 episodes, total num timesteps 220400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1102/10000 episodes, total num timesteps 220600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1103/10000 episodes, total num timesteps 220800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1104/10000 episodes, total num timesteps 221000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1105/10000 episodes, total num timesteps 221200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1106/10000 episodes, total num timesteps 221400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1107/10000 episodes, total num timesteps 221600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1108/10000 episodes, total num timesteps 221800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1109/10000 episodes, total num timesteps 222000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1110/10000 episodes, total num timesteps 222200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1111/10000 episodes, total num timesteps 222400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1112/10000 episodes, total num timesteps 222600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1113/10000 episodes, total num timesteps 222800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1114/10000 episodes, total num timesteps 223000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1115/10000 episodes, total num timesteps 223200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1116/10000 episodes, total num timesteps 223400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1117/10000 episodes, total num timesteps 223600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1118/10000 episodes, total num timesteps 223800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1119/10000 episodes, total num timesteps 224000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1120/10000 episodes, total num timesteps 224200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1121/10000 episodes, total num timesteps 224400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1122/10000 episodes, total num timesteps 224600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1123/10000 episodes, total num timesteps 224800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1124/10000 episodes, total num timesteps 225000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1125/10000 episodes, total num timesteps 225200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.24687289716284905
team_policy eval average team episode rewards of agent0: 40.0
team_policy eval idv catch total num of agent0: 12
team_policy eval team catch total num: 16
team_policy eval average step individual rewards of agent1: 0.24162278459319161
team_policy eval average team episode rewards of agent1: 40.0
team_policy eval idv catch total num of agent1: 12
team_policy eval team catch total num: 16
team_policy eval average step individual rewards of agent2: 0.19138027841792055
team_policy eval average team episode rewards of agent2: 40.0
team_policy eval idv catch total num of agent2: 10
team_policy eval team catch total num: 16
team_policy eval average step individual rewards of agent3: 0.16334600436923397
team_policy eval average team episode rewards of agent3: 40.0
team_policy eval idv catch total num of agent3: 9
team_policy eval team catch total num: 16
team_policy eval average step individual rewards of agent4: 0.19974649680805143
team_policy eval average team episode rewards of agent4: 40.0
team_policy eval idv catch total num of agent4: 10
team_policy eval team catch total num: 16
idv_policy eval average step individual rewards of agent0: 0.013072976898822248
idv_policy eval average team episode rewards of agent0: 12.5
idv_policy eval idv catch total num of agent0: 3
idv_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent1: 0.03337759048600452
idv_policy eval average team episode rewards of agent1: 12.5
idv_policy eval idv catch total num of agent1: 4
idv_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent2: 0.011705140950142256
idv_policy eval average team episode rewards of agent2: 12.5
idv_policy eval idv catch total num of agent2: 3
idv_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent3: 0.06328709098432513
idv_policy eval average team episode rewards of agent3: 12.5
idv_policy eval idv catch total num of agent3: 5
idv_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent4: 0.061091722529102596
idv_policy eval average team episode rewards of agent4: 12.5
idv_policy eval idv catch total num of agent4: 5
idv_policy eval team catch total num: 5
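The eval blocks above follow a fixed per-agent line format. As a hypothetical post-hoc helper (not part of the training code), the per-agent metrics can be parsed out of such a block like this; the regex below covers only the `... of agentN:` lines, not the unkeyed `team catch total num` lines:

```python
import re

# Matches lines like:
#   team_policy eval average step individual rewards of agent0: 0.2468...
#   idv_policy eval idv catch total num of agent3: 5
LINE_RE = re.compile(
    r"(?P<policy>team_policy|idv_policy) eval "
    r"(?P<metric>average step individual rewards"
    r"|average team episode rewards"
    r"|idv catch total num)"
    r" of agent(?P<agent>\d+): (?P<value>-?[\d.]+)"
)

def parse_eval(lines):
    """Collect per-agent eval metrics into {policy: {agent: {metric: value}}}."""
    metrics = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # progress lines, blank lines, team-total lines
        policy = metrics.setdefault(m["policy"], {})
        agent = policy.setdefault(int(m["agent"]), {})
        agent[m["metric"]] = float(m["value"])
    return metrics
```

This is a sketch under the assumption that the line format stays exactly as printed in this run; a format change in the logger would require updating `LINE_RE`.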

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1126/10000 episodes, total num timesteps 225400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1127/10000 episodes, total num timesteps 225600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1128/10000 episodes, total num timesteps 225800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1129/10000 episodes, total num timesteps 226000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1130/10000 episodes, total num timesteps 226200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1131/10000 episodes, total num timesteps 226400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1132/10000 episodes, total num timesteps 226600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1133/10000 episodes, total num timesteps 226800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1134/10000 episodes, total num timesteps 227000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1135/10000 episodes, total num timesteps 227200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1136/10000 episodes, total num timesteps 227400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1137/10000 episodes, total num timesteps 227600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1138/10000 episodes, total num timesteps 227800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1139/10000 episodes, total num timesteps 228000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1140/10000 episodes, total num timesteps 228200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1141/10000 episodes, total num timesteps 228400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1142/10000 episodes, total num timesteps 228600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1143/10000 episodes, total num timesteps 228800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1144/10000 episodes, total num timesteps 229000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1145/10000 episodes, total num timesteps 229200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1146/10000 episodes, total num timesteps 229400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1147/10000 episodes, total num timesteps 229600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1148/10000 episodes, total num timesteps 229800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1149/10000 episodes, total num timesteps 230000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1150/10000 episodes, total num timesteps 230200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.08752650631943172
team_policy eval average team episode rewards of agent0: 15.0
team_policy eval idv catch total num of agent0: 6
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent1: 0.1117209511369328
team_policy eval average team episode rewards of agent1: 15.0
team_policy eval idv catch total num of agent1: 7
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent2: 0.08421487352933016
team_policy eval average team episode rewards of agent2: 15.0
team_policy eval idv catch total num of agent2: 6
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent3: 0.05998891381299349
team_policy eval average team episode rewards of agent3: 15.0
team_policy eval idv catch total num of agent3: 5
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent4: 0.16161266468907207
team_policy eval average team episode rewards of agent4: 15.0
team_policy eval idv catch total num of agent4: 9
team_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent0: 0.03713346881247937
idv_policy eval average team episode rewards of agent0: 15.0
idv_policy eval idv catch total num of agent0: 4
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent1: 0.033398440332020204
idv_policy eval average team episode rewards of agent1: 15.0
idv_policy eval idv catch total num of agent1: 4
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent2: 0.007346000748081236
idv_policy eval average team episode rewards of agent2: 15.0
idv_policy eval idv catch total num of agent2: 3
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent3: 0.08143344949014276
idv_policy eval average team episode rewards of agent3: 15.0
idv_policy eval idv catch total num of agent3: 6
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent4: 0.08508897597860822
idv_policy eval average team episode rewards of agent4: 15.0
idv_policy eval idv catch total num of agent4: 6
idv_policy eval team catch total num: 6

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1151/10000 episodes, total num timesteps 230400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1152/10000 episodes, total num timesteps 230600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1153/10000 episodes, total num timesteps 230800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1154/10000 episodes, total num timesteps 231000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1155/10000 episodes, total num timesteps 231200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1156/10000 episodes, total num timesteps 231400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1157/10000 episodes, total num timesteps 231600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1158/10000 episodes, total num timesteps 231800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1159/10000 episodes, total num timesteps 232000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1160/10000 episodes, total num timesteps 232200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1161/10000 episodes, total num timesteps 232400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1162/10000 episodes, total num timesteps 232600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1163/10000 episodes, total num timesteps 232800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1164/10000 episodes, total num timesteps 233000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1165/10000 episodes, total num timesteps 233200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1166/10000 episodes, total num timesteps 233400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1167/10000 episodes, total num timesteps 233600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1168/10000 episodes, total num timesteps 233800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1169/10000 episodes, total num timesteps 234000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1170/10000 episodes, total num timesteps 234200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1171/10000 episodes, total num timesteps 234400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1172/10000 episodes, total num timesteps 234600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1173/10000 episodes, total num timesteps 234800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1174/10000 episodes, total num timesteps 235000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1175/10000 episodes, total num timesteps 235200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.04239189825978216
team_policy eval average team episode rewards of agent0: 15.0
team_policy eval idv catch total num of agent0: 4
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent1: 0.06576039432992432
team_policy eval average team episode rewards of agent1: 15.0
team_policy eval idv catch total num of agent1: 5
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent2: -0.008803502295959728
team_policy eval average team episode rewards of agent2: 15.0
team_policy eval idv catch total num of agent2: 2
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent3: 0.1463286091960539
team_policy eval average team episode rewards of agent3: 15.0
team_policy eval idv catch total num of agent3: 8
team_policy eval team catch total num: 6
team_policy eval average step individual rewards of agent4: 0.08676934411189727
team_policy eval average team episode rewards of agent4: 15.0
team_policy eval idv catch total num of agent4: 6
team_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent0: 0.05867382191228193
idv_policy eval average team episode rewards of agent0: 15.0
idv_policy eval idv catch total num of agent0: 5
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent1: 0.18515613404743048
idv_policy eval average team episode rewards of agent1: 15.0
idv_policy eval idv catch total num of agent1: 10
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent2: 0.0334822919744262
idv_policy eval average team episode rewards of agent2: 15.0
idv_policy eval idv catch total num of agent2: 4
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent3: 0.08051645339972525
idv_policy eval average team episode rewards of agent3: 15.0
idv_policy eval idv catch total num of agent3: 6
idv_policy eval team catch total num: 6
idv_policy eval average step individual rewards of agent4: 0.0033634774633090813
idv_policy eval average team episode rewards of agent4: 15.0
idv_policy eval idv catch total num of agent4: 3
idv_policy eval team catch total num: 6

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1176/10000 episodes, total num timesteps 235400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1177/10000 episodes, total num timesteps 235600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1178/10000 episodes, total num timesteps 235800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1179/10000 episodes, total num timesteps 236000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1180/10000 episodes, total num timesteps 236200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1181/10000 episodes, total num timesteps 236400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1182/10000 episodes, total num timesteps 236600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1183/10000 episodes, total num timesteps 236800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1184/10000 episodes, total num timesteps 237000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1185/10000 episodes, total num timesteps 237200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1186/10000 episodes, total num timesteps 237400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1187/10000 episodes, total num timesteps 237600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1188/10000 episodes, total num timesteps 237800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1189/10000 episodes, total num timesteps 238000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1190/10000 episodes, total num timesteps 238200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1191/10000 episodes, total num timesteps 238400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1192/10000 episodes, total num timesteps 238600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1193/10000 episodes, total num timesteps 238800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1194/10000 episodes, total num timesteps 239000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1195/10000 episodes, total num timesteps 239200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1196/10000 episodes, total num timesteps 239400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1197/10000 episodes, total num timesteps 239600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1198/10000 episodes, total num timesteps 239800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1199/10000 episodes, total num timesteps 240000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1200/10000 episodes, total num timesteps 240200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.053111674873247285
team_policy eval average team episode rewards of agent0: 12.5
team_policy eval idv catch total num of agent0: 5
team_policy eval team catch total num: 5
team_policy eval average step individual rewards of agent1: 0.008428105190456137
team_policy eval average team episode rewards of agent1: 12.5
team_policy eval idv catch total num of agent1: 3
team_policy eval team catch total num: 5
team_policy eval average step individual rewards of agent2: 0.08941294488613317
team_policy eval average team episode rewards of agent2: 12.5
team_policy eval idv catch total num of agent2: 6
team_policy eval team catch total num: 5
team_policy eval average step individual rewards of agent3: 0.03488025776017006
team_policy eval average team episode rewards of agent3: 12.5
team_policy eval idv catch total num of agent3: 4
team_policy eval team catch total num: 5
team_policy eval average step individual rewards of agent4: 0.03766311447728403
team_policy eval average team episode rewards of agent4: 12.5
team_policy eval idv catch total num of agent4: 4
team_policy eval team catch total num: 5
idv_policy eval average step individual rewards of agent0: 0.36727412682319666
idv_policy eval average team episode rewards of agent0: 27.5
idv_policy eval idv catch total num of agent0: 17
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent1: 0.034612190102940005
idv_policy eval average team episode rewards of agent1: 27.5
idv_policy eval idv catch total num of agent1: 4
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent2: 0.060139859559947446
idv_policy eval average team episode rewards of agent2: 27.5
idv_policy eval idv catch total num of agent2: 5
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent3: 0.11354287171566341
idv_policy eval average team episode rewards of agent3: 27.5
idv_policy eval idv catch total num of agent3: 7
idv_policy eval team catch total num: 11
idv_policy eval average step individual rewards of agent4: 0.14165829663228136
idv_policy eval average team episode rewards of agent4: 27.5
idv_policy eval idv catch total num of agent4: 8
idv_policy eval team catch total num: 11

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1201/10000 episodes, total num timesteps 240400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1202/10000 episodes, total num timesteps 240600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1203/10000 episodes, total num timesteps 240800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1204/10000 episodes, total num timesteps 241000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1205/10000 episodes, total num timesteps 241200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1206/10000 episodes, total num timesteps 241400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1207/10000 episodes, total num timesteps 241600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1208/10000 episodes, total num timesteps 241800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1209/10000 episodes, total num timesteps 242000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1210/10000 episodes, total num timesteps 242200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1211/10000 episodes, total num timesteps 242400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1212/10000 episodes, total num timesteps 242600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1213/10000 episodes, total num timesteps 242800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1214/10000 episodes, total num timesteps 243000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1215/10000 episodes, total num timesteps 243200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1216/10000 episodes, total num timesteps 243400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1217/10000 episodes, total num timesteps 243600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1218/10000 episodes, total num timesteps 243800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1219/10000 episodes, total num timesteps 244000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1220/10000 episodes, total num timesteps 244200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1221/10000 episodes, total num timesteps 244400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1222/10000 episodes, total num timesteps 244600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1223/10000 episodes, total num timesteps 244800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1224/10000 episodes, total num timesteps 245000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1225/10000 episodes, total num timesteps 245200/2000000, FPS 178.

team_policy eval average step individual rewards of agent0: 0.2614050352487127
team_policy eval average team episode rewards of agent0: 22.5
team_policy eval idv catch total num of agent0: 13
team_policy eval team catch total num: 9
team_policy eval average step individual rewards of agent1: 0.19307757799854713
team_policy eval average team episode rewards of agent1: 22.5
team_policy eval idv catch total num of agent1: 10
team_policy eval team catch total num: 9
team_policy eval average step individual rewards of agent2: 0.063835789631744
team_policy eval average team episode rewards of agent2: 22.5
team_policy eval idv catch total num of agent2: 5
team_policy eval team catch total num: 9
team_policy eval average step individual rewards of agent3: 0.2372494163532314
team_policy eval average team episode rewards of agent3: 22.5
team_policy eval idv catch total num of agent3: 12
team_policy eval team catch total num: 9
team_policy eval average step individual rewards of agent4: 0.16846503430096454
team_policy eval average team episode rewards of agent4: 22.5
team_policy eval idv catch total num of agent4: 9
team_policy eval team catch total num: 9
idv_policy eval average step individual rewards of agent0: 0.03689142732355135
idv_policy eval average team episode rewards of agent0: 7.5
idv_policy eval idv catch total num of agent0: 4
idv_policy eval team catch total num: 3
idv_policy eval average step individual rewards of agent1: -0.003357337613077609
idv_policy eval average team episode rewards of agent1: 7.5
idv_policy eval idv catch total num of agent1: 3
idv_policy eval team catch total num: 3
idv_policy eval average step individual rewards of agent2: 0.13144751418860617
idv_policy eval average team episode rewards of agent2: 7.5
idv_policy eval idv catch total num of agent2: 8
idv_policy eval team catch total num: 3
idv_policy eval average step individual rewards of agent3: 0.03224260987560402
idv_policy eval average team episode rewards of agent3: 7.5
idv_policy eval idv catch total num of agent3: 4
idv_policy eval team catch total num: 3
idv_policy eval average step individual rewards of agent4: -0.07188192528787786
idv_policy eval average team episode rewards of agent4: 7.5
idv_policy eval idv catch total num of agent4: 0
idv_policy eval team catch total num: 3

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1226/10000 episodes, total num timesteps 245400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1227/10000 episodes, total num timesteps 245600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1228/10000 episodes, total num timesteps 245800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1229/10000 episodes, total num timesteps 246000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1230/10000 episodes, total num timesteps 246200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1231/10000 episodes, total num timesteps 246400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1232/10000 episodes, total num timesteps 246600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1233/10000 episodes, total num timesteps 246800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1234/10000 episodes, total num timesteps 247000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1235/10000 episodes, total num timesteps 247200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1236/10000 episodes, total num timesteps 247400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1237/10000 episodes, total num timesteps 247600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1238/10000 episodes, total num timesteps 247800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1239/10000 episodes, total num timesteps 248000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1240/10000 episodes, total num timesteps 248200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1241/10000 episodes, total num timesteps 248400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1242/10000 episodes, total num timesteps 248600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1243/10000 episodes, total num timesteps 248800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1244/10000 episodes, total num timesteps 249000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1245/10000 episodes, total num timesteps 249200/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1246/10000 episodes, total num timesteps 249400/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1247/10000 episodes, total num timesteps 249600/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1248/10000 episodes, total num timesteps 249800/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1249/10000 episodes, total num timesteps 250000/2000000, FPS 178.


 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1250/10000 episodes, total num timesteps 250200/2000000, FPS 178.

team_policy eval (avg team episode reward 15.0, team catch total num 6)
  agent0: avg step idv reward  0.15564178786413793, idv catch total 8
  agent1: avg step idv reward  0.06896884402867119, idv catch total 5
  agent2: avg step idv reward  0.12269899759175573, idv catch total 7
  agent3: avg step idv reward  0.14699126735822737, idv catch total 8
  agent4: avg step idv reward  0.054943886060423594, idv catch total 4
idv_policy eval (avg team episode reward 12.5, team catch total num 5)
  agent0: avg step idv reward  0.14922008553919663, idv catch total 8
  agent1: avg step idv reward  0.14662754485015012, idv catch total 8
  agent2: avg step idv reward  0.09909047973976101, idv catch total 6
  agent3: avg step idv reward  0.08833777714210606, idv catch total 6
  agent4: avg step idv reward  0.090050013317902, idv catch total 6
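The eval lines above follow a fixed `<policy> eval <metric>: <value>` pattern, so they can be collected programmatically instead of read by eye. A minimal parsing sketch (the function name and dict layout are my own, not part of the training script):

```python
from collections import defaultdict


def parse_eval_block(lines):
    """Parse eval log lines into {policy: {metric: value}}.

    Assumes each line looks like
    'team_policy eval average step individual rewards of agent0: 0.155';
    lines without the ' eval ' marker are skipped.
    """
    stats = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if " eval " not in line or ": " not in line:
            continue
        head, value = line.rsplit(": ", 1)
        policy, metric = head.split(" eval ", 1)
        stats[policy][metric] = float(value)
    return dict(stats)


sample = [
    "team_policy eval average step individual rewards of agent0: 0.155",
    "team_policy eval team catch total num: 6",
    "idv_policy eval idv catch total num of agent0: 8",
]
print(parse_eval_block(sample)["team_policy"]["team catch total num"])  # 6.0
```

Feeding the raw log through this and plotting `team catch total num` per eval point makes the oscillation between eval blocks (6 → 11 → 1 here) much easier to track than scanning the text.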

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1251-1275/10000 episodes, total num timesteps 250400-255200/2000000 (+200 per update), FPS 178.

team_policy eval (avg team episode reward 27.5, team catch total num 11)
  agent0: avg step idv reward  0.24558314865587128, idv catch total 12
  agent1: avg step idv reward  0.17608651721416238, idv catch total 9
  agent2: avg step idv reward  0.24609137209028467, idv catch total 12
  agent3: avg step idv reward  0.14340998261702637, idv catch total 8
  agent4: avg step idv reward  0.2230500265750431, idv catch total 11
idv_policy eval (avg team episode reward 15.0, team catch total num 6)
  agent0: avg step idv reward -0.04126260489588097, idv catch total 1
  agent1: avg step idv reward  0.060637806309814676, idv catch total 5
  agent2: avg step idv reward  0.1388211598125679, idv catch total 8
  agent3: avg step idv reward  0.17061851732337124, idv catch total 9
  agent4: avg step idv reward  0.06767698454357643, idv catch total 5

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1276-1300/10000 episodes, total num timesteps 255400-260200/2000000 (+200 per update), FPS 178.

team_policy eval (avg team episode reward 2.5, team catch total num 1)
  agent0: avg step idv reward -0.006712266607537103, idv catch total 2
  agent1: avg step idv reward -0.011003358422339878, idv catch total 2
  agent2: avg step idv reward -0.062277971035944776, idv catch total 0
  agent3: avg step idv reward  0.007441116331547897, idv catch total 3
  agent4: avg step idv reward  0.03518702723346283, idv catch total 4
idv_policy eval (avg team episode reward 5.0, team catch total num 2)
  agent0: avg step idv reward -0.0250963139467841, idv catch total 2
  agent1: avg step idv reward -0.018135594417590408, idv catch total 2
  agent2: avg step idv reward -0.022566518805439795, idv catch total 2
  agent3: avg step idv reward  0.04263683159916022, idv catch total 4
  agent4: avg step idv reward  0.04150653770192681, idv catch total 4
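The progress lines report cumulative timesteps against the 2,000,000-step budget together with a steady FPS, so remaining wall-clock time follows from simple arithmetic. A small sketch (assuming FPS stays roughly constant, which the unbroken run of 178 in this log supports):

```python
def eta_seconds(current_steps, total_steps, fps):
    """Estimated wall-clock seconds remaining at a constant FPS."""
    return (total_steps - current_steps) / fps


# At update 1300 the log shows 260200/2000000 timesteps at 178 FPS.
remaining = eta_seconds(260200, 2_000_000, 178)
print(f"{remaining / 3600:.1f} h remaining")  # prints "2.7 h remaining"
```

The same numbers can sanity-check the log's own counters: 200 timesteps per update times the remaining 8700 updates equals the remaining 1,739,800 timesteps, so the two progress counters are consistent.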

 Scenario simple_tag_tr Algo rmappotrsyn Exp exp_train_continue_tag_base_kltcp_s2r2_v1 updates 1301-1304/10000 episodes, total num timesteps 260400-261000/2000000 (+200 per update), FPS 178.

