/<ANONYMIZED PATH>/python3.12/site-packages/gymnasium/envs/registration.py:519: DeprecationWarning: WARN: The environment Ant-v4 is out of date. You should consider upgrading to version `v5`.
  logger.deprecation(
=== specification ====================================================
+: rlrd.training:Training
epochs: 10
rounds: 50
steps: 2000
stats_window: 10000
seed: 0
tag: ''
Env:
   +: rlrd.envs:RandomDelayEnv
   seed_val: 0
   id: Ant-v4
   frame_skip: 0
   min_observation_delay: 0
   sup_observation_delay: 1
   min_action_delay: 0
   sup_action_delay: 1
   real_world_sampler: 5
   action_noise: 0.05
Test:
   +: rlrd.testing:Test
   workers: 1
   number: 1
   device: cpu
Agent:
   +: rlrd.dcac:Agent
   batchsize: 128
   memory_size: 1000000
   lr: 0.0003
   discount: 0.99
   target_update: 0.005
   reward_scale: 5.0
   entropy_scale: 1.0
   start_training: 10000
   device: cpu
   training_steps: 1.0
   loss_alpha: 0.2
   rtac: false
   Model:
      +: rlrd.dcac_models:Mlp
      hidden_units: 256
      num_critics: 2
      act_delay: true
      obs_delay: true
   OutputNorm:
      +: rlrd.nn:PopArt
      beta: 0.0003
      zero_debias: true
      start_pop: 8
__format_version__: '3'
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>

<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
=== epoch 1/10 ===== round 1/50 ======================================
100%|██████████| 2000/2000 [00:02<00:00, 951.37it/s]
episodes                                   23
episode_length                      85.565217
returns                            -52.873616
return_std                         110.527961
average_reward                      -0.614997
round_time             0 days 00:00:02.182624
episodes_test                            10.0
episode_length_test                    1000.0
returns_test                       948.425858
return_std_test                     16.674578
average_reward_test                  0.948426
round_time_test        0 days 00:00:10.746273
round_time_total       0 days 00:00:12.587188 

=== epoch 1/10 ===== round 2/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [00:02<00:00, 957.44it/s]
episodes                                   37
episode_length                           94.0
returns                            -54.137356
return_std                         126.159917
average_reward                         -0.564
round_time             0 days 00:00:02.594355
episodes_test                            10.0
episode_length_test                    1000.0
returns_test                       953.306929
return_std_test                      8.679136
average_reward_test                  0.953307
round_time_test        0 days 00:00:10.949674
round_time_total       0 days 00:00:12.752097 

=== epoch 1/10 ===== round 3/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [00:02<00:00, 993.24it/s] 
episodes                                   62
episode_length                      90.467742
returns                            -50.064879
return_std                         117.499434
average_reward                      -0.561748
round_time             0 days 00:00:02.564841
episodes_test                            10.0
episode_length_test                    1000.0
returns_test                       954.476906
return_std_test                     15.848591
average_reward_test                  0.954477
round_time_test        0 days 00:00:10.605419
round_time_total       0 days 00:00:12.408922 

=== epoch 1/10 ===== round 4/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [00:02<00:00, 975.22it/s]
episodes                                   70
episode_length                     112.728571
returns                            -64.438473
return_std                         146.222372
average_reward                       -0.57673
round_time             0 days 00:00:02.588095
episodes_test                            10.0
episode_length_test                    1000.0
returns_test                       942.574998
return_std_test                     22.386029
average_reward_test                  0.942575
round_time_test        0 days 00:00:10.560403
round_time_total       0 days 00:00:12.406864 

=== epoch 1/10 ===== round 5/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [00:02<00:00, 911.68it/s]
episodes                                   74
episode_length                     134.256757
returns                            -77.917142
return_std                         166.786807
average_reward                      -0.578498
round_time             0 days 00:00:02.752154
episodes_test                            10.0
episode_length_test                    1000.0
returns_test                       959.226716
return_std_test                      13.53612
average_reward_test                  0.959227
round_time_test        0 days 00:00:10.749544
round_time_total       0 days 00:00:12.517221 

=== epoch 1/10 ===== round 6/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
/<ANONYMIZED PATH>/rmst-rlrd/rlrd/nn.py:41: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  assert b.storage().data_ptr() == a.storage().data_ptr()
100%|██████████| 2000/2000 [05:53<00:00,  5.66it/s]
starting training
episodes                                   65
episode_length                     148.907692
returns                            -81.636913
return_std                         175.016313
average_reward                       -0.55048
round_time             0 days 00:05:53.990684
episodes_test                            10.0
episode_length_test                    1000.0
returns_test                       945.466758
return_std_test                     16.811375
average_reward_test                  0.945467
round_time_test        0 days 00:00:10.800411
round_time_total       0 days 00:05:53.991824
loss_total                         399.876556
loss_critic                        517.175858
loss_actor                         -69.320692
memory_size                         9585.5155 

=== epoch 1/10 ===== round 7/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [05:59<00:00,  5.57it/s]
episodes                                   75
episode_length                     126.746667
returns                            -71.924966
return_std                         154.349483
average_reward                      -0.569573
round_time             0 days 00:05:59.712341
episodes_test                            10.0
episode_length_test                    1000.0
returns_test                       563.199252
return_std_test                      45.81302
average_reward_test                  0.563199
round_time_test        0 days 00:00:10.883040
round_time_total       0 days 00:05:59.713428
loss_total                         417.165312
loss_critic                        557.030449
loss_actor                        -142.295273
memory_size                         11378.189 

=== epoch 1/10 ===== round 8/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:01<00:00,  5.53it/s]
episodes                                   57
episode_length                     152.175439
returns                            -90.626419
return_std                         179.036194
average_reward                       -0.59697
round_time             0 days 00:06:02.510498
episodes_test                            15.0
episode_length_test                653.066667
returns_test                       335.062219
return_std_test                    201.460776
average_reward_test                    0.5081
round_time_test        0 days 00:00:10.611335
round_time_total       0 days 00:06:02.511682
loss_total                         412.946167
loss_critic                        566.246859
loss_actor                        -200.256638
memory_size                        13036.5335 

=== epoch 1/10 ===== round 9/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:03<00:00,  5.51it/s]
episodes                                   63
episode_length                     129.285714
returns                            -77.410733
return_std                         162.431227
average_reward                      -0.602875
round_time             0 days 00:06:03.661229
episodes_test                            16.0
episode_length_test                  595.8125
returns_test                        315.98125
return_std_test                    242.601219
average_reward_test                  0.531322
round_time_test        0 days 00:00:10.551590
round_time_total       0 days 00:06:03.662339
loss_total                         416.991325
loss_critic                         580.15267
loss_actor                        -235.654097
memory_size                         14834.443 

=== epoch 1/10 ===== round 10/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:06<00:00,  5.45it/s]
episodes                                   80
episode_length                        124.225
returns                            -78.160046
return_std                          166.59773
average_reward                      -0.630717
round_time             0 days 00:06:07.440534
episodes_test                            28.0
episode_length_test                335.714286
returns_test                       167.336343
return_std_test                    199.061493
average_reward_test                    0.5016
round_time_test        0 days 00:00:10.473003
round_time_total       0 days 00:06:07.441657
loss_total                         457.905815
loss_critic                        634.503777
loss_actor                        -248.486075
memory_size                        16568.2535 

=== epoch 1/10 ===== round 11/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:10<00:00,  5.40it/s]
episodes                                   87
episode_length                     101.816092
returns                            -67.698871
return_std                          147.54806
average_reward                      -0.655228
round_time             0 days 00:06:10.676920
episodes_test                            18.0
episode_length_test                541.388889
returns_test                       270.020646
return_std_test                    251.376601
average_reward_test                  0.496564
round_time_test        0 days 00:00:10.714981
round_time_total       0 days 00:06:10.678020
loss_total                         482.345945
loss_critic                        665.160121
loss_actor                        -248.910807
memory_size                        18226.9635 

=== epoch 1/10 ===== round 12/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:08<00:00,  5.42it/s]
episodes                                   83
episode_length                     111.253012
returns                            -75.148644
return_std                         163.569206
average_reward                      -0.676508
round_time             0 days 00:06:09.200880
episodes_test                            27.0
episode_length_test                350.888889
returns_test                       182.150948
return_std_test                    252.313075
average_reward_test                  0.524155
round_time_test        0 days 00:00:10.650621
round_time_total       0 days 00:06:09.202001
loss_total                         476.574508
loss_critic                        656.896711
loss_actor                        -244.714349
memory_size                        19895.4705 

=== epoch 1/10 ===== round 13/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:10<00:00,  5.40it/s]
episodes                                  100
episode_length                          89.55
returns                            -58.483692
return_std                         136.844218
average_reward                      -0.667352
round_time             0 days 00:06:11.189168
episodes_test                            21.0
episode_length_test                458.285714
returns_test                       268.046359
return_std_test                    308.973109
average_reward_test                  0.589847
round_time_test        0 days 00:00:10.601670
round_time_total       0 days 00:06:11.190294
loss_total                         494.004635
loss_critic                        677.207014
loss_actor                         -238.80493
memory_size                        21510.7105 

=== epoch 1/10 ===== round 14/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:12<00:00,  5.36it/s]
episodes                                  117
episode_length                      84.931624
returns                            -58.187155
return_std                         130.959615
average_reward                       -0.68541
round_time             0 days 00:06:13.432234
episodes_test                            17.0
episode_length_test                537.470588
returns_test                       268.959506
return_std_test                    260.204083
average_reward_test                  0.498063
round_time_test        0 days 00:00:10.613778
round_time_total       0 days 00:06:13.433797
loss_total                         524.909939
loss_critic                        713.890817
loss_actor                        -231.013621
memory_size                        23107.5995 

=== epoch 1/10 ===== round 15/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:13<00:00,  5.35it/s]
episodes                                  120
episode_length                      83.033333
returns                            -56.364376
return_std                         128.710266
average_reward                        -0.6795
round_time             0 days 00:06:14.100822
episodes_test                            17.0
episode_length_test                556.117647
returns_test                       297.430707
return_std_test                    284.323303
average_reward_test                  0.530594
round_time_test        0 days 00:00:10.607465
round_time_total       0 days 00:06:14.101967
loss_total                         539.074741
loss_critic                        728.469281
loss_actor                        -218.503467
memory_size                         24686.221 

=== epoch 1/10 ===== round 16/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:15<00:00,  5.33it/s]
episodes                                  108
episode_length                      85.481481
returns                            -58.989592
return_std                         129.735163
average_reward                      -0.694298
round_time             0 days 00:06:15.659865
episodes_test                            25.0
episode_length_test                     393.6
returns_test                       218.899201
return_std_test                    266.725395
average_reward_test                  0.556277
round_time_test        0 days 00:00:10.572234
round_time_total       0 days 00:06:15.661011
loss_total                         497.350262
loss_critic                          673.6401
loss_actor                        -207.809139
memory_size                         26320.351 

=== epoch 1/10 ===== round 17/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:14<00:00,  5.33it/s]
episodes                                  107
episode_length                      90.981308
returns                            -60.789949
return_std                         131.257506
average_reward                      -0.669447
round_time             0 days 00:06:15.679581
episodes_test                            20.0
episode_length_test                     486.6
returns_test                        277.31041
return_std_test                    300.683398
average_reward_test                   0.56132
round_time_test        0 days 00:00:10.873603
round_time_total       0 days 00:06:15.680769
loss_total                         492.544594
loss_critic                        665.722823
loss_actor                        -200.168362
memory_size                         28170.033 

=== epoch 1/10 ===== round 18/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:17<00:00,  5.30it/s]
episodes                                  104
episode_length                      93.826923
returns                            -62.965145
return_std                         124.948236
average_reward                      -0.670952
round_time             0 days 00:06:17.681433
episodes_test                            28.0
episode_length_test                340.392857
returns_test                       167.647507
return_std_test                    239.573262
average_reward_test                  0.496894
round_time_test        0 days 00:00:10.807572
round_time_total       0 days 00:06:17.682560
loss_total                         515.846522
loss_critic                        693.046473
loss_actor                        -192.953327
memory_size                         29880.494 

=== epoch 1/10 ===== round 19/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:16<00:00,  5.31it/s]
episodes                                   86
episode_length                     111.802326
returns                            -72.382436
return_std                         148.505819
average_reward                      -0.644411
round_time             0 days 00:06:17.565972
episodes_test                            24.0
episode_length_test                    403.25
returns_test                       242.729862
return_std_test                    321.144282
average_reward_test                  0.597226
round_time_test        0 days 00:00:10.767452
round_time_total       0 days 00:06:17.567309
loss_total                         531.172245
loss_critic                        710.685028
loss_actor                         -186.87894
memory_size                         31472.787 

=== epoch 1/10 ===== round 20/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:20<00:00,  5.26it/s]
episodes                                   71
episode_length                     139.633803
returns                             -88.22948
return_std                         169.606854
average_reward                      -0.626723
round_time             0 days 00:06:20.759108
episodes_test                            23.0
episode_length_test                427.478261
returns_test                       245.776851
return_std_test                    310.279453
average_reward_test                  0.565183
round_time_test        0 days 00:00:10.795344
round_time_total       0 days 00:06:20.760210
loss_total                         486.185165
loss_critic                        653.555354
loss_actor                        -183.295634
memory_size                         33333.661 

=== epoch 1/10 ===== round 21/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:21<00:00,  5.24it/s]
episodes                                   87
episode_length                     105.471264
returns                            -65.933061
return_std                         140.771949
average_reward                      -0.626438
round_time             0 days 00:06:21.963116
episodes_test                            21.0
episode_length_test                433.380952
returns_test                       292.355063
return_std_test                    345.010594
average_reward_test                  0.676948
round_time_test        0 days 00:00:10.837023
round_time_total       0 days 00:06:21.964226
loss_total                         508.743113
loss_critic                        681.183177
loss_actor                        -181.017195
memory_size                        34990.1175 

=== epoch 1/10 ===== round 22/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:20<00:00,  5.25it/s]
episodes                                   84
episode_length                     104.654762
returns                            -65.494491
return_std                         140.647994
average_reward                       -0.62336
round_time             0 days 00:06:21.495327
episodes_test                            26.0
episode_length_test                374.038462
returns_test                        220.28423
return_std_test                    328.143239
average_reward_test                  0.578693
round_time_test        0 days 00:00:10.668056
round_time_total       0 days 00:06:21.496632
loss_total                         522.616014
loss_critic                        697.816103
loss_actor                        -178.184394
memory_size                        36664.8915 

=== epoch 1/10 ===== round 23/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:20<00:00,  5.26it/s]
episodes                                   71
episode_length                     127.309859
returns                            -78.524533
return_std                         162.656392
average_reward                       -0.61801
round_time             0 days 00:06:21.059076
episodes_test                            25.0
episode_length_test                    370.24
returns_test                       242.664077
return_std_test                    335.012903
average_reward_test                  0.646261
round_time_test        0 days 00:00:10.687016
round_time_total       0 days 00:06:21.060247
loss_total                         491.572461
loss_critic                        658.345859
loss_actor                        -175.521174
memory_size                         38495.425 

=== epoch 1/10 ===== round 24/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:23<00:00,  5.22it/s]
episodes                                   87
episode_length                     107.218391
returns                            -65.807464
return_std                         139.277752
average_reward                      -0.612153
round_time             0 days 00:06:23.754670
episodes_test                            25.0
episode_length_test                    366.84
returns_test                       240.637895
return_std_test                    335.844951
average_reward_test                  0.666537
round_time_test        0 days 00:00:10.616912
round_time_total       0 days 00:06:23.755801
loss_total                         522.430192
loss_critic                        696.415755
loss_actor                        -173.512108
memory_size                         40189.141 

=== epoch 1/10 ===== round 25/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:23<00:00,  5.22it/s]
episodes                                   83
episode_length                     109.445783
returns                            -67.009237
return_std                         140.293622
average_reward                      -0.608324
round_time             0 days 00:06:23.894661
episodes_test                            29.0
episode_length_test                340.551724
returns_test                       192.762206
return_std_test                    320.270453
average_reward_test                  0.555807
round_time_test        0 days 00:00:10.760811
round_time_total       0 days 00:06:23.895780
loss_total                         519.572937
loss_critic                        692.112495
loss_actor                        -170.585341
memory_size                         41908.225 

=== epoch 1/10 ===== round 26/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:25<00:00,  5.19it/s]
episodes                                   61
episode_length                     151.540984
returns                             -89.82807
return_std                         172.277106
average_reward                      -0.591042
round_time             0 days 00:06:25.676509
episodes_test                            25.0
episode_length_test                    368.92
returns_test                       237.408441
return_std_test                    337.238317
average_reward_test                  0.655084
round_time_test        0 days 00:00:10.763196
round_time_total       0 days 00:06:25.677746
loss_total                         488.085639
loss_critic                          652.4388
loss_actor                         -169.32705
memory_size                        43819.8715 

=== epoch 1/10 ===== round 27/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:27<00:00,  5.16it/s]
episodes                                   64
episode_length                     143.421875
returns                            -84.572104
return_std                          167.55247
average_reward                      -0.590734
round_time             0 days 00:06:28.014597
episodes_test                            30.0
episode_length_test                326.833333
returns_test                       200.336861
return_std_test                    329.667426
average_reward_test                  0.620791
round_time_test        0 days 00:00:10.650511
round_time_total       0 days 00:06:28.015714
loss_total                         473.052169
loss_critic                         633.83496
loss_actor                        -170.079039
memory_size                         45690.662 

=== epoch 1/10 ===== round 28/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:23<00:00,  5.21it/s]
episodes                                   79
episode_length                     123.151899
returns                            -72.007851
return_std                         150.451853
average_reward                      -0.581911
round_time             0 days 00:06:24.389316
episodes_test                            29.0
episode_length_test                325.068966
returns_test                       171.024672
return_std_test                    295.013698
average_reward_test                   0.54707
round_time_test        0 days 00:00:10.683738
round_time_total       0 days 00:06:24.390433
loss_total                         487.276315
loss_critic                        651.562199
loss_actor                         -169.86727
memory_size                        47373.7735 

=== epoch 1/10 ===== round 29/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:27<00:00,  5.16it/s]
episodes                                   59
episode_length                      153.20339
returns                            -88.148275
return_std                         179.763254
average_reward                      -0.580469
round_time             0 days 00:06:28.128799
episodes_test                            31.0
episode_length_test                318.580645
returns_test                       194.957805
return_std_test                    318.486248
average_reward_test                  0.604519
round_time_test        0 days 00:00:10.943184
round_time_total       0 days 00:06:28.130076
loss_total                         487.452475
loss_critic                        651.316206
loss_actor                        -168.002501
memory_size                         49074.341 

=== epoch 1/10 ===== round 30/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:26<00:00,  5.17it/s]
episodes                                   63
episode_length                     146.984127
returns                            -86.614069
return_std                         178.917439
average_reward                      -0.589456
round_time             0 days 00:06:27.089493
episodes_test                            22.0
episode_length_test                419.409091
returns_test                       292.230948
return_std_test                    384.117532
average_reward_test                  0.702397
round_time_test        0 days 00:00:10.632835
round_time_total       0 days 00:06:27.090599
loss_total                         479.102453
loss_critic                        640.942801
loss_actor                        -168.258983
memory_size                        50908.2295 

=== epoch 1/10 ===== round 31/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:29<00:00,  5.14it/s]
episodes                                   72
episode_length                     121.972222
returns                            -73.523782
return_std                         157.615785
average_reward                      -0.596709
round_time             0 days 00:06:29.512879
episodes_test                            43.0
episode_length_test                210.906977
returns_test                       121.573124
return_std_test                    286.212288
average_reward_test                  0.596956
round_time_test        0 days 00:00:10.760556
round_time_total       0 days 00:06:29.514170
loss_total                         488.603214
loss_critic                         652.80186
loss_actor                        -168.191419
memory_size                         52678.217 

=== epoch 1/10 ===== round 32/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:27<00:00,  5.16it/s]
episodes                                   72
episode_length                         125.75
returns                            -78.361697
return_std                         157.555494
average_reward                      -0.617807
round_time             0 days 00:06:28.166399
episodes_test                            29.0
episode_length_test                319.689655
returns_test                        213.21643
return_std_test                    348.994973
average_reward_test                  0.671893
round_time_test        0 days 00:00:10.822374
round_time_total       0 days 00:06:28.167475
loss_total                         495.655377
loss_critic                        661.542679
loss_actor                        -167.893874
memory_size                         54448.717 

=== epoch 1/10 ===== round 33/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:29<00:00,  5.14it/s]
episodes                                   68
episode_length                     133.485294
returns                            -86.883634
return_std                         162.581293
average_reward                      -0.645424
round_time             0 days 00:06:29.793356
episodes_test                            29.0
episode_length_test                332.103448
returns_test                       225.056837
return_std_test                    339.147523
average_reward_test                  0.659867
round_time_test        0 days 00:00:10.763178
round_time_total       0 days 00:06:29.794483
loss_total                         535.413948
loss_critic                        710.830525
loss_actor                        -166.252409
memory_size                         56074.122 

=== epoch 1/10 ===== round 34/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:30<00:00,  5.12it/s]
episodes                                   78
episode_length                     114.269231
returns                            -73.866979
return_std                         135.922669
average_reward                      -0.645896
round_time             0 days 00:06:31.447864
episodes_test                            30.0
episode_length_test                331.266667
returns_test                       232.553333
return_std_test                    372.433606
average_reward_test                  0.698812
round_time_test        0 days 00:00:10.729355
round_time_total       0 days 00:06:31.448972
loss_total                          528.37957
loss_critic                         701.61155
loss_actor                        -164.548399
memory_size                        57863.2255 

=== epoch 1/10 ===== round 35/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:31<00:00,  5.11it/s]
episodes                                   88
episode_length                     106.590909
returns                            -69.401282
return_std                         127.758622
average_reward                      -0.643927
round_time             0 days 00:06:32.218224
episodes_test                            17.0
episode_length_test                557.470588
returns_test                       407.147432
return_std_test                    385.497331
average_reward_test                  0.736057
round_time_test        0 days 00:00:10.789759
round_time_total       0 days 00:06:32.219313
loss_total                         527.143438
loss_critic                        699.745079
loss_actor                        -163.263177
memory_size                        59529.9305 

=== epoch 1/10 ===== round 36/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:31<00:00,  5.11it/s]
episodes                                   79
episode_length                     123.240506
returns                            -77.192984
return_std                         141.446131
average_reward                      -0.621594
round_time             0 days 00:06:31.801911
episodes_test                            33.0
episode_length_test                285.969697
returns_test                       189.038226
return_std_test                    330.853384
average_reward_test                   0.66614
round_time_test        0 days 00:00:10.818117
round_time_total       0 days 00:06:31.802992
loss_total                         515.794042
loss_critic                        685.266755
loss_actor                        -162.096862
memory_size                        61304.3955 

=== epoch 1/10 ===== round 37/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:32<00:00,  5.09it/s]
episodes                                   63
episode_length                     154.666667
returns                            -90.867997
return_std                         159.112536
average_reward                      -0.592433
round_time             0 days 00:06:33.113568
episodes_test                            27.0
episode_length_test                365.851852
returns_test                       211.388808
return_std_test                    312.255528
average_reward_test                  0.582809
round_time_test        0 days 00:00:10.705370
round_time_total       0 days 00:06:33.114659
loss_total                         493.100547
loss_critic                        657.075961
loss_actor                        -162.801158
memory_size                         63235.855 

=== epoch 1/10 ===== round 38/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:36<00:00,  5.04it/s]
episodes                                   53
episode_length                     156.358491
returns                            -89.370491
return_std                         158.616994
average_reward                      -0.569295
round_time             0 days 00:06:37.461348
episodes_test                            24.0
episode_length_test                407.708333
returns_test                       288.301404
return_std_test                    373.500583
average_reward_test                  0.696979
round_time_test        0 days 00:00:10.772210
round_time_total       0 days 00:06:37.462428
loss_total                         479.688216
loss_critic                        640.882377
loss_actor                        -165.088473
memory_size                          65129.38 

=== epoch 1/10 ===== round 39/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:33<00:00,  5.08it/s]
episodes                                   44
episode_length                     194.454545
returns                           -113.372415
return_std                         190.687703
average_reward                      -0.575091
round_time             0 days 00:06:34.431728
episodes_test                            18.0
episode_length_test                529.666667
returns_test                       345.123594
return_std_test                    374.088432
average_reward_test                  0.662609
round_time_test        0 days 00:00:10.815774
round_time_total       0 days 00:06:34.432825
loss_total                          481.99487
loss_critic                        644.118574
loss_actor                        -166.499987
memory_size                         66970.366 

=== epoch 1/10 ===== round 40/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:34<00:00,  5.07it/s]
episodes                                   50
episode_length                          184.6
returns                           -104.208505
return_std                         186.344214
average_reward                      -0.562986
round_time             0 days 00:06:35.405903
episodes_test                            38.0
episode_length_test                245.921053
returns_test                       159.663266
return_std_test                    319.749825
average_reward_test                  0.658373
round_time_test        0 days 00:00:10.883998
round_time_total       0 days 00:06:35.407006
loss_total                         483.944891
loss_critic                        646.701111
loss_actor                        -167.080031
memory_size                        68699.0105 

=== epoch 1/10 ===== round 41/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:36<00:00,  5.04it/s]
episodes                                   57
episode_length                     152.596491
returns                            -87.353947
return_std                         168.910717
average_reward                      -0.564679
round_time             0 days 00:06:37.109571
episodes_test                            25.0
episode_length_test                    372.88
returns_test                        253.25738
return_std_test                    374.580418
average_reward_test                  0.675568
round_time_test        0 days 00:00:10.874978
round_time_total       0 days 00:06:37.110707
loss_total                         491.351991
loss_critic                        656.204709
loss_actor                        -168.058927
memory_size                        70348.9965 

=== epoch 1/10 ===== round 42/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:34<00:00,  5.07it/s]
episodes                                   68
episode_length                     135.161765
returns                            -80.347432
return_std                         157.231456
average_reward                      -0.592289
round_time             0 days 00:06:35.191288
episodes_test                            21.0
episode_length_test                467.428571
returns_test                       334.136051
return_std_test                    382.861279
average_reward_test                  0.705621
round_time_test        0 days 00:00:10.716692
round_time_total       0 days 00:06:35.192371
loss_total                         477.645438
loss_critic                        639.213538
loss_actor                        -168.627003
memory_size                        72182.3925 

=== epoch 1/10 ===== round 43/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:38<00:00,  5.02it/s]
episodes                                   67
episode_length                     141.656716
returns                            -86.312385
return_std                         155.922639
average_reward                      -0.620986
round_time             0 days 00:06:39.294445
episodes_test                            28.0
episode_length_test                347.071429
returns_test                       205.511973
return_std_test                    321.663301
average_reward_test                  0.594714
round_time_test        0 days 00:00:10.857555
round_time_total       0 days 00:06:39.295645
loss_total                         488.741325
loss_critic                        653.368751
loss_actor                        -169.768424
memory_size                        74014.6815 

=== epoch 1/10 ===== round 44/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:38<00:00,  5.02it/s]
episodes                                   65
episode_length                     146.123077
returns                            -89.000742
return_std                         158.635742
average_reward                      -0.609168
round_time             0 days 00:06:39.035650
episodes_test                            23.0
episode_length_test                394.478261
returns_test                       267.962548
return_std_test                    352.430154
average_reward_test                   0.68901
round_time_test        0 days 00:00:10.732425
round_time_total       0 days 00:06:39.037147
loss_total                         511.776527
loss_critic                        682.014631
loss_actor                        -169.175936
memory_size                         75859.974 

=== epoch 1/10 ===== round 45/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.01it/s]
episodes                                   61
episode_length                     150.295082
returns                            -93.871151
return_std                         156.624191
average_reward                      -0.616243
round_time             0 days 00:06:39.715565
episodes_test                            17.0
episode_length_test                559.117647
returns_test                       425.946047
return_std_test                    406.771732
average_reward_test                  0.757646
round_time_test        0 days 00:00:10.805607
round_time_total       0 days 00:06:39.716678
loss_total                         528.144115
loss_critic                        702.425702
loss_actor                        -168.982281
memory_size                         77566.065 

=== epoch 1/10 ===== round 46/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.00it/s]
episodes                                   56
episode_length                     158.089286
returns                           -100.573809
return_std                          161.62501
average_reward                      -0.618477
round_time             0 days 00:06:40.558408
episodes_test                            20.0
episode_length_test                     452.4
returns_test                       342.369497
return_std_test                    415.041327
average_reward_test                  0.764732
round_time_test        0 days 00:00:10.776556
round_time_total       0 days 00:06:40.559544
loss_total                         520.929312
loss_critic                        693.219934
loss_actor                        -168.233221
memory_size                        79396.0945 

=== epoch 1/10 ===== round 47/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:40<00:00,  5.00it/s]
episodes                                   61
episode_length                     155.590164
returns                            -93.043381
return_std                         151.662677
average_reward                      -0.592049
round_time             0 days 00:06:40.764532
episodes_test                            23.0
episode_length_test                434.565217
returns_test                       318.426932
return_std_test                    377.378353
average_reward_test                  0.732008
round_time_test        0 days 00:00:10.919748
round_time_total       0 days 00:06:40.765624
loss_total                         515.500423
loss_critic                        686.482142
loss_actor                        -168.426503
memory_size                         81223.693 

=== epoch 1/10 ===== round 48/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:40<00:00,  4.99it/s]
episodes                                   67
episode_length                     131.134328
returns                            -74.276593
return_std                          122.98051
average_reward                      -0.559009
round_time             0 days 00:06:41.051560
episodes_test                            16.0
episode_length_test                  613.9375
returns_test                        439.85474
return_std_test                    370.007346
average_reward_test                  0.703658
round_time_test        0 days 00:00:10.710317
round_time_total       0 days 00:06:41.052672
loss_total                         525.564438
loss_critic                        699.389512
loss_actor                        -169.735915
memory_size                          82925.75 

=== epoch 1/10 ===== round 49/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:40<00:00,  5.00it/s]
episodes                                   75
episode_length                         132.72
returns                            -76.585578
return_std                         129.778847
average_reward                      -0.573905
round_time             0 days 00:06:40.660389
episodes_test                            29.0
episode_length_test                311.172414
returns_test                       200.840115
return_std_test                    345.696015
average_reward_test                  0.668858
round_time_test        0 days 00:00:10.746944
round_time_total       0 days 00:06:40.661502
loss_total                          536.63273
loss_critic                        713.398686
loss_actor                        -170.431142
memory_size                         84649.436 

=== epoch 1/10 ===== round 50/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:41<00:00,  4.98it/s]
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
episodes                                   92
episode_length                     106.467391
returns                            -60.511788
return_std                         109.711786
average_reward                      -0.567906
round_time             0 days 00:06:42.067435
episodes_test                            25.0
episode_length_test                     385.0
returns_test                       261.525257
return_std_test                    352.278627
average_reward_test                  0.678505
round_time_test        0 days 00:00:10.789845
round_time_total       0 days 00:06:42.068529
loss_total                         546.892946
loss_critic                        726.398846
loss_actor                        -171.130711
memory_size                          86210.47 


<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
=== epoch 2/10 ===== round 1/50 ======================================
100%|██████████| 2000/2000 [06:00<00:00,  5.54it/s]
episodes                                   23
episode_length                      85.217391
returns                            -42.047712
return_std                          88.974185
average_reward                      -0.505026
round_time             0 days 00:06:00.920017
episodes_test                            36.0
episode_length_test                252.861111
returns_test                       166.120769
return_std_test                    315.017409
average_reward_test                  0.673947
round_time_test        0 days 00:00:10.672476
round_time_total       0 days 00:06:00.921115
loss_total                         566.797417
loss_critic                        751.314257
loss_actor                        -171.269997
memory_size                         87669.857 

=== epoch 2/10 ===== round 2/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:02<00:00,  5.51it/s]
episodes                                   37
episode_length                     101.351351
returns                            -51.435837
return_std                          98.030128
average_reward                       -0.50309
round_time             0 days 00:06:03.253310
episodes_test                            32.0
episode_length_test                 301.28125
returns_test                       214.390183
return_std_test                    349.882605
average_reward_test                  0.707857
round_time_test        0 days 00:00:10.719731
round_time_total       0 days 00:06:03.254606
loss_total                         564.813063
loss_critic                        748.967174
loss_actor                        -171.803436
memory_size                        89381.8935 

=== epoch 2/10 ===== round 3/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:03<00:00,  5.50it/s]
episodes                                   60
episode_length                          98.95
returns                            -55.513282
return_std                          96.337028
average_reward                      -0.561283
round_time             0 days 00:06:04.289296
episodes_test                            26.0
episode_length_test                     377.0
returns_test                       257.284242
return_std_test                    347.150701
average_reward_test                  0.670617
round_time_test        0 days 00:00:10.726556
round_time_total       0 days 00:06:04.290510
loss_total                         552.979543
loss_critic                         734.54691
loss_actor                        -173.289983
memory_size                         91160.087 

=== epoch 2/10 ===== round 4/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:05<00:00,  5.48it/s]
episodes                                   83
episode_length                      93.710843
returns                            -52.318332
return_std                          99.321419
average_reward                      -0.562549
round_time             0 days 00:06:05.688230
episodes_test                            25.0
episode_length_test                     380.2
returns_test                       285.805694
return_std_test                    376.514469
average_reward_test                  0.751753
round_time_test        0 days 00:00:10.630731
round_time_total       0 days 00:06:05.689324
loss_total                         569.248094
loss_critic                        755.090644
loss_actor                        -174.122156
memory_size                        92790.7935 

=== epoch 2/10 ===== round 5/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:07<00:00,  5.44it/s]
episodes                                   95
episode_length                      97.684211
returns                            -54.272873
return_std                          107.27892
average_reward                      -0.546248
round_time             0 days 00:06:07.995533
episodes_test                            26.0
episode_length_test                382.307692
returns_test                       259.922946
return_std_test                    339.085251
average_reward_test                  0.678011
round_time_test        0 days 00:00:11.024899
round_time_total       0 days 00:06:07.996855
loss_total                         557.090333
loss_critic                        740.153493
loss_actor                        -175.162356
memory_size                         94452.027 

=== epoch 2/10 ===== round 6/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:10<00:00,  5.40it/s]
episodes                                   91
episode_length                     101.901099
returns                            -55.751447
return_std                         109.557058
average_reward                      -0.549943
round_time             0 days 00:06:10.555820
episodes_test                            40.0
episode_length_test                    247.75
returns_test                       148.431446
return_std_test                    276.211486
average_reward_test                  0.599114
round_time_test        0 days 00:00:10.965809
round_time_total       0 days 00:06:10.557097
loss_total                         566.908921
loss_critic                        752.767187
loss_actor                        -176.524198
memory_size                        96162.8625 

=== epoch 2/10 ===== round 7/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:09<00:00,  5.41it/s]
episodes                                   88
episode_length                     103.306818
returns                            -58.947753
return_std                         117.719282
average_reward                      -0.567039
round_time             0 days 00:06:10.396445
episodes_test                            22.0
episode_length_test                414.772727
returns_test                       280.972562
return_std_test                    367.122588
average_reward_test                  0.680298
round_time_test        0 days 00:00:10.810119
round_time_total       0 days 00:06:10.397692
loss_total                         559.412005
loss_critic                        743.766532
loss_actor                        -178.006153
memory_size                        97954.2715 

=== epoch 2/10 ===== round 8/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:12<00:00,  5.36it/s]
episodes                                   73
episode_length                     128.082192
returns                            -68.919778
return_std                         138.532215
average_reward                      -0.536347
round_time             0 days 00:06:13.519779
episodes_test                            19.0
episode_length_test                480.526316
returns_test                       311.659503
return_std_test                    369.815497
average_reward_test                  0.662157
round_time_test        0 days 00:00:10.910055
round_time_total       0 days 00:06:13.520871
loss_total                         553.173976
loss_critic                         736.65397
loss_actor                        -180.746058
memory_size                          99778.78 

=== epoch 2/10 ===== round 9/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:16<00:00,  5.31it/s]
episodes                                   57
episode_length                     152.578947
returns                            -79.571346
return_std                         147.443677
average_reward                      -0.518485
round_time             0 days 00:06:17.143964
episodes_test                            25.0
episode_length_test                    376.04
returns_test                       236.768305
return_std_test                    339.217897
average_reward_test                   0.62287
round_time_test        0 days 00:00:10.968162
round_time_total       0 days 00:06:17.145248
loss_total                         544.901239
loss_critic                         726.78804
loss_actor                        -182.646018
memory_size                       101630.4885 

=== epoch 2/10 ===== round 10/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:15<00:00,  5.32it/s]
episodes                                   56
episode_length                     172.428571
returns                            -89.837636
return_std                         156.302906
average_reward                      -0.524031
round_time             0 days 00:06:16.215493
episodes_test                            14.0
episode_length_test                671.571429
returns_test                       471.777515
return_std_test                    329.453679
average_reward_test                  0.695101
round_time_test        0 days 00:00:10.846924
round_time_total       0 days 00:06:16.216585
loss_total                          535.60652
loss_critic                        715.575785
loss_actor                        -184.270591
memory_size                       103510.9415 

=== epoch 2/10 ===== round 11/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:17<00:00,  5.30it/s]
episodes                                   53
episode_length                     171.188679
returns                            -92.731614
return_std                         148.729252
average_reward                      -0.535836
round_time             0 days 00:06:17.982146
episodes_test                            35.0
episode_length_test                282.685714
returns_test                       174.853778
return_std_test                    312.659123
average_reward_test                  0.615577
round_time_test        0 days 00:00:10.865325
round_time_total       0 days 00:06:17.983266
loss_total                         565.701839
loss_critic                        753.428704
loss_actor                        -185.205672
memory_size                       105209.8715 

=== epoch 2/10 ===== round 12/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:18<00:00,  5.29it/s]
episodes                                   49
episode_length                     180.653061
returns                            -94.494725
return_std                         147.652279
average_reward                      -0.513065
round_time             0 days 00:06:18.949952
episodes_test                            28.0
episode_length_test                321.821429
returns_test                       223.023381
return_std_test                    322.192413
average_reward_test                  0.689515
round_time_test        0 days 00:00:10.648601
round_time_total       0 days 00:06:18.951470
loss_total                         548.662728
loss_critic                        732.184969
loss_actor                        -185.426283
memory_size                       107068.9595 

=== epoch 2/10 ===== round 13/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:19<00:00,  5.27it/s]
episodes                                   46
episode_length                     190.326087
returns                            -99.791535
return_std                          149.40118
average_reward                      -0.519863
round_time             0 days 00:06:20.231889
episodes_test                            48.0
episode_length_test                    195.75
returns_test                       113.843117
return_std_test                    273.331573
average_reward_test                  0.592949
round_time_test        0 days 00:00:10.937442
round_time_total       0 days 00:06:20.233119
loss_total                         541.100946
loss_critic                        723.269374
loss_actor                        -187.572818
memory_size                       108956.5415 

=== epoch 2/10 ===== round 14/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:20<00:00,  5.25it/s]
episodes                                   50
episode_length                         179.44
returns                            -93.592271
return_std                         148.623504
average_reward                      -0.513372
round_time             0 days 00:06:21.504531
episodes_test                            20.0
episode_length_test                    457.75
returns_test                       345.953443
return_std_test                    392.593333
average_reward_test                  0.766859
round_time_test        0 days 00:00:11.032199
round_time_total       0 days 00:06:21.505640
loss_total                         539.944581
loss_critic                        722.304075
loss_actor                        -189.493451
memory_size                       110822.8215 

=== epoch 2/10 ===== round 15/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:22<00:00,  5.23it/s]
episodes                                   52
episode_length                     180.942308
returns                            -94.934695
return_std                           144.5292
average_reward                      -0.520831
round_time             0 days 00:06:23.076157
episodes_test                            16.0
episode_length_test                  617.6875
returns_test                       485.899158
return_std_test                    394.833844
average_reward_test                  0.774567
round_time_test        0 days 00:00:10.855078
round_time_total       0 days 00:06:23.077275
loss_total                         538.522594
loss_critic                        720.994123
loss_actor                        -191.363572
memory_size                       112609.9555 

=== epoch 2/10 ===== round 16/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:23<00:00,  5.21it/s]
episodes                                   43
episode_length                     217.209302
returns                           -108.480992
return_std                          167.03294
average_reward                      -0.505256
round_time             0 days 00:06:24.162724
episodes_test                            26.0
episode_length_test                382.576923
returns_test                       255.793863
return_std_test                    342.918743
average_reward_test                  0.667548
round_time_test        0 days 00:00:10.665932
round_time_total       0 days 00:06:24.163834
loss_total                         533.301928
loss_critic                        714.744662
loss_actor                         -192.46906
memory_size                       114450.0725 

=== epoch 2/10 ===== round 17/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:21<00:00,  5.24it/s]
episodes                                   47
episode_length                     181.617021
returns                            -95.397934
return_std                         153.375279
average_reward                      -0.525297
round_time             0 days 00:06:22.571881
episodes_test                            30.0
episode_length_test                317.366667
returns_test                       186.069139
return_std_test                    287.916822
average_reward_test                  0.587484
round_time_test        0 days 00:00:10.764354
round_time_total       0 days 00:06:22.573115
loss_total                         541.302536
loss_critic                        725.269041
loss_actor                        -194.563535
memory_size                        116288.029 

=== epoch 2/10 ===== round 18/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:23<00:00,  5.22it/s]
episodes                                   50
episode_length                         186.98
returns                            -94.986435
return_std                          159.89524
average_reward                      -0.509787
round_time             0 days 00:06:24.074363
episodes_test                            18.0
episode_length_test                534.333333
returns_test                       384.049293
return_std_test                    374.210386
average_reward_test                  0.720872
round_time_test        0 days 00:00:10.932976
round_time_total       0 days 00:06:24.075453
loss_total                         533.425662
loss_critic                        715.704212
loss_actor                        -195.688587
memory_size                       118123.2995 

=== epoch 2/10 ===== round 19/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:24<00:00,  5.20it/s]
episodes                                   51
episode_length                     185.843137
returns                            -97.297167
return_std                         162.029958
average_reward                      -0.522005
round_time             0 days 00:06:25.058569
episodes_test                            27.0
episode_length_test                350.888889
returns_test                       211.360584
return_std_test                    335.960317
average_reward_test                  0.604868
round_time_test        0 days 00:00:10.864409
round_time_total       0 days 00:06:25.060016
loss_total                         516.340258
loss_critic                        695.082374
loss_actor                        -198.628254
memory_size                        120021.592 

=== epoch 2/10 ===== round 20/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:23<00:00,  5.21it/s]
episodes                                   54
episode_length                     171.777778
returns                            -90.508786
return_std                         160.363023
average_reward                      -0.531421
round_time             0 days 00:06:24.472840
episodes_test                            16.0
episode_length_test                  587.3125
returns_test                       450.222424
return_std_test                    392.347572
average_reward_test                  0.763668
round_time_test        0 days 00:00:10.889330
round_time_total       0 days 00:06:24.473932
loss_total                         523.484731
loss_critic                        704.487729
loss_actor                        -200.527315
memory_size                        121752.286 

=== epoch 2/10 ===== round 21/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:26<00:00,  5.17it/s]
episodes                                   75
episode_length                     123.146667
returns                            -67.997779
return_std                         131.347335
average_reward                      -0.544269
round_time             0 days 00:06:27.558468
episodes_test                            31.0
episode_length_test                291.967742
returns_test                        177.40644
return_std_test                    322.595971
average_reward_test                  0.630313
round_time_test        0 days 00:00:10.848824
round_time_total       0 days 00:06:27.559601
loss_total                         549.267686
loss_critic                        736.934285
loss_actor                        -201.398762
memory_size                        123421.686 

=== epoch 2/10 ===== round 22/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:27<00:00,  5.16it/s]
episodes                                   75
episode_length                     124.186667
returns                             -68.69275
return_std                         129.427654
average_reward                      -0.545835
round_time             0 days 00:06:28.303576
episodes_test                            20.0
episode_length_test                    472.65
returns_test                       338.913696
return_std_test                    376.151541
average_reward_test                  0.719796
round_time_test        0 days 00:00:10.912393
round_time_total       0 days 00:06:28.304828
loss_total                         562.507908
loss_critic                        753.720558
loss_actor                        -202.342749
memory_size                        125026.066 

=== epoch 2/10 ===== round 23/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:48,  4.87it/s]
100%|██████████| 2000/2000 [06:29<00:00,  5.14it/s]
episodes                                   73
episode_length                     127.438356
returns                            -68.298845
return_std                         123.085433
average_reward                      -0.530044
round_time             0 days 00:06:29.599681
episodes_test                            26.0
episode_length_test                     382.0
returns_test                        229.49498
return_std_test                    318.188632
average_reward_test                  0.597743
round_time_test        0 days 00:00:10.761223
round_time_total       0 days 00:06:29.600793
loss_total                         550.598237
loss_critic                        739.216012
loss_actor                        -203.872921
memory_size                       126868.4495 

=== epoch 2/10 ===== round 24/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:11,  4.62it/s]
100%|██████████| 2000/2000 [06:28<00:00,  5.15it/s]
episodes                                   68
episode_length                     136.338235
returns                            -71.735899
return_std                         123.113528
average_reward                      -0.521558
round_time             0 days 00:06:29.156146
episodes_test                            20.0
episode_length_test                    493.95
returns_test                       312.517258
return_std_test                    341.010874
average_reward_test                   0.62655
round_time_test        0 days 00:00:11.067029
round_time_total       0 days 00:06:29.157277
loss_total                         548.293432
loss_critic                        736.811336
loss_actor                        -205.778241
memory_size                       128765.5615 

=== epoch 2/10 ===== round 25/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:41,  4.96it/s]
100%|██████████| 2000/2000 [06:32<00:00,  5.10it/s]
episodes                                   64
episode_length                      144.90625
returns                            -73.184215
return_std                          118.93805
average_reward                       -0.50139
round_time             0 days 00:06:33.016076
episodes_test                            18.0
episode_length_test                509.666667
returns_test                       321.610251
return_std_test                    327.149725
average_reward_test                  0.648885
round_time_test        0 days 00:00:10.694686
round_time_total       0 days 00:06:33.017524
loss_total                         541.486439
loss_critic                        729.051489
loss_actor                        -208.773816
memory_size                       130594.1285 

=== epoch 2/10 ===== round 26/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:36,  5.03it/s]
100%|██████████| 2000/2000 [06:31<00:00,  5.11it/s]
episodes                                   47
episode_length                     206.319149
returns                            -99.027589
return_std                          141.65243
average_reward                      -0.483533
round_time             0 days 00:06:32.000365
episodes_test                            25.0
episode_length_test                    392.96
returns_test                       226.874803
return_std_test                    317.904007
average_reward_test                   0.57524
round_time_test        0 days 00:00:10.913311
round_time_total       0 days 00:06:32.001645
loss_total                         543.541447
loss_critic                        731.940245
loss_actor                          -210.0538
memory_size                       132395.2925 

=== epoch 2/10 ===== round 27/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<07:51,  4.23it/s]
100%|██████████| 2000/2000 [06:31<00:00,  5.11it/s]
episodes                                   40
episode_length                        236.425
returns                           -111.237944
return_std                         155.797276
average_reward                       -0.46578
round_time             0 days 00:06:31.763130
episodes_test                            22.0
episode_length_test                454.318182
returns_test                       317.966846
return_std_test                    382.155039
average_reward_test                  0.699174
round_time_test        0 days 00:00:11.015182
round_time_total       0 days 00:06:31.764289
loss_total                         536.094189
loss_critic                        722.773878
loss_actor                        -210.624623
memory_size                         134277.12 

=== epoch 2/10 ===== round 28/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:39,  4.99it/s]
100%|██████████| 2000/2000 [06:33<00:00,  5.08it/s]
episodes                                   47
episode_length                     204.106383
returns                            -100.47924
return_std                         149.829609
average_reward                      -0.493347
round_time             0 days 00:06:33.867271
episodes_test                            28.0
episode_length_test                355.321429
returns_test                       217.171634
return_std_test                    337.689037
average_reward_test                  0.611961
round_time_test        0 days 00:00:10.734658
round_time_total       0 days 00:06:33.868511
loss_total                         535.653445
loss_critic                        722.782881
loss_actor                        -212.864349
memory_size                        136150.811 

=== epoch 2/10 ===== round 29/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:01,  4.72it/s]
100%|██████████| 2000/2000 [06:33<00:00,  5.08it/s]
episodes                                   46
episode_length                     209.847826
returns                           -101.902175
return_std                         153.816361
average_reward                      -0.481095
round_time             0 days 00:06:34.653103
episodes_test                            24.0
episode_length_test                    392.25
returns_test                       211.686793
return_std_test                    292.615143
average_reward_test                  0.523014
round_time_test        0 days 00:00:11.009683
round_time_total       0 days 00:06:34.654604
loss_total                         542.182151
loss_critic                         731.30715
loss_actor                        -214.317897
memory_size                        137970.342 

=== epoch 2/10 ===== round 30/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:59,  4.75it/s]
100%|██████████| 2000/2000 [06:33<00:00,  5.08it/s]
episodes                                   48
episode_length                     191.979167
returns                            -95.767467
return_std                         142.473087
average_reward                       -0.49502
round_time             0 days 00:06:33.959210
episodes_test                            30.0
episode_length_test                331.966667
returns_test                       199.734468
return_std_test                    307.366724
average_reward_test                  0.599905
round_time_test        0 days 00:00:10.934102
round_time_total       0 days 00:06:33.960340
loss_total                          569.45727
loss_critic                         765.69146
loss_actor                        -215.479545
memory_size                        139752.372 

=== epoch 2/10 ===== round 31/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:24,  5.18it/s]
100%|██████████| 2000/2000 [06:35<00:00,  5.05it/s]
episodes                                   53
episode_length                     159.358491
returns                            -80.499482
return_std                         126.611909
average_reward                      -0.506073
round_time             0 days 00:06:36.475995
episodes_test                            28.0
episode_length_test                334.964286
returns_test                       232.174289
return_std_test                    361.513208
average_reward_test                  0.689306
round_time_test        0 days 00:00:10.773089
round_time_total       0 days 00:06:36.477106
loss_total                         594.795831
loss_critic                        797.284526
loss_actor                         -215.15901
memory_size                        141512.893 

=== epoch 2/10 ===== round 32/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:54,  4.80it/s]
100%|██████████| 2000/2000 [06:36<00:00,  5.05it/s]
episodes                                   74
episode_length                     126.594595
returns                            -64.804305
return_std                         112.447655
average_reward                      -0.509813
round_time             0 days 00:06:36.689328
episodes_test                            23.0
episode_length_test                429.434783
returns_test                       320.488577
return_std_test                    386.405609
average_reward_test                    0.7381
round_time_test        0 days 00:00:10.897535
round_time_total       0 days 00:06:36.690421
loss_total                         596.956071
loss_critic                        800.060377
loss_actor                         -215.46121
memory_size                        143164.967 

=== epoch 2/10 ===== round 33/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:49,  4.24it/s]
100%|██████████| 2000/2000 [06:38<00:00,  5.02it/s]
episodes                                   68
episode_length                     132.779412
returns                            -67.314405
return_std                          116.71787
average_reward                      -0.501119
round_time             0 days 00:06:38.743653
episodes_test                            20.0
episode_length_test                     499.7
returns_test                       326.742773
return_std_test                    341.517092
average_reward_test                  0.652921
round_time_test        0 days 00:00:11.048028
round_time_total       0 days 00:06:38.744747
loss_total                         578.394154
loss_critic                        777.180651
loss_actor                        -216.751885
memory_size                       144960.8595 

=== epoch 2/10 ===== round 34/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:33,  5.06it/s]
100%|██████████| 2000/2000 [06:35<00:00,  5.06it/s]
episodes                                   73
episode_length                     125.054795
returns                            -66.732733
return_std                         110.882733
average_reward                      -0.522026
round_time             0 days 00:06:36.150043
episodes_test                            29.0
episode_length_test                 330.37931
returns_test                       201.601997
return_std_test                    312.654636
average_reward_test                  0.615182
round_time_test        0 days 00:00:10.711348
round_time_total       0 days 00:06:36.151141
loss_total                          585.41105
loss_critic                        786.132405
loss_actor                        -217.474429
memory_size                       146768.5885 

=== epoch 2/10 ===== round 35/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<07:38,  4.34it/s]
100%|██████████| 2000/2000 [06:37<00:00,  5.04it/s]
episodes                                   82
episode_length                     110.390244
returns                            -55.988852
return_std                         101.137242
average_reward                      -0.510944
round_time             0 days 00:06:37.669009
episodes_test                            44.0
episode_length_test                224.909091
returns_test                       142.695478
return_std_test                    299.823109
average_reward_test                  0.636524
round_time_test        0 days 00:00:10.794734
round_time_total       0 days 00:06:37.670111
loss_total                          593.62834
loss_critic                        796.898172
loss_actor                        -219.451043
memory_size                       148446.2335 

=== epoch 2/10 ===== round 36/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 6/2000 [00:01<10:30,  3.16it/s]
100%|██████████| 2000/2000 [06:41<00:00,  4.99it/s]
episodes                                   69
episode_length                      131.42029
returns                            -65.936112
return_std                         123.000963
average_reward                      -0.492439
round_time             0 days 00:06:41.703359
episodes_test                            23.0
episode_length_test                413.304348
returns_test                       273.504268
return_std_test                    346.087531
average_reward_test                  0.644426
round_time_test        0 days 00:00:10.910058
round_time_total       0 days 00:06:41.704789
loss_total                         585.698196
loss_critic                        787.281789
loss_actor                         -220.63623
memory_size                        150260.158 

=== epoch 2/10 ===== round 37/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:40,  4.33it/s]
100%|██████████| 2000/2000 [06:38<00:00,  5.02it/s]
episodes                                   58
episode_length                     165.758621
returns                            -81.745763
return_std                         139.983452
average_reward                      -0.490166
round_time             0 days 00:06:38.686540
episodes_test                            22.0
episode_length_test                442.590909
returns_test                       288.917149
return_std_test                    348.516669
average_reward_test                  0.644192
round_time_test        0 days 00:00:11.054962
round_time_total       0 days 00:06:38.687877
loss_total                         561.478946
loss_critic                        757.602698
loss_actor                        -223.016112
memory_size                        152169.398 

=== epoch 2/10 ===== round 38/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<07:30,  4.42it/s]
100%|██████████| 2000/2000 [06:38<00:00,  5.02it/s]
episodes                                   65
episode_length                     137.153846
returns                            -69.033192
return_std                         130.714911
average_reward                      -0.496051
round_time             0 days 00:06:39.349684
episodes_test                            19.0
episode_length_test                476.526316
returns_test                       300.651783
return_std_test                    325.924853
average_reward_test                  0.615208
round_time_test        0 days 00:00:10.899877
round_time_total       0 days 00:06:39.350967
loss_total                         591.176017
loss_critic                        794.888994
loss_actor                        -223.675942
memory_size                        153815.535 

=== epoch 2/10 ===== round 39/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<07:26,  4.46it/s]
100%|██████████| 2000/2000 [06:41<00:00,  4.99it/s]
episodes                                   71
episode_length                     129.521127
returns                            -67.607861
return_std                         128.831138
average_reward                      -0.515329
round_time             0 days 00:06:41.704254
episodes_test                            20.0
episode_length_test                    458.55
returns_test                       326.344645
return_std_test                    380.069982
average_reward_test                  0.703957
round_time_test        0 days 00:00:10.920558
round_time_total       0 days 00:06:41.705359
loss_total                         590.948237
loss_critic                        794.782471
loss_actor                        -224.388754
memory_size                       155610.5155 

=== epoch 2/10 ===== round 40/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:51,  4.84it/s]
100%|██████████| 2000/2000 [06:40<00:00,  5.00it/s]
episodes                                   64
episode_length                     151.796875
returns                            -78.259719
return_std                         137.736168
average_reward                      -0.512104
round_time             0 days 00:06:40.998162
episodes_test                            19.0
episode_length_test                521.842105
returns_test                       359.072143
return_std_test                    345.877714
average_reward_test                  0.684313
round_time_test        0 days 00:00:10.903884
round_time_total       0 days 00:06:40.999285
loss_total                         587.513032
loss_critic                        790.608015
loss_actor                        -224.866955
memory_size                       157424.9575 

=== epoch 2/10 ===== round 41/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:28,  4.45it/s]
100%|██████████| 2000/2000 [06:42<00:00,  4.97it/s]
episodes                                   66
episode_length                     133.393939
returns                            -70.531738
return_std                         127.932203
average_reward                      -0.516858
round_time             0 days 00:06:43.026848
episodes_test                            26.0
episode_length_test                376.115385
returns_test                       232.524149
return_std_test                    321.704444
average_reward_test                  0.612354
round_time_test        0 days 00:00:10.878765
round_time_total       0 days 00:06:43.027976
loss_total                         587.200977
loss_critic                        790.671854
loss_actor                        -226.682587
memory_size                       159156.0665 

=== epoch 2/10 ===== round 42/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:42,  4.31it/s]
100%|██████████| 2000/2000 [06:39<00:00,  5.00it/s]
episodes                                   64
episode_length                     151.953125
returns                            -78.371165
return_std                         134.588948
average_reward                      -0.511838
round_time             0 days 00:06:40.406364
episodes_test                            32.0
episode_length_test                   310.875
returns_test                       194.208121
return_std_test                    312.026832
average_reward_test                  0.622943
round_time_test        0 days 00:00:10.831856
round_time_total       0 days 00:06:40.407463
loss_total                         575.253528
loss_critic                         776.25613
loss_actor                        -228.756933
memory_size                        161009.494 

=== epoch 2/10 ===== round 43/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<07:20,  4.53it/s]
100%|██████████| 2000/2000 [06:43<00:00,  4.95it/s]
episodes                                   56
episode_length                     168.357143
returns                            -84.612967
return_std                          136.37831
average_reward                      -0.499817
round_time             0 days 00:06:44.424779
episodes_test                            25.0
episode_length_test                    384.96
returns_test                       252.122566
return_std_test                    342.385697
average_reward_test                  0.662468
round_time_test        0 days 00:00:11.020696
round_time_total       0 days 00:06:44.426136
loss_total                         582.224963
loss_critic                        785.418693
loss_actor                        -230.550009
memory_size                       162886.5595 

=== epoch 2/10 ===== round 44/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:50,  4.86it/s]
100%|██████████| 2000/2000 [06:42<00:00,  4.97it/s]
episodes                                   43
episode_length                     203.906977
returns                            -97.245848
return_std                         152.101391
average_reward                      -0.473691
round_time             0 days 00:06:43.146288
episodes_test                            24.0
episode_length_test                379.916667
returns_test                       276.856139
return_std_test                    375.719014
average_reward_test                  0.729307
round_time_test        0 days 00:00:10.983829
round_time_total       0 days 00:06:43.147404
loss_total                         574.166895
loss_critic                        775.864581
loss_actor                        -232.623906
memory_size                         164768.12 

=== epoch 2/10 ===== round 45/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:18,  4.54it/s]
100%|██████████| 2000/2000 [06:42<00:00,  4.97it/s]
episodes                                   33
episode_length                     280.969697
returns                            -135.98866
return_std                         185.550032
average_reward                      -0.486479
round_time             0 days 00:06:43.015064
episodes_test                            22.0
episode_length_test                451.954545
returns_test                       325.956411
return_std_test                    390.541617
average_reward_test                  0.720967
round_time_test        0 days 00:00:10.897590
round_time_total       0 days 00:06:43.016179
loss_total                         570.159878
loss_critic                        771.259509
loss_actor                        -234.238701
memory_size                       166689.3535 

=== epoch 2/10 ===== round 46/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 9/2000 [00:01<06:58,  4.75it/s]
100%|██████████| 2000/2000 [06:43<00:00,  4.95it/s]
episodes                                   59
episode_length                     165.949153
returns                            -85.531477
return_std                         147.416829
average_reward                      -0.510308
round_time             0 days 00:06:44.536194
episodes_test                            22.0
episode_length_test                413.727273
returns_test                       294.796861
return_std_test                     360.85286
average_reward_test                  0.717846
round_time_test        0 days 00:00:10.868678
round_time_total       0 days 00:06:44.537290
loss_total                         584.843524
loss_critic                         789.71494
loss_actor                        -234.642198
memory_size                        168417.639 

=== epoch 2/10 ===== round 47/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:44<00:00,  4.95it/s]
episodes                                   69
episode_length                     134.173913
returns                            -67.417777
return_std                         132.940252
average_reward                      -0.497851
round_time             0 days 00:06:44.746807
episodes_test                            38.0
episode_length_test                257.578947
returns_test                       145.232252
return_std_test                     262.97369
average_reward_test                  0.564554
round_time_test        0 days 00:00:10.803372
round_time_total       0 days 00:06:44.748147
loss_total                         599.101826
loss_critic                        807.857994
loss_actor                        -235.922904
memory_size                       170006.3665 

=== epoch 2/10 ===== round 48/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:43<00:00,  4.96it/s]
episodes                                   68
episode_length                     135.367647
returns                            -70.224584
return_std                         138.314632
average_reward                      -0.508244
round_time             0 days 00:06:43.882679
episodes_test                            23.0
episode_length_test                395.130435
returns_test                       282.830016
return_std_test                    362.550071
average_reward_test                  0.727511
round_time_test        0 days 00:00:10.888353
round_time_total       0 days 00:06:43.883930
loss_total                         597.792475
loss_critic                         806.26001
loss_actor                        -236.077722
memory_size                        171712.118 

=== epoch 2/10 ===== round 49/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:43<00:00,  4.96it/s]
episodes                                   71
episode_length                      130.15493
returns                            -66.581476
return_std                         133.409587
average_reward                      -0.505085
round_time             0 days 00:06:43.719716
episodes_test                            32.0
episode_length_test                 295.96875
returns_test                       169.193029
return_std_test                    255.202075
average_reward_test                  0.577762
round_time_test        0 days 00:00:10.801922
round_time_total       0 days 00:06:43.720885
loss_total                          589.57176
loss_critic                        796.138353
loss_actor                        -236.694663
memory_size                       173579.2775 

=== epoch 2/10 ===== round 50/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:44<00:00,  4.95it/s]
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
episodes                                   79
episode_length                     122.455696
returns                            -60.344791
return_std                         119.536504
average_reward                      -0.493399
round_time             0 days 00:06:44.838935
episodes_test                            26.0
episode_length_test                372.884615
returns_test                       225.722821
return_std_test                    318.660666
average_reward_test                  0.589932
round_time_test        0 days 00:00:10.769277
round_time_total       0 days 00:06:44.840039
loss_total                         578.925907
loss_critic                        782.963654
loss_actor                        -237.225136
memory_size                        175460.769 


<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
=== epoch 3/10 ===== round 1/50 ======================================
100%|██████████| 2000/2000 [06:00<00:00,  5.55it/s]
episodes                                   17
episode_length                     103.529412
returns                            -47.056666
return_std                           83.79355
average_reward                      -0.446085
round_time             0 days 00:06:00.305175
episodes_test                            44.0
episode_length_test                222.340909
returns_test                       129.089651
return_std_test                    260.922971
average_reward_test                  0.572444
round_time_test        0 days 00:00:10.809813
round_time_total       0 days 00:06:00.306290
loss_total                         591.742658
loss_critic                        799.269708
loss_actor                        -238.365602
memory_size                       177120.7695 

=== epoch 3/10 ===== round 2/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:01<00:00,  5.53it/s]
episodes                                   23
episode_length                          134.0
returns                            -58.707968
return_std                         112.028117
average_reward                      -0.434374
round_time             0 days 00:06:02.378811
episodes_test                            41.0
episode_length_test                232.414634
returns_test                        126.33603
return_std_test                    239.621939
average_reward_test                  0.556774
round_time_test        0 days 00:00:10.801908
round_time_total       0 days 00:06:02.379902
loss_total                         595.941781
loss_critic                        804.585384
loss_actor                        -238.632689
memory_size                       178963.0975 

=== epoch 3/10 ===== round 3/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:03<00:00,  5.51it/s]
episodes                                   45
episode_length                     111.377778
returns                            -52.103863
return_std                          99.602162
average_reward                      -0.449597
round_time             0 days 00:06:03.781348
episodes_test                            48.0
episode_length_test                  193.3125
returns_test                       122.195924
return_std_test                    268.044204
average_reward_test                  0.642745
round_time_test        0 days 00:00:10.858118
round_time_total       0 days 00:06:03.782448
loss_total                           603.8035
loss_critic                        814.449791
loss_actor                        -238.781714
memory_size                       180663.7445 

=== epoch 3/10 ===== round 4/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:07<00:00,  5.45it/s]
episodes                                   53
episode_length                     136.056604
returns                            -59.346843
return_std                         108.940153
average_reward                      -0.427637
round_time             0 days 00:06:07.740283
episodes_test                            27.0
episode_length_test                334.666667
returns_test                       185.625944
return_std_test                    284.222315
average_reward_test                  0.572029
round_time_test        0 days 00:00:10.819490
round_time_total       0 days 00:06:07.741406
loss_total                         593.474607
loss_critic                        802.043417
loss_actor                        -240.800689
memory_size                       182480.5995 

=== epoch 3/10 ===== round 5/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:10<00:00,  5.40it/s]
episodes                                   62
episode_length                     153.241935
returns                            -67.858337
return_std                         119.925914
average_reward                      -0.438599
round_time             0 days 00:06:10.766088
episodes_test                            33.0
episode_length_test                296.242424
returns_test                       181.878643
return_std_test                    309.186834
average_reward_test                  0.614064
round_time_test        0 days 00:00:10.678391
round_time_total       0 days 00:06:10.767185
loss_total                         594.334665
loss_critic                        803.537634
loss_actor                        -242.477273
memory_size                        184313.243 

=== epoch 3/10 ===== round 6/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:12<00:00,  5.37it/s]
episodes                                   61
episode_length                     151.393443
returns                            -67.561341
return_std                          120.71969
average_reward                      -0.448995
round_time             0 days 00:06:12.717617
episodes_test                            33.0
episode_length_test                301.090909
returns_test                       220.629167
return_std_test                    343.843376
average_reward_test                  0.732632
round_time_test        0 days 00:00:10.950952
round_time_total       0 days 00:06:12.718694
loss_total                         603.259947
loss_critic                        815.044367
loss_actor                        -243.877793
memory_size                       186205.1645 

=== epoch 3/10 ===== round 7/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:12<00:00,  5.37it/s]
episodes                                   69
episode_length                     141.202899
returns                            -67.743845
return_std                         120.222724
average_reward                       -0.47271
round_time             0 days 00:06:12.809301
episodes_test                            21.0
episode_length_test                434.952381
returns_test                       301.400763
return_std_test                    339.557322
average_reward_test                  0.707138
round_time_test        0 days 00:00:10.757374
round_time_total       0 days 00:06:12.810410
loss_total                         616.069127
loss_critic                        831.103195
loss_actor                        -244.067196
memory_size                       187872.7525 

=== epoch 3/10 ===== round 8/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:16<00:00,  5.32it/s]
episodes                                   50
episode_length                         196.46
returns                            -91.121232
return_std                          146.63729
average_reward                      -0.469456
round_time             0 days 00:06:16.791835
episodes_test                            30.0
episode_length_test                331.366667
returns_test                       216.509357
return_std_test                    327.250628
average_reward_test                   0.64923
round_time_test        0 days 00:00:10.864500
round_time_total       0 days 00:06:16.793402
loss_total                         607.762798
loss_critic                        820.936045
loss_actor                        -244.930251
memory_size                       189760.8305 

=== epoch 3/10 ===== round 9/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:15<00:00,  5.32it/s]
episodes                                   47
episode_length                     187.340426
returns                            -91.471503
return_std                         153.394238
average_reward                      -0.483826
round_time             0 days 00:06:16.397998
episodes_test                            29.0
episode_length_test                 331.62069
returns_test                       208.421585
return_std_test                    327.747623
average_reward_test                  0.621522
round_time_test        0 days 00:00:10.863825
round_time_total       0 days 00:06:16.399097
loss_total                         606.759377
loss_critic                         820.10121
loss_actor                        -246.608015
memory_size                       191686.3575 

=== epoch 3/10 ===== round 10/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:17<00:00,  5.30it/s]
episodes                                   50
episode_length                         185.56
returns                            -90.500162
return_std                          151.89101
average_reward                      -0.491693
round_time             0 days 00:06:17.745875
episodes_test                            26.0
episode_length_test                381.346154
returns_test                       262.813043
return_std_test                     359.81715
average_reward_test                  0.683258
round_time_test        0 days 00:00:11.023098
round_time_total       0 days 00:06:17.746958
loss_total                         607.578651
loss_critic                        821.344868
loss_actor                        -247.486277
memory_size                        193518.101 

=== epoch 3/10 ===== round 11/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:19<00:00,  5.26it/s]
episodes                                   51
episode_length                      186.72549
returns                            -91.585196
return_std                          155.10871
average_reward                      -0.488868
round_time             0 days 00:06:20.494042
episodes_test                            25.0
episode_length_test                     385.0
returns_test                       268.784053
return_std_test                    354.758724
average_reward_test                  0.698019
round_time_test        0 days 00:00:10.865472
round_time_total       0 days 00:06:20.495351
loss_total                         604.956395
loss_critic                        818.291436
loss_actor                        -248.383827
memory_size                        195315.653 

=== epoch 3/10 ===== round 12/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:20<00:00,  5.26it/s]
episodes                                   42
episode_length                          209.5
returns                            -98.820718
return_std                         164.247495
average_reward                      -0.469088
round_time             0 days 00:06:20.851497
episodes_test                            30.0
episode_length_test                316.233333
returns_test                       170.760019
return_std_test                    281.179571
average_reward_test                  0.551298
round_time_test        0 days 00:00:10.767108
round_time_total       0 days 00:06:20.852575
loss_total                         605.375788
loss_critic                        819.066994
loss_actor                        -249.389091
memory_size                       197096.9715 

=== epoch 3/10 ===== round 13/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:19<00:00,  5.27it/s]
episodes                                   43
episode_length                     208.162791
returns                            -95.481357
return_std                         154.092023
average_reward                      -0.455233
round_time             0 days 00:06:20.458322
episodes_test                            24.0
episode_length_test                405.041667
returns_test                       270.213648
return_std_test                    330.755134
average_reward_test                  0.665553
round_time_test        0 days 00:00:10.746458
round_time_total       0 days 00:06:20.459467
loss_total                         601.027586
loss_critic                        813.949949
loss_actor                        -250.661922
memory_size                       199036.8505 

=== epoch 3/10 ===== round 14/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:22<00:00,  5.24it/s]
episodes                                   45
episode_length                     200.933333
returns                            -92.035361
return_std                         146.712373
average_reward                      -0.456529
round_time             0 days 00:06:22.628721
episodes_test                            29.0
episode_length_test                338.896552
returns_test                       231.358985
return_std_test                     354.93606
average_reward_test                  0.675775
round_time_test        0 days 00:00:10.992425
round_time_total       0 days 00:06:22.629857
loss_total                         597.753148
loss_critic                        810.248053
loss_actor                        -252.226529
memory_size                       200932.9915 

=== epoch 3/10 ===== round 15/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:24<00:00,  5.20it/s]
episodes                                   53
episode_length                     170.264151
returns                            -75.895073
return_std                         132.526589
average_reward                      -0.445425
round_time             0 days 00:06:25.373406
episodes_test                            29.0
episode_length_test                336.137931
returns_test                       177.174692
return_std_test                    267.914042
average_reward_test                  0.524393
round_time_test        0 days 00:00:10.746763
round_time_total       0 days 00:06:25.374693
loss_total                         600.683025
loss_critic                        814.039544
loss_actor                        -252.743107
memory_size                        202662.889 

=== epoch 3/10 ===== round 16/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:24<00:00,  5.20it/s]
episodes                                   43
episode_length                     201.837209
returns                             -89.58065
return_std                         145.099182
average_reward                      -0.436599
round_time             0 days 00:06:24.961951
episodes_test                            28.0
episode_length_test                354.142857
returns_test                       241.594228
return_std_test                    351.518339
average_reward_test                  0.675555
round_time_test        0 days 00:00:10.934292
round_time_total       0 days 00:06:24.963057
loss_total                         593.915432
loss_critic                        805.640573
loss_actor                        -252.985192
memory_size                        204481.162 

=== epoch 3/10 ===== round 17/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:21<00:00,  5.24it/s]
episodes                                   53
episode_length                     172.981132
returns                            -79.756636
return_std                         136.690111
average_reward                      -0.451793
round_time             0 days 00:06:22.373461
episodes_test                            31.0
episode_length_test                320.870968
returns_test                       227.445646
return_std_test                    333.122384
average_reward_test                  0.707897
round_time_test        0 days 00:00:10.882907
round_time_total       0 days 00:06:22.374586
loss_total                         600.059616
loss_critic                         813.60923
loss_actor                        -254.138896
memory_size                       206237.6845 

=== epoch 3/10 ===== round 18/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:24<00:00,  5.20it/s]
episodes                                   53
episode_length                     171.716981
returns                             -78.52258
return_std                         137.704685
average_reward                      -0.450247
round_time             0 days 00:06:24.973758
episodes_test                            19.0
episode_length_test                474.105263
returns_test                       314.111709
return_std_test                    366.614216
average_reward_test                  0.673701
round_time_test        0 days 00:00:10.903070
round_time_total       0 days 00:06:24.975236
loss_total                         603.792194
loss_critic                        818.449968
loss_actor                         -254.83896
memory_size                       208130.4945 

=== epoch 3/10 ===== round 19/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:25<00:00,  5.18it/s]
episodes                                   69
episode_length                     142.782609
returns                            -68.038553
return_std                         121.876697
average_reward                      -0.477813
round_time             0 days 00:06:26.422740
episodes_test                            23.0
episode_length_test                408.826087
returns_test                       192.019442
return_std_test                    261.879968
average_reward_test                  0.485776
round_time_test        0 days 00:00:11.018294
round_time_total       0 days 00:06:26.424101
loss_total                          595.99072
loss_critic                        809.346284
loss_actor                        -257.431594
memory_size                       209974.3875 

=== epoch 3/10 ===== round 20/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:27<00:00,  5.16it/s]
episodes                                   86
episode_length                     106.744186
returns                            -53.312798
return_std                          98.610194
average_reward                      -0.494879
round_time             0 days 00:06:28.489736
episodes_test                            29.0
episode_length_test                316.862069
returns_test                       186.127442
return_std_test                    299.648318
average_reward_test                  0.577499
round_time_test        0 days 00:00:10.897089
round_time_total       0 days 00:06:28.490972
loss_total                         635.637161
loss_critic                        858.552309
loss_actor                        -256.023489
memory_size                        211376.967 

=== epoch 3/10 ===== round 21/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:27<00:00,  5.16it/s]
episodes                                   81
episode_length                     109.691358
returns                            -54.564223
return_std                         104.047708
average_reward                      -0.487462
round_time             0 days 00:06:27.910886
episodes_test                            23.0
episode_length_test                425.434783
returns_test                       275.640332
return_std_test                    355.022366
average_reward_test                  0.643437
round_time_test        0 days 00:00:10.859390
round_time_total       0 days 00:06:27.912010
loss_total                         641.330267
loss_critic                        865.551716
loss_actor                        -255.555593
memory_size                        213064.095 

=== epoch 3/10 ===== round 22/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:28<00:00,  5.14it/s]
episodes                                   85
episode_length                     107.411765
returns                            -50.857258
return_std                         101.134427
average_reward                      -0.470372
round_time             0 days 00:06:29.412222
episodes_test                            26.0
episode_length_test                382.692308
returns_test                       224.910707
return_std_test                    312.903597
average_reward_test                  0.586143
round_time_test        0 days 00:00:11.039580
round_time_total       0 days 00:06:29.413329
loss_total                         645.371677
loss_critic                        870.565948
loss_actor                        -255.405468
memory_size                       214924.9435 

=== epoch 3/10 ===== round 23/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:30<00:00,  5.12it/s]
episodes                                   97
episode_length                      92.123711
returns                            -47.687578
return_std                          92.647428
average_reward                      -0.506164
round_time             0 days 00:06:30.970931
episodes_test                            28.0
episode_length_test                355.678571
returns_test                        223.36678
return_std_test                    305.079952
average_reward_test                  0.626506
round_time_test        0 days 00:00:10.903347
round_time_total       0 days 00:06:30.972426
loss_total                         648.209293
loss_critic                          874.0835
loss_actor                        -255.287595
memory_size                       216620.7755 

=== epoch 3/10 ===== round 24/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:30<00:00,  5.13it/s]
episodes                                   83
episode_length                     109.253012
returns                            -52.698001
return_std                          106.89982
average_reward                      -0.486079
round_time             0 days 00:06:30.758297
episodes_test                            31.0
episode_length_test                311.935484
returns_test                       175.196344
return_std_test                    307.926094
average_reward_test                  0.551955
round_time_test        0 days 00:00:10.819906
round_time_total       0 days 00:06:30.759506
loss_total                         652.732443
loss_critic                        879.800612
loss_actor                        -255.540293
memory_size                        218398.608 

=== epoch 3/10 ===== round 25/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:29<00:00,  5.14it/s]
episodes                                   67
episode_length                     134.731343
returns                            -64.725675
return_std                          123.70668
average_reward                      -0.481487
round_time             0 days 00:06:29.620200
episodes_test                            20.0
episode_length_test                     479.8
returns_test                       330.742053
return_std_test                    362.539297
average_reward_test                  0.673073
round_time_test        0 days 00:00:11.004082
round_time_total       0 days 00:06:29.621428
loss_total                         652.981285
loss_critic                        879.994818
loss_actor                        -255.072905
memory_size                        220092.688 

=== epoch 3/10 ===== round 26/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:32<00:00,  5.09it/s]
episodes                                   78
episode_length                     124.025641
returns                            -63.524326
return_std                          125.56563
average_reward                      -0.510887
round_time             0 days 00:06:33.428757
episodes_test                            24.0
episode_length_test                404.208333
returns_test                       249.543167
return_std_test                    333.502978
average_reward_test                  0.617849
round_time_test        0 days 00:00:10.784758
round_time_total       0 days 00:06:33.429893
loss_total                         655.273137
loss_critic                        882.971351
loss_actor                         -255.51978
memory_size                       221872.8105 

=== epoch 3/10 ===== round 27/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:33<00:00,  5.09it/s]
episodes                                   66
episode_length                     137.454545
returns                            -72.015057
return_std                         134.202623
average_reward                      -0.510559
round_time             0 days 00:06:33.743203
episodes_test                            23.0
episode_length_test                395.347826
returns_test                       220.963658
return_std_test                    315.950391
average_reward_test                  0.580724
round_time_test        0 days 00:00:10.663037
round_time_total       0 days 00:06:33.744321
loss_total                         655.848073
loss_critic                        883.466156
loss_actor                        -254.624315
memory_size                        223694.492 

=== epoch 3/10 ===== round 28/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:33<00:00,  5.08it/s]
episodes                                   66
episode_length                      138.69697
returns                            -65.433339
return_std                         133.089984
average_reward                      -0.469819
round_time             0 days 00:06:34.163460
episodes_test                            31.0
episode_length_test                316.129032
returns_test                       206.337619
return_std_test                    328.130311
average_reward_test                  0.645492
round_time_test        0 days 00:00:10.898479
round_time_total       0 days 00:06:34.164569
loss_total                         666.627971
loss_critic                        896.991969
loss_actor                        -254.828084
memory_size                       225457.2845 

=== epoch 3/10 ===== round 29/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:33<00:00,  5.08it/s]
episodes                                   82
episode_length                      112.47561
returns                            -54.541106
return_std                         112.540774
average_reward                      -0.481066
round_time             0 days 00:06:34.117296
episodes_test                            29.0
episode_length_test                344.275862
returns_test                        217.00623
return_std_test                    324.909155
average_reward_test                  0.629261
round_time_test        0 days 00:00:10.962917
round_time_total       0 days 00:06:34.118600
loss_total                         677.212006
loss_critic                        910.024356
loss_actor                        -254.037451
memory_size                        227089.964 

=== epoch 3/10 ===== round 30/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:36<00:00,  5.04it/s]
episodes                                   73
episode_length                     135.479452
returns                            -61.788917
return_std                         126.198517
average_reward                      -0.456485
round_time             0 days 00:06:37.483460
episodes_test                            31.0
episode_length_test                316.677419
returns_test                       200.160336
return_std_test                    325.422971
average_reward_test                   0.62458
round_time_test        0 days 00:00:10.851051
round_time_total       0 days 00:06:37.484557
loss_total                         680.284749
loss_critic                        913.825473
loss_actor                        -253.878212
memory_size                        228907.154 

=== epoch 3/10 ===== round 31/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.01it/s]
episodes                                   79
episode_length                     107.556962
returns                            -48.963221
return_std                           97.03127
average_reward                      -0.452009
round_time             0 days 00:06:39.605545
episodes_test                            21.0
episode_length_test                429.333333
returns_test                       297.374698
return_std_test                    369.591455
average_reward_test                   0.67407
round_time_test        0 days 00:00:11.040815
round_time_total       0 days 00:06:39.607044
loss_total                         697.968233
loss_critic                        935.541977
loss_actor                        -252.326804
memory_size                        230536.746 

=== epoch 3/10 ===== round 32/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:37<00:00,  5.03it/s]
episodes                                   84
episode_length                     117.428571
returns                            -52.759049
return_std                         101.065191
average_reward                      -0.450966
round_time             0 days 00:06:38.445779
episodes_test                            20.0
episode_length_test                    483.45
returns_test                       342.613257
return_std_test                    367.396249
average_reward_test                  0.710472
round_time_test        0 days 00:00:10.639964
round_time_total       0 days 00:06:38.446877
loss_total                         710.589388
loss_critic                        951.039984
loss_actor                        -251.213062
memory_size                        232292.699 

=== epoch 3/10 ===== round 33/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:36<00:00,  5.04it/s]
episodes                                   71
episode_length                     126.521127
returns                            -59.375257
return_std                         108.823425
average_reward                      -0.465088
round_time             0 days 00:06:37.546051
episodes_test                            27.0
episode_length_test                369.666667
returns_test                       233.358206
return_std_test                    313.540643
average_reward_test                  0.628785
round_time_test        0 days 00:00:10.754843
round_time_total       0 days 00:06:37.547536
loss_total                         690.548283
loss_critic                        926.299939
loss_actor                        -252.458405
memory_size                       234191.1425 

=== epoch 3/10 ===== round 34/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:38<00:00,  5.01it/s]
episodes                                   54
episode_length                     165.814815
returns                            -72.673003
return_std                         127.902414
average_reward                      -0.443002
round_time             0 days 00:06:39.365167
episodes_test                            31.0
episode_length_test                322.354839
returns_test                       185.459692
return_std_test                    274.272181
average_reward_test                  0.574697
round_time_test        0 days 00:00:10.745260
round_time_total       0 days 00:06:39.366311
loss_total                         684.913802
loss_critic                        919.493562
loss_actor                        -253.405302
memory_size                        236070.922 

=== epoch 3/10 ===== round 35/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.01it/s]
episodes                                   58
episode_length                     171.413793
returns                             -75.15763
return_std                         127.830944
average_reward                      -0.438217
round_time             0 days 00:06:39.790468
episodes_test                            37.0
episode_length_test                 269.72973
returns_test                       166.539288
return_std_test                    266.013016
average_reward_test                  0.616782
round_time_test        0 days 00:00:10.818371
round_time_total       0 days 00:06:39.791599
loss_total                         685.893146
loss_critic                        920.700891
loss_actor                        -253.337898
memory_size                       237896.7115 

=== epoch 3/10 ===== round 36/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:40<00:00,  4.99it/s]
episodes                                   47
episode_length                     196.680851
returns                            -83.068679
return_std                         142.097764
average_reward                       -0.41914
round_time             0 days 00:06:41.339508
episodes_test                            35.0
episode_length_test                258.514286
returns_test                       146.088922
return_std_test                    232.065437
average_reward_test                  0.564022
round_time_test        0 days 00:00:10.861384
round_time_total       0 days 00:06:41.340627
loss_total                         685.411649
loss_critic                        920.477418
loss_actor                        -254.851489
memory_size                        239667.515 

=== epoch 3/10 ===== round 37/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:38<00:00,  5.02it/s]
episodes                                   62
episode_length                     144.790323
returns                            -60.761628
return_std                         118.685859
average_reward                       -0.42359
round_time             0 days 00:06:38.700969
episodes_test                            33.0
episode_length_test                 302.30303
returns_test                       181.904942
return_std_test                    280.610226
average_reward_test                  0.599565
round_time_test        0 days 00:00:10.956014
round_time_total       0 days 00:06:38.702062
loss_total                         695.050539
loss_critic                        932.553692
loss_actor                        -254.962138
memory_size                        241443.189 

=== epoch 3/10 ===== round 38/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:41<00:00,  4.98it/s]
episodes                                   91
episode_length                     103.725275
returns                            -45.364763
return_std                          93.154427
average_reward                      -0.437372
round_time             0 days 00:06:42.471591
episodes_test                            21.0
episode_length_test                469.380952
returns_test                       319.821371
return_std_test                    358.509707
average_reward_test                  0.677833
round_time_test        0 days 00:00:10.910567
round_time_total       0 days 00:06:42.473043
loss_total                         714.844512
loss_critic                        956.998939
loss_actor                         -253.77326
memory_size                        242858.896 

=== epoch 3/10 ===== round 39/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:43<00:00,  4.95it/s]
episodes                                   95
episode_length                     101.526316
returns                            -45.511865
return_std                          93.333796
average_reward                      -0.449919
round_time             0 days 00:06:44.329644
episodes_test                            29.0
episode_length_test                314.275862
returns_test                       184.259083
return_std_test                    279.249854
average_reward_test                  0.609025
round_time_test        0 days 00:00:10.938771
round_time_total       0 days 00:06:44.330757
loss_total                         712.986093
loss_critic                        954.683054
loss_actor                         -253.80181
memory_size                        244599.743 

=== epoch 3/10 ===== round 40/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:41<00:00,  4.98it/s]
episodes                                   89
episode_length                      104.47191
returns                            -47.959889
return_std                          98.060901
average_reward                      -0.463603
round_time             0 days 00:06:42.604983
episodes_test                            20.0
episode_length_test                    497.95
returns_test                        342.99359
return_std_test                    329.040507
average_reward_test                  0.687913
round_time_test        0 days 00:00:10.650101
round_time_total       0 days 00:06:42.606099
loss_total                          717.15308
loss_critic                        959.711263
loss_actor                        -253.079714
memory_size                       246313.5545 

=== epoch 3/10 ===== round 41/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:43<00:00,  4.96it/s]
episodes                                   97
episode_length                      88.628866
returns                            -41.880366
return_std                          87.964039
average_reward                        -0.4615
round_time             0 days 00:06:44.052725
episodes_test                            24.0
episode_length_test                376.416667
returns_test                        252.62535
return_std_test                    336.042744
average_reward_test                  0.680317
round_time_test        0 days 00:00:10.873826
round_time_total       0 days 00:06:44.053824
loss_total                         720.864503
loss_critic                        964.524469
loss_actor                        -253.775424
memory_size                        248096.354 

=== epoch 3/10 ===== round 42/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:43<00:00,  4.96it/s]
episodes                                   87
episode_length                     113.114943
returns                            -51.649551
return_std                         106.553572
average_reward                      -0.458643
round_time             0 days 00:06:43.726584
episodes_test                            38.0
episode_length_test                243.815789
returns_test                       151.093105
return_std_test                    252.151638
average_reward_test                  0.637879
round_time_test        0 days 00:00:10.770389
round_time_total       0 days 00:06:43.727731
loss_total                         726.574051
loss_critic                        971.418821
loss_actor                        -252.805093
memory_size                        249805.395 

=== epoch 3/10 ===== round 43/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:42<00:00,  4.97it/s]
episodes                                   62
episode_length                     138.709677
returns                            -61.370771
return_std                         124.619035
average_reward                      -0.438163
round_time             0 days 00:06:43.130780
episodes_test                            26.0
episode_length_test                357.961538
returns_test                       236.597612
return_std_test                    324.338384
average_reward_test                  0.657998
round_time_test        0 days 00:00:10.869351
round_time_total       0 days 00:06:43.131897
loss_total                         716.882125
loss_critic                        959.507389
loss_actor                        -253.618994
memory_size                        251655.003 

=== epoch 3/10 ===== round 44/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:44<00:00,  4.94it/s]
episodes                                   56
episode_length                         163.25
returns                             -69.90673
return_std                         134.504346
average_reward                      -0.427073
round_time             0 days 00:06:45.497663
episodes_test                            24.0
episode_length_test                    409.25
returns_test                        261.68221
return_std_test                    330.701886
average_reward_test                  0.641556
round_time_test        0 days 00:00:11.071521
round_time_total       0 days 00:06:45.498765
loss_total                         706.921283
loss_critic                        947.442502
loss_actor                        -255.163653
memory_size                        253529.958 

=== epoch 3/10 ===== round 45/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:44<00:00,  4.94it/s]
episodes                                   62
episode_length                     155.887097
returns                            -68.225335
return_std                         127.617985
average_reward                      -0.436839
round_time             0 days 00:06:45.458454
episodes_test                            18.0
episode_length_test                507.444444
returns_test                       326.947485
return_std_test                    342.282285
average_reward_test                   0.64151
round_time_test        0 days 00:00:10.855868
round_time_total       0 days 00:06:45.459779
loss_total                         703.976492
loss_critic                         944.14047
loss_actor                        -256.679485
memory_size                       255370.7815 

=== epoch 3/10 ===== round 46/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:41<00:00,  4.98it/s]
episodes                                   68
episode_length                     145.264706
returns                            -62.560586
return_std                         122.980186
average_reward                      -0.430748
round_time             0 days 00:06:42.506959
episodes_test                            36.0
episode_length_test                251.444444
returns_test                       170.560063
return_std_test                    293.643069
average_reward_test                  0.692865
round_time_test        0 days 00:00:10.937712
round_time_total       0 days 00:06:42.508180
loss_total                          718.33439
loss_critic                        962.201185
loss_actor                        -257.132859
memory_size                       257100.0145 

=== epoch 3/10 ===== round 47/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:44<00:00,  4.94it/s]
episodes                                   70
episode_length                     117.285714
returns                            -49.542513
return_std                         101.384467
average_reward                      -0.421859
round_time             0 days 00:06:45.455920
episodes_test                            24.0
episode_length_test                401.208333
returns_test                       253.955859
return_std_test                    305.100865
average_reward_test                   0.63927
round_time_test        0 days 00:00:10.702199
round_time_total       0 days 00:06:45.457294
loss_total                         734.807519
loss_critic                        982.598774
loss_actor                        -256.357565
memory_size                       258616.2705 

=== epoch 3/10 ===== round 48/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:46<00:00,  4.92it/s]
episodes                                   73
episode_length                     128.452055
returns                            -53.455259
return_std                         106.603149
average_reward                      -0.422254
round_time             0 days 00:06:47.269296
episodes_test                            21.0
episode_length_test                454.761905
returns_test                       304.039667
return_std_test                    331.343748
average_reward_test                  0.669122
round_time_test        0 days 00:00:11.019645
round_time_total       0 days 00:06:47.270387
loss_total                         723.043796
loss_critic                        968.005549
loss_actor                        -256.803277
memory_size                        260426.905 

=== epoch 3/10 ===== round 49/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:44<00:00,  4.95it/s]
episodes                                   73
episode_length                     124.219178
returns                            -54.282279
return_std                         107.711403
average_reward                      -0.438261
round_time             0 days 00:06:44.880475
episodes_test                            17.0
episode_length_test                582.882353
returns_test                       401.836462
return_std_test                    351.538803
average_reward_test                  0.697053
round_time_test        0 days 00:00:11.081221
round_time_total       0 days 00:06:44.881584
loss_total                          733.92251
loss_critic                        981.676284
loss_actor                         -257.09265
memory_size                       262304.4245 

=== epoch 3/10 ===== round 50/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:46<00:00,  4.92it/s]
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
episodes                                   69
episode_length                     142.362319
returns                            -60.851012
return_std                          117.39947
average_reward                       -0.42701
round_time             0 days 00:06:47.149343
episodes_test                            23.0
episode_length_test                 402.73913
returns_test                       267.574261
return_std_test                     294.01523
average_reward_test                  0.665866
round_time_test        0 days 00:00:10.796466
round_time_total       0 days 00:06:47.150463
loss_total                         727.876168
loss_critic                        974.057357
loss_actor                        -256.848657
memory_size                       264195.1855 


<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
=== epoch 4/10 ===== round 1/50 ======================================
100%|██████████| 2000/2000 [06:01<00:00,  5.53it/s]
episodes                                   21
episode_length                       64.47619
returns                            -36.788777
return_std                          30.614919
average_reward                       -0.53001
round_time             0 days 00:06:01.710082
episodes_test                            27.0
episode_length_test                368.851852
returns_test                       264.530731
return_std_test                    349.827721
average_reward_test                  0.712373
round_time_test        0 days 00:00:10.854666
round_time_total       0 days 00:06:01.711251
loss_total                         751.603515
loss_critic                       1003.553034
loss_actor                        -256.194626
memory_size                        265819.264 

=== epoch 4/10 ===== round 2/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:03<00:00,  5.50it/s]
episodes                                   27
episode_length                     135.555556
returns                            -65.893053
return_std                         109.863757
average_reward                      -0.497125
round_time             0 days 00:06:04.249035
episodes_test                            30.0
episode_length_test                312.766667
returns_test                       209.888246
return_std_test                    286.941264
average_reward_test                  0.665001
round_time_test        0 days 00:00:10.857083
round_time_total       0 days 00:06:04.250282
loss_total                         745.207791
loss_critic                        995.464423
loss_actor                        -255.818803
memory_size                       267629.2625 

=== epoch 4/10 ===== round 3/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:07<00:00,  5.44it/s]
episodes                                   41
episode_length                     127.268293
returns                            -62.668849
return_std                         115.430157
average_reward                      -0.479156
round_time             0 days 00:06:08.542505
episodes_test                            32.0
episode_length_test                  292.3125
returns_test                        194.49336
return_std_test                    314.136708
average_reward_test                  0.661382
round_time_test        0 days 00:00:10.871236
round_time_total       0 days 00:06:08.543602
loss_total                          742.27442
loss_critic                        992.047064
loss_actor                        -256.816228
memory_size                       269458.1515 

=== epoch 4/10 ===== round 4/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:10<00:00,  5.39it/s]
episodes                                   65
episode_length                     113.046154
returns                            -53.537747
return_std                         104.078485
average_reward                      -0.465255
round_time             0 days 00:06:11.262737
episodes_test                            22.0
episode_length_test                443.909091
returns_test                       294.275418
return_std_test                    337.081962
average_reward_test                  0.664951
round_time_test        0 days 00:00:10.935058
round_time_total       0 days 00:06:11.263826
loss_total                         748.610426
loss_critic                        999.872182
loss_actor                        -256.436659
memory_size                       271110.0895 

=== epoch 4/10 ===== round 5/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:10<00:00,  5.40it/s]
episodes                                   73
episode_length                     131.671233
returns                            -58.566075
return_std                         111.328789
average_reward                      -0.444124
round_time             0 days 00:06:11.111331
episodes_test                            27.0
episode_length_test                     367.0
returns_test                       263.262805
return_std_test                    344.807922
average_reward_test                  0.717177
round_time_test        0 days 00:00:10.871289
round_time_total       0 days 00:06:11.112598
loss_total                         742.362527
loss_critic                        992.159825
loss_actor                        -256.826733
memory_size                        272862.962 

=== epoch 4/10 ===== round 6/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:12<00:00,  5.37it/s]
episodes                                   57
episode_length                     168.912281
returns                            -72.825896
return_std                          131.96509
average_reward                      -0.431712
round_time             0 days 00:06:13.034326
episodes_test                            45.0
episode_length_test                213.133333
returns_test                        124.12213
return_std_test                    232.142448
average_reward_test                  0.582222
round_time_test        0 days 00:00:10.751962
round_time_total       0 days 00:06:13.035428
loss_total                         742.952489
loss_critic                        993.088577
loss_actor                        -257.591931
memory_size                       274791.8215 

=== epoch 4/10 ===== round 7/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:13<00:00,  5.35it/s]
episodes                                   71
episode_length                     130.267606
returns                            -54.411916
return_std                          102.91608
average_reward                      -0.422291
round_time             0 days 00:06:14.294748
episodes_test                            40.0
episode_length_test                   245.525
returns_test                       159.387687
return_std_test                     275.45406
average_reward_test                  0.638949
round_time_test        0 days 00:00:10.830452
round_time_total       0 days 00:06:14.296126
loss_total                          749.21259
loss_critic                       1000.956403
loss_actor                        -257.762732
memory_size                        276569.083 

=== epoch 4/10 ===== round 8/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:15<00:00,  5.33it/s]
episodes                                   67
episode_length                     136.477612
returns                            -57.027134
return_std                         109.215155
average_reward                       -0.41595
round_time             0 days 00:06:15.740908
episodes_test                            23.0
episode_length_test                410.086957
returns_test                       242.255513
return_std_test                    288.326307
average_reward_test                  0.597975
round_time_test        0 days 00:00:10.846736
round_time_total       0 days 00:06:15.742024
loss_total                         757.028332
loss_critic                       1010.882397
loss_actor                        -258.388001
memory_size                        278307.897 

=== epoch 4/10 ===== round 9/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:25<00:00,  5.19it/s]
episodes                                   56
episode_length                     155.785714
returns                             -68.97299
return_std                         118.726884
average_reward                        -0.4304
round_time             0 days 00:06:26.225002
episodes_test                            22.0
episode_length_test                     417.5
returns_test                       260.729098
return_std_test                     314.28114
average_reward_test                  0.641777
round_time_test        0 days 00:00:10.821000
round_time_total       0 days 00:06:26.226211
loss_total                         756.268848
loss_critic                       1009.883355
loss_actor                        -258.189253
memory_size                        280071.488 

=== epoch 4/10 ===== round 10/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:23<00:00,  5.21it/s]
episodes                                   62
episode_length                     144.483871
returns                            -64.152162
return_std                         109.331992
average_reward                       -0.44559
round_time             0 days 00:06:24.276337
episodes_test                            21.0
episode_length_test                446.952381
returns_test                       322.177565
return_std_test                      358.1623
average_reward_test                   0.69837
round_time_test        0 days 00:00:10.938195
round_time_total       0 days 00:06:24.277557
loss_total                         760.589335
loss_critic                       1015.143739
loss_actor                        -257.628349
memory_size                        281821.891 

=== epoch 4/10 ===== round 11/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:44<00:00,  4.94it/s]
episodes                                   66
episode_length                     151.272727
returns                             -67.00398
return_std                         119.543195
average_reward                      -0.443373
round_time             0 days 00:06:45.361479
episodes_test                            36.0
episode_length_test                    269.25
returns_test                       173.046851
return_std_test                    270.712261
average_reward_test                  0.626793
round_time_test        0 days 00:00:10.831629
round_time_total       0 days 00:06:45.362582
loss_total                         755.955981
loss_critic                        1009.59176
loss_actor                        -258.587201
memory_size                       283681.5545 

=== epoch 4/10 ===== round 12/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:21<00:00,  5.24it/s]
episodes                                   50
episode_length                          163.3
returns                             -72.17171
return_std                         124.691901
average_reward                      -0.435694
round_time             0 days 00:06:22.115411
episodes_test                            20.0
episode_length_test                    498.65
returns_test                       319.243731
return_std_test                    322.561865
average_reward_test                  0.639735
round_time_test        0 days 00:00:10.889349
round_time_total       0 days 00:06:22.116566
loss_total                         756.516674
loss_critic                       1010.486891
loss_actor                        -259.364265
memory_size                       285529.5865 

=== epoch 4/10 ===== round 13/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:21<00:00,  5.25it/s]
episodes                                   46
episode_length                     195.543478
returns                            -84.491369
return_std                         136.489328
average_reward                      -0.429165
round_time             0 days 00:06:21.591526
episodes_test                            27.0
episode_length_test                 365.37037
returns_test                       262.622751
return_std_test                    325.012821
average_reward_test                   0.71723
round_time_test        0 days 00:00:10.984926
round_time_total       0 days 00:06:21.592679
loss_total                         743.585223
loss_critic                        994.486634
loss_actor                        -260.020487
memory_size                        287452.587 

=== epoch 4/10 ===== round 14/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:24<00:00,  5.21it/s]
episodes                                   44
episode_length                     221.409091
returns                            -94.643502
return_std                         149.722156
average_reward                      -0.431172
round_time             0 days 00:06:24.694440
episodes_test                            20.0
episode_length_test                    456.85
returns_test                       282.666593
return_std_test                    326.215713
average_reward_test                  0.617837
round_time_test        0 days 00:00:11.080255
round_time_total       0 days 00:06:24.695758
loss_total                           736.9445
loss_critic                        986.501889
loss_actor                        -261.285118
memory_size                       289301.8645 

=== epoch 4/10 ===== round 15/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:25<00:00,  5.19it/s]
episodes                                   37
episode_length                      254.27027
returns                           -110.972636
return_std                         165.652018
average_reward                      -0.435769
round_time             0 days 00:06:25.656240
episodes_test                            34.0
episode_length_test                264.735294
returns_test                       167.050018
return_std_test                    276.943238
average_reward_test                  0.653622
round_time_test        0 days 00:00:10.872310
round_time_total       0 days 00:06:25.657326
loss_total                         741.813802
loss_critic                         992.71803
loss_actor                        -261.803174
memory_size                       291183.3815 

=== epoch 4/10 ===== round 16/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:25<00:00,  5.19it/s]
episodes                                   31
episode_length                     291.870968
returns                           -124.037711
return_std                         172.718032
average_reward                      -0.428985
round_time             0 days 00:06:26.076164
episodes_test                            22.0
episode_length_test                432.136364
returns_test                        294.64155
return_std_test                    321.197631
average_reward_test                  0.691061
round_time_test        0 days 00:00:10.816362
round_time_total       0 days 00:06:26.077251
loss_total                         732.687649
loss_critic                        981.576257
loss_actor                        -262.866847
memory_size                        293055.383 

=== epoch 4/10 ===== round 17/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:28<00:00,  5.14it/s]
episodes                                   40
episode_length                          246.1
returns                            -111.34717
return_std                         160.368817
average_reward                      -0.450235
round_time             0 days 00:06:29.398005
episodes_test                            35.0
episode_length_test                260.057143
returns_test                       154.938532
return_std_test                    246.868278
average_reward_test                  0.615635
round_time_test        0 days 00:00:10.833311
round_time_total       0 days 00:06:29.399110
loss_total                         739.040457
loss_critic                        989.593291
loss_actor                        -263.170946
memory_size                       294905.8135 

=== epoch 4/10 ===== round 18/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:25<00:00,  5.18it/s]
episodes                                   58
episode_length                     164.810345
returns                            -75.451982
return_std                         133.957217
average_reward                       -0.45494
round_time             0 days 00:06:26.381424
episodes_test                            35.0
episode_length_test                268.342857
returns_test                       178.430762
return_std_test                    284.456836
average_reward_test                  0.675337
round_time_test        0 days 00:00:10.802637
round_time_total       0 days 00:06:26.382881
loss_total                         760.096281
loss_critic                       1015.872549
loss_actor                        -263.008861
memory_size                       296706.1405 

=== epoch 4/10 ===== round 19/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:27<00:00,  5.16it/s]
episodes                                   65
episode_length                     138.215385
returns                            -59.891883
return_std                          116.76001
average_reward                      -0.441654
round_time             0 days 00:06:28.574453
episodes_test                            18.0
episode_length_test                547.222222
returns_test                       351.159892
return_std_test                    297.071213
average_reward_test                  0.639665
round_time_test        0 days 00:00:10.896495
round_time_total       0 days 00:06:28.575544
loss_total                         758.192341
loss_critic                       1013.504438
loss_actor                        -263.056114
memory_size                       298198.6905 

=== epoch 4/10 ===== round 20/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:26<00:00,  5.17it/s]
episodes                                   72
episode_length                     130.041667
returns                            -55.432506
return_std                         112.756696
average_reward                      -0.435956
round_time             0 days 00:06:27.324150
episodes_test                            32.0
episode_length_test                 306.03125
returns_test                       222.102553
return_std_test                    315.901339
average_reward_test                  0.716954
round_time_test        0 days 00:00:10.718650
round_time_total       0 days 00:06:27.325457
loss_total                         757.133988
loss_critic                       1012.235869
loss_actor                        -263.273607
memory_size                       300003.2595 

=== epoch 4/10 ===== round 21/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:27<00:00,  5.16it/s]
episodes                                   82
episode_length                     120.463415
returns                            -54.169679
return_std                         110.487184
average_reward                      -0.448392
round_time             0 days 00:06:28.071275
episodes_test                            21.0
episode_length_test                441.666667
returns_test                        309.12811
return_std_test                    341.551337
average_reward_test                  0.696416
round_time_test        0 days 00:00:10.869909
round_time_total       0 days 00:06:28.072524
loss_total                          758.40693
loss_critic                       1013.700745
loss_actor                        -262.768406
memory_size                       301818.4095 

=== epoch 4/10 ===== round 22/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:29<00:00,  5.14it/s]
episodes                                   72
episode_length                     137.277778
returns                            -57.992596
return_std                         121.307436
average_reward                      -0.424951
round_time             0 days 00:06:29.756631
episodes_test                            21.0
episode_length_test                 474.52381
returns_test                       291.123499
return_std_test                     305.41351
average_reward_test                   0.61261
round_time_test        0 days 00:00:10.776717
round_time_total       0 days 00:06:29.757923
loss_total                         750.019119
loss_critic                       1003.484786
loss_actor                        -263.843611
memory_size                       303642.3215 

=== epoch 4/10 ===== round 23/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:32<00:00,  5.09it/s]
episodes                                   59
episode_length                     163.118644
returns                            -69.198674
return_std                         131.163775
average_reward                      -0.425158
round_time             0 days 00:06:33.315800
episodes_test                            15.0
episode_length_test                     642.6
returns_test                       417.063665
return_std_test                    296.103776
average_reward_test                  0.658058
round_time_test        0 days 00:00:10.875353
round_time_total       0 days 00:06:33.316919
loss_total                         751.913356
loss_critic                       1006.208504
loss_actor                        -265.267302
memory_size                       305502.7035 

=== epoch 4/10 ===== round 24/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:30<00:00,  5.12it/s]
episodes                                   54
episode_length                     174.703704
returns                            -74.578927
return_std                         135.717104
average_reward                      -0.424311
round_time             0 days 00:06:30.943820
episodes_test                            19.0
episode_length_test                489.842105
returns_test                       351.365693
return_std_test                    342.640595
average_reward_test                  0.729909
round_time_test        0 days 00:00:11.165698
round_time_total       0 days 00:06:30.944926
loss_total                         759.888863
loss_critic                        1016.08549
loss_actor                         -264.89772
memory_size                       307315.5985 

=== epoch 4/10 ===== round 25/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:32<00:00,  5.10it/s]
episodes                                   60
episode_length                     146.883333
returns                            -63.239355
return_std                         117.844897
average_reward                      -0.429813
round_time             0 days 00:06:33.055307
episodes_test                            28.0
episode_length_test                332.392857
returns_test                       211.712998
return_std_test                    268.105974
average_reward_test                  0.625508
round_time_test        0 days 00:00:10.840232
round_time_total       0 days 00:06:33.056446
loss_total                          758.02527
loss_critic                       1013.641265
loss_actor                        -264.438781
memory_size                       309021.1755 

=== epoch 4/10 ===== round 26/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:32<00:00,  5.09it/s]
episodes                                   62
episode_length                     144.741935
returns                             -60.18887
return_std                         119.836708
average_reward                      -0.412909
round_time             0 days 00:06:33.463880
episodes_test                            32.0
episode_length_test                  311.5625
returns_test                        208.16796
return_std_test                    290.766024
average_reward_test                  0.666902
round_time_test        0 days 00:00:10.721526
round_time_total       0 days 00:06:33.465065
loss_total                         781.786121
loss_critic                       1043.206421
loss_actor                        -263.895152
memory_size                        310756.571 

=== epoch 4/10 ===== round 27/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:34<00:00,  5.06it/s]
episodes                                   70
episode_length                     135.857143
returns                            -59.113518
return_std                         115.251328
average_reward                      -0.437731
round_time             0 days 00:06:35.552382
episodes_test                            38.0
episode_length_test                243.947368
returns_test                       161.784535
return_std_test                    257.982634
average_reward_test                  0.679912
round_time_test        0 days 00:00:10.890903
round_time_total       0 days 00:06:35.553466
loss_total                          771.42528
loss_critic                       1030.169094
loss_actor                        -263.550048
memory_size                        312465.216 

=== epoch 4/10 ===== round 28/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:36<00:00,  5.04it/s]
episodes                                   91
episode_length                     101.989011
returns                            -46.324119
return_std                          98.956763
average_reward                      -0.449822
round_time             0 days 00:06:37.026319
episodes_test                            35.0
episode_length_test                281.257143
returns_test                       185.855595
return_std_test                    281.451392
average_reward_test                  0.656793
round_time_test        0 days 00:00:10.902814
round_time_total       0 days 00:06:37.027544
loss_total                          782.24965
loss_critic                        1043.57563
loss_actor                        -263.054339
memory_size                        314233.168 

=== epoch 4/10 ===== round 29/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:37<00:00,  5.03it/s]
episodes                                   82
episode_length                     105.085366
returns                             -46.80449
return_std                         102.919213
average_reward                      -0.439839
round_time             0 days 00:06:38.180802
episodes_test                            27.0
episode_length_test                     368.0
returns_test                       254.542038
return_std_test                    346.944103
average_reward_test                   0.68433
round_time_test        0 days 00:00:10.907908
round_time_total       0 days 00:06:38.181903
loss_total                         781.061131
loss_critic                       1042.003977
loss_actor                        -262.710318
memory_size                        315860.301 

=== epoch 4/10 ===== round 30/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.01it/s]
episodes                                   94
episode_length                     102.234043
returns                            -45.329372
return_std                          93.590997
average_reward                      -0.442357
round_time             0 days 00:06:39.749001
episodes_test                            26.0
episode_length_test                362.576923
returns_test                       199.714772
return_std_test                    255.859025
average_reward_test                  0.541898
round_time_test        0 days 00:00:10.899642
round_time_total       0 days 00:06:39.750146
loss_total                         791.466417
loss_critic                        1055.02876
loss_actor                        -262.783022
memory_size                        317544.167 

=== epoch 4/10 ===== round 31/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:37<00:00,  5.03it/s]
episodes                                   85
episode_length                     117.494118
returns                            -52.119339
return_std                         104.216438
average_reward                      -0.443938
round_time             0 days 00:06:37.915157
episodes_test                            28.0
episode_length_test                326.678571
returns_test                        232.22856
return_std_test                    305.546777
average_reward_test                  0.724447
round_time_test        0 days 00:00:10.803224
round_time_total       0 days 00:06:37.916343
loss_total                         796.074962
loss_critic                       1060.595497
loss_actor                        -262.007253
memory_size                       319238.5885 

=== epoch 4/10 ===== round 32/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.01it/s]
episodes                                   85
episode_length                     103.364706
returns                            -43.080289
return_std                           89.80522
average_reward                      -0.415684
round_time             0 days 00:06:40.103387
episodes_test                            24.0
episode_length_test                388.958333
returns_test                       250.004665
return_std_test                    325.883846
average_reward_test                  0.635274
round_time_test        0 days 00:00:11.036117
round_time_total       0 days 00:06:40.104470
loss_total                         788.647542
loss_critic                       1051.399203
loss_actor                        -262.359171
memory_size                         321043.65 

=== epoch 4/10 ===== round 33/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:42<00:00,  4.97it/s]
episodes                                   64
episode_length                     136.359375
returns                            -54.323841
return_std                          108.99406
average_reward                      -0.404559
round_time             0 days 00:06:42.863775
episodes_test                            23.0
episode_length_test                429.043478
returns_test                       231.560215
return_std_test                    265.271384
average_reward_test                  0.538384
round_time_test        0 days 00:00:10.818793
round_time_total       0 days 00:06:42.864981
loss_total                          796.42358
loss_critic                        1060.98605
loss_actor                        -261.826374
memory_size                       322914.0265 

=== epoch 4/10 ===== round 34/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:40<00:00,  5.00it/s]
episodes                                   68
episode_length                         144.75
returns                            -60.590622
return_std                         122.042368
average_reward                      -0.415122
round_time             0 days 00:06:40.664886
episodes_test                            19.0
episode_length_test                517.526316
returns_test                       347.199741
return_std_test                    319.813182
average_reward_test                  0.668627
round_time_test        0 days 00:00:10.929802
round_time_total       0 days 00:06:40.665987
loss_total                         787.590524
loss_critic                       1050.301434
loss_actor                        -263.253187
memory_size                        324743.309 

=== epoch 4/10 ===== round 35/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:41<00:00,  4.98it/s]
episodes                                   42
episode_length                          199.0
returns                             -80.20046
return_std                         152.478324
average_reward                      -0.398464
round_time             0 days 00:06:41.767229
episodes_test                            28.0
episode_length_test                328.535714
returns_test                       198.891018
return_std_test                    253.782695
average_reward_test                  0.617507
round_time_test        0 days 00:00:10.936778
round_time_total       0 days 00:06:41.768477
loss_total                         790.055151
loss_critic                       1053.327397
loss_actor                        -263.033904
memory_size                       326574.4125 

=== epoch 4/10 ===== round 36/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:43<00:00,  4.96it/s]
episodes                                   54
episode_length                     184.685185
returns                            -75.585555
return_std                         143.533211
average_reward                      -0.408178
round_time             0 days 00:06:44.057295
episodes_test                            29.0
episode_length_test                333.034483
returns_test                       220.413184
return_std_test                    294.045207
average_reward_test                  0.655068
round_time_test        0 days 00:00:10.898126
round_time_total       0 days 00:06:44.058624
loss_total                         784.845955
loss_critic                       1046.974258
loss_actor                        -263.667331
memory_size                       328431.0615 

=== epoch 4/10 ===== round 37/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:42<00:00,  4.97it/s]
episodes                                   51
episode_length                     179.078431
returns                            -74.337005
return_std                         144.117986
average_reward                      -0.410756
round_time             0 days 00:06:43.097780
episodes_test                            20.0
episode_length_test                    459.95
returns_test                       360.382498
return_std_test                    386.577078
average_reward_test                  0.790131
round_time_test        0 days 00:00:10.937827
round_time_total       0 days 00:06:43.098901
loss_total                         785.661085
loss_critic                       1048.094878
loss_actor                        -264.074161
memory_size                       330190.3065 

=== epoch 4/10 ===== round 38/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:41<00:00,  4.99it/s]
episodes                                   55
episode_length                     153.345455
returns                            -61.784724
return_std                         125.534523
average_reward                      -0.409693
round_time             0 days 00:06:41.684064
episodes_test                            19.0
episode_length_test                516.210526
returns_test                       331.260551
return_std_test                    334.341724
average_reward_test                  0.636081
round_time_test        0 days 00:00:10.909104
round_time_total       0 days 00:06:41.685162
loss_total                         788.829774
loss_critic                       1052.170198
loss_actor                        -264.531993
memory_size                        331985.586 

=== epoch 4/10 ===== round 39/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:40<00:00,  4.99it/s]
episodes                                   73
episode_length                      127.90411
returns                            -52.615506
return_std                         108.233544
average_reward                      -0.400561
round_time             0 days 00:06:40.931968
episodes_test                            19.0
episode_length_test                524.842105
returns_test                       360.945125
return_std_test                    355.675475
average_reward_test                  0.689542
round_time_test        0 days 00:00:10.983980
round_time_total       0 days 00:06:40.933072
loss_total                         798.618917
loss_critic                       1064.345655
loss_actor                        -264.288106
memory_size                       333616.8285 

=== epoch 4/10 ===== round 40/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:43<00:00,  4.96it/s]
episodes                                   76
episode_length                     127.197368
returns                            -49.474484
return_std                         100.997258
average_reward                      -0.390102
round_time             0 days 00:06:44.072502
episodes_test                            19.0
episode_length_test                514.894737
returns_test                       359.165181
return_std_test                    354.251336
average_reward_test                  0.701382
round_time_test        0 days 00:00:10.951580
round_time_total       0 days 00:06:44.074034
loss_total                         799.875887
loss_critic                       1065.961307
loss_actor                        -264.465861
memory_size                        335385.246 

=== epoch 4/10 ===== round 41/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:45<00:00,  4.94it/s]
episodes                                   64
episode_length                     142.453125
returns                            -53.990733
return_std                         109.509653
average_reward                      -0.382179
round_time             0 days 00:06:45.779901
episodes_test                            30.0
episode_length_test                328.966667
returns_test                       204.440952
return_std_test                    295.279861
average_reward_test                  0.623305
round_time_test        0 days 00:00:10.838436
round_time_total       0 days 00:06:45.781268
loss_total                         793.565626
loss_critic                       1058.162115
loss_actor                        -264.820399
memory_size                        337203.672 

=== epoch 4/10 ===== round 42/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:41<00:00,  4.98it/s]
episodes                                   94
episode_length                     101.319149
returns                            -39.799315
return_std                          87.294743
average_reward                      -0.390494
round_time             0 days 00:06:42.517112
episodes_test                            28.0
episode_length_test                339.285714
returns_test                       222.461263
return_std_test                    320.614874
average_reward_test                  0.638893
round_time_test        0 days 00:00:10.889870
round_time_total       0 days 00:06:42.518244
loss_total                         806.558001
loss_critic                       1074.493534
loss_actor                        -265.184201
memory_size                        338893.662 

=== epoch 4/10 ===== round 43/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:44<00:00,  4.94it/s]
episodes                                   92
episode_length                     101.380435
returns                            -38.557471
return_std                          86.452843
average_reward                      -0.379623
round_time             0 days 00:06:45.239157
episodes_test                            17.0
episode_length_test                531.117647
returns_test                       411.281914
return_std_test                    361.720739
average_reward_test                  0.754049
round_time_test        0 days 00:00:10.971732
round_time_total       0 days 00:06:45.240334
loss_total                         809.569989
loss_critic                       1078.282044
loss_actor                        -265.278306
memory_size                       340478.0175 

=== epoch 4/10 ===== round 44/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:43<00:00,  4.96it/s]
episodes                                   68
episode_length                     133.044118
returns                            -49.975285
return_std                         109.965214
average_reward                      -0.385727
round_time             0 days 00:06:43.833339
episodes_test                            30.0
episode_length_test                328.066667
returns_test                       209.912114
return_std_test                     293.45886
average_reward_test                  0.640115
round_time_test        0 days 00:00:11.057693
round_time_total       0 days 00:06:43.834441
loss_total                         803.677804
loss_critic                       1071.188723
loss_actor                        -266.365944
memory_size                        342342.835 

=== epoch 4/10 ===== round 45/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:45<00:00,  4.94it/s]
episodes                                   70
episode_length                          139.5
returns                            -55.691872
return_std                         121.311603
average_reward                      -0.399308
round_time             0 days 00:06:45.705458
episodes_test                            24.0
episode_length_test                   414.875
returns_test                       308.379405
return_std_test                    342.458652
average_reward_test                  0.740281
round_time_test        0 days 00:00:10.805676
round_time_total       0 days 00:06:45.706905
loss_total                         797.032564
loss_critic                       1063.032877
loss_actor                        -266.968751
memory_size                       344220.8825 

=== epoch 4/10 ===== round 46/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:45<00:00,  4.93it/s]
episodes                                   84
episode_length                     112.428571
returns                            -43.589097
return_std                         103.390139
average_reward                       -0.38932
round_time             0 days 00:06:45.898430
episodes_test                            30.0
episode_length_test                308.533333
returns_test                       185.699877
return_std_test                    227.671474
average_reward_test                  0.614642
round_time_test        0 days 00:00:10.743583
round_time_total       0 days 00:06:45.899783
loss_total                         782.393558
loss_critic                       1044.851859
loss_actor                        -267.439713
memory_size                       345985.1375 

=== epoch 4/10 ===== round 47/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:41<00:00,  4.98it/s]
episodes                                   57
episode_length                     174.350877
returns                            -67.401425
return_std                         137.047038
average_reward                      -0.389268
round_time             0 days 00:06:42.259908
episodes_test                            20.0
episode_length_test                     471.2
returns_test                       289.418671
return_std_test                     305.84265
average_reward_test                  0.625056
round_time_test        0 days 00:00:11.159880
round_time_total       0 days 00:06:42.261396
loss_total                          807.33664
loss_critic                       1075.985725
loss_actor                        -267.259775
memory_size                        347706.621 

=== epoch 4/10 ===== round 48/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:46<00:00,  4.92it/s]
episodes                                   63
episode_length                     151.968254
returns                            -60.893087
return_std                         125.908381
average_reward                       -0.39856
round_time             0 days 00:06:46.710412
episodes_test                            20.0
episode_length_test                    472.35
returns_test                       342.630802
return_std_test                     342.90774
average_reward_test                  0.730635
round_time_test        0 days 00:00:10.504038
round_time_total       0 days 00:06:46.711525
loss_total                         801.497218
loss_critic                       1068.552116
loss_actor                        -266.722441
memory_size                       349422.7755 

=== epoch 4/10 ===== round 49/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:44<00:00,  4.95it/s]
episodes                                   72
episode_length                     126.055556
returns                            -49.578282
return_std                         108.970857
average_reward                      -0.394115
round_time             0 days 00:06:44.938029
episodes_test                            21.0
episode_length_test                     449.0
returns_test                        337.94011
return_std_test                    361.250689
average_reward_test                  0.756966
round_time_test        0 days 00:00:10.853918
round_time_total       0 days 00:06:44.939270
loss_total                         812.149507
loss_critic                         1081.7064
loss_actor                        -266.078138
memory_size                        351160.051 

=== epoch 4/10 ===== round 50/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:45<00:00,  4.94it/s]
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
episodes                                   85
episode_length                     102.870588
returns                            -40.866536
return_std                          90.383579
average_reward                      -0.405082
round_time             0 days 00:06:45.763130
episodes_test                            15.0
episode_length_test                637.866667
returns_test                       395.355362
return_std_test                    355.725424
average_reward_test                  0.604361
round_time_test        0 days 00:00:10.790126
round_time_total       0 days 00:06:45.764232
loss_total                         815.464495
loss_critic                       1086.085012
loss_actor                        -267.017646
memory_size                        352880.392 


<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
=== epoch 5/10 ===== round 1/50 ======================================
100%|██████████| 2000/2000 [06:02<00:00,  5.51it/s]
episodes                                   32
episode_length                       50.15625
returns                            -17.016232
return_std                          24.114044
average_reward                      -0.338518
round_time             0 days 00:06:02.884781
episodes_test                            23.0
episode_length_test                430.130435
returns_test                       283.428785
return_std_test                    338.450359
average_reward_test                   0.65779
round_time_test        0 days 00:00:10.862539
round_time_total       0 days 00:06:02.886135
loss_total                         823.980798
loss_critic                       1096.674109
loss_actor                        -266.792519
memory_size                        354397.274 

=== epoch 5/10 ===== round 2/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:02<00:00,  5.51it/s]
episodes                                   41
episode_length                      73.439024
returns                            -27.597533
return_std                          65.557518
average_reward                      -0.385064
round_time             0 days 00:06:03.452451
episodes_test                            28.0
episode_length_test                     354.0
returns_test                        228.98732
return_std_test                    290.196602
average_reward_test                  0.646942
round_time_test        0 days 00:00:10.781145
round_time_total       0 days 00:06:03.453757
loss_total                         826.286044
loss_critic                       1099.654762
loss_actor                        -267.188902
memory_size                       356078.7585 

=== epoch 5/10 ===== round 3/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:06<00:00,  5.45it/s]
episodes                                   49
episode_length                     106.306122
returns                            -42.565593
return_std                         103.199043
average_reward                      -0.408952
round_time             0 days 00:06:07.364754
episodes_test                            20.0
episode_length_test                     456.9
returns_test                       325.630593
return_std_test                    308.740521
average_reward_test                  0.732486
round_time_test        0 days 00:00:10.693195
round_time_total       0 days 00:06:07.366174
loss_total                         821.494058
loss_critic                       1093.759405
loss_actor                        -267.567401
memory_size                        357917.939 

=== epoch 5/10 ===== round 4/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:10<00:00,  5.40it/s]
episodes                                   64
episode_length                      122.46875
returns                            -50.964255
return_std                          113.41938
average_reward                         -0.419
round_time             0 days 00:06:10.798439
episodes_test                            15.0
episode_length_test                     629.2
returns_test                       321.415124
return_std_test                    272.858184
average_reward_test                  0.494693
round_time_test        0 days 00:00:10.882660
round_time_total       0 days 00:06:10.800019
loss_total                         816.283815
loss_critic                       1087.279684
loss_actor                        -267.699731
memory_size                        359749.003 

=== epoch 5/10 ===== round 5/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:11<00:00,  5.38it/s]
episodes                                   70
episode_length                          142.6
returns                            -60.589319
return_std                         125.433572
average_reward                      -0.424659
round_time             0 days 00:06:12.083382
episodes_test                            19.0
episode_length_test                499.736842
returns_test                       348.909547
return_std_test                    341.701039
average_reward_test                  0.707034
round_time_test        0 days 00:00:10.873938
round_time_total       0 days 00:06:12.084479
loss_total                         808.174939
loss_critic                       1077.124984
loss_actor                        -267.625313
memory_size                       361594.9895 

=== epoch 5/10 ===== round 6/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:12<00:00,  5.36it/s]
episodes                                   54
episode_length                     158.203704
returns                            -70.323006
return_std                         131.186248
average_reward                        -0.4413
round_time             0 days 00:06:13.408237
episodes_test                            27.0
episode_length_test                333.444444
returns_test                       267.167656
return_std_test                    349.220143
average_reward_test                  0.809857
round_time_test        0 days 00:00:10.864000
round_time_total       0 days 00:06:13.409392
loss_total                         827.462259
loss_critic                       1101.299378
loss_actor                        -267.886292
memory_size                        363320.503 

=== epoch 5/10 ===== round 7/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:13<00:00,  5.35it/s]
episodes                                   51
episode_length                     181.431373
returns                            -80.418258
return_std                         139.977927
average_reward                      -0.443296
round_time             0 days 00:06:14.137851
episodes_test                            23.0
episode_length_test                434.217391
returns_test                       264.640568
return_std_test                    304.486057
average_reward_test                  0.609566
round_time_test        0 days 00:00:10.947369
round_time_total       0 days 00:06:14.138940
loss_total                         822.613433
loss_critic                        1095.11138
loss_actor                        -267.378426
memory_size                        365163.905 

=== epoch 5/10 ===== round 8/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:18<00:00,  5.29it/s]
episodes                                   53
episode_length                     183.886792
returns                            -80.651747
return_std                         131.918751
average_reward                      -0.439004
round_time             0 days 00:06:18.577872
episodes_test                            19.0
episode_length_test                524.263158
returns_test                       338.046214
return_std_test                    308.100832
average_reward_test                  0.645279
round_time_test        0 days 00:00:10.939433
round_time_total       0 days 00:06:18.579062
loss_total                         813.608757
loss_critic                       1083.907963
loss_actor                        -267.588134
memory_size                         367074.15 

=== epoch 5/10 ===== round 9/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:18<00:00,  5.29it/s]
episodes                                   63
episode_length                     145.349206
returns                            -60.217879
return_std                         114.178379
average_reward                      -0.414046
round_time             0 days 00:06:18.835808
episodes_test                            24.0
episode_length_test                409.708333
returns_test                       321.663039
return_std_test                    349.612618
average_reward_test                  0.778108
round_time_test        0 days 00:00:10.776569
round_time_total       0 days 00:06:18.836891
loss_total                         828.654981
loss_critic                       1102.556872
loss_actor                        -266.952662
memory_size                       368817.1745 

=== epoch 5/10 ===== round 10/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:17<00:00,  5.30it/s]
episodes                                   59
episode_length                     151.610169
returns                            -61.599665
return_std                         120.549319
average_reward                      -0.418666
round_time             0 days 00:06:17.802723
episodes_test                            25.0
episode_length_test                    377.48
returns_test                       278.418696
return_std_test                    332.790432
average_reward_test                  0.745046
round_time_test        0 days 00:00:10.751074
round_time_total       0 days 00:06:17.803927
loss_total                         823.788173
loss_critic                         1096.5499
loss_actor                        -267.258809
memory_size                        370535.351 

=== epoch 5/10 ===== round 11/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:19<00:00,  5.27it/s]
episodes                                   56
episode_length                     175.035714
returns                             -77.51153
return_std                         137.637652
average_reward                      -0.438501
round_time             0 days 00:06:20.313644
episodes_test                            31.0
episode_length_test                299.193548
returns_test                       213.354272
return_std_test                    307.995207
average_reward_test                  0.721686
round_time_test        0 days 00:00:10.826660
round_time_total       0 days 00:06:20.314822
loss_total                         823.641875
loss_critic                       1096.448811
loss_actor                        -267.585944
memory_size                        372422.581 

=== epoch 5/10 ===== round 12/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:22<00:00,  5.22it/s]
episodes                                   63
episode_length                     147.031746
returns                            -64.080027
return_std                         121.634107
average_reward                      -0.437295
round_time             0 days 00:06:23.541488
episodes_test                            20.0
episode_length_test                     497.6
returns_test                        375.61741
return_std_test                    376.428012
average_reward_test                  0.754556
round_time_test        0 days 00:00:10.984891
round_time_total       0 days 00:06:23.542935
loss_total                         839.346646
loss_critic                       1116.110446
loss_actor                        -267.708632
memory_size                        374167.566 

=== epoch 5/10 ===== round 13/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:23<00:00,  5.22it/s]
episodes                                   58
episode_length                     165.362069
returns                            -71.647303
return_std                         136.539693
average_reward                      -0.431847
round_time             0 days 00:06:23.817580
episodes_test                            16.0
episode_length_test                  583.9375
returns_test                       400.313644
return_std_test                    323.996546
average_reward_test                  0.689266
round_time_test        0 days 00:00:10.947744
round_time_total       0 days 00:06:23.818676
loss_total                         825.608648
loss_critic                       1098.879023
loss_actor                        -267.472927
memory_size                        376032.497 

=== epoch 5/10 ===== round 14/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:24<00:00,  5.20it/s]
episodes                                   60
episode_length                     149.183333
returns                            -66.746495
return_std                         124.355095
average_reward                      -0.446564
round_time             0 days 00:06:24.895446
episodes_test                            23.0
episode_length_test                394.304348
returns_test                       252.292856
return_std_test                    283.435147
average_reward_test                  0.643629
round_time_test        0 days 00:00:10.807333
round_time_total       0 days 00:06:24.896542
loss_total                          845.78347
loss_critic                       1124.117197
loss_actor                         -267.55151
memory_size                        377815.294 

=== epoch 5/10 ===== round 15/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:20<00:00,  5.26it/s]
episodes                                   67
episode_length                     138.746269
returns                            -60.618601
return_std                         114.947594
average_reward                      -0.437751
round_time             0 days 00:06:21.175828
episodes_test                            25.0
episode_length_test                    390.16
returns_test                       258.915715
return_std_test                    310.361448
average_reward_test                  0.669145
round_time_test        0 days 00:00:10.872895
round_time_total       0 days 00:06:21.176943
loss_total                         841.360309
loss_critic                       1118.405282
loss_actor                        -266.819669
memory_size                        379409.081 

=== epoch 5/10 ===== round 16/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:26<00:00,  5.18it/s]
episodes                                   91
episode_length                     109.736264
returns                            -44.046526
return_std                          98.285848
average_reward                      -0.401492
round_time             0 days 00:06:26.824028
episodes_test                            19.0
episode_length_test                485.526316
returns_test                       383.857463
return_std_test                    381.561057
average_reward_test                  0.794262
round_time_test        0 days 00:00:10.931522
round_time_total       0 days 00:06:26.825496
loss_total                         847.593079
loss_critic                       1126.136783
loss_actor                        -266.581823
memory_size                       381130.0915 

=== epoch 5/10 ===== round 17/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:23<00:00,  5.22it/s]
episodes                                   82
episode_length                      105.04878
returns                            -41.835768
return_std                          95.308137
average_reward                      -0.389667
round_time             0 days 00:06:23.848181
episodes_test                            27.0
episode_length_test                352.777778
returns_test                       265.624931
return_std_test                    329.974063
average_reward_test                  0.750901
round_time_test        0 days 00:00:10.975373
round_time_total       0 days 00:06:23.849627
loss_total                         852.590388
loss_critic                       1132.350295
loss_actor                        -266.449316
memory_size                         382710.26 

=== epoch 5/10 ===== round 18/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:25<00:00,  5.18it/s]
episodes                                   92
episode_length                     100.619565
returns                            -37.872983
return_std                          84.078858
average_reward                      -0.378446
round_time             0 days 00:06:26.415819
episodes_test                            22.0
episode_length_test                     443.5
returns_test                       316.743447
return_std_test                    331.375877
average_reward_test                  0.708914
round_time_test        0 days 00:00:10.763161
round_time_total       0 days 00:06:26.416908
loss_total                         837.378355
loss_critic                        1113.36434
loss_actor                         -266.56566
memory_size                       384602.4035 

=== epoch 5/10 ===== round 19/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:27<00:00,  5.16it/s]
episodes                                   77
episode_length                     121.792208
returns                            -47.250527
return_std                          98.793273
average_reward                      -0.389344
round_time             0 days 00:06:28.313648
episodes_test                            26.0
episode_length_test                364.576923
returns_test                       260.192789
return_std_test                    345.295325
average_reward_test                  0.712395
round_time_test        0 days 00:00:10.703757
round_time_total       0 days 00:06:28.314744
loss_total                         845.331644
loss_critic                       1123.425223
loss_actor                        -267.042747
memory_size                       386358.0565 

=== epoch 5/10 ===== round 20/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:25<00:00,  5.19it/s]
episodes                                   82
episode_length                     117.439024
returns                            -44.035204
return_std                          92.543248
average_reward                      -0.378107
round_time             0 days 00:06:25.789276
episodes_test                            21.0
episode_length_test                443.142857
returns_test                       318.231715
return_std_test                    347.289222
average_reward_test                    0.7255
round_time_test        0 days 00:00:10.843785
round_time_total       0 days 00:06:25.790378
loss_total                         840.810539
loss_critic                        1117.87079
loss_actor                        -267.430543
memory_size                       388167.0975 

=== epoch 5/10 ===== round 21/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:28<00:00,  5.15it/s]
episodes                                   59
episode_length                      160.59322
returns                            -62.717465
return_std                         115.740494
average_reward                      -0.385517
round_time             0 days 00:06:28.604665
episodes_test                            29.0
episode_length_test                341.931034
returns_test                       260.358516
return_std_test                     337.39136
average_reward_test                  0.762335
round_time_test        0 days 00:00:10.763551
round_time_total       0 days 00:06:28.605767
loss_total                         845.511635
loss_critic                       1123.657064
loss_actor                        -267.070154
memory_size                        389904.457 

=== epoch 5/10 ===== round 22/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:30<00:00,  5.12it/s]
episodes                                   69
episode_length                     142.536232
returns                            -56.630575
return_std                         109.442016
average_reward                      -0.396784
round_time             0 days 00:06:30.960074
episodes_test                            17.0
episode_length_test                567.294118
returns_test                       451.219677
return_std_test                    349.398495
average_reward_test                  0.801276
round_time_test        0 days 00:00:10.934318
round_time_total       0 days 00:06:30.961375
loss_total                         867.990643
loss_critic                       1151.720191
loss_actor                        -266.927626
memory_size                       391683.5125 

=== epoch 5/10 ===== round 23/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:27<00:00,  5.16it/s]
episodes                                   78
episode_length                     117.179487
returns                            -48.701336
return_std                         102.917456
average_reward                      -0.412617
round_time             0 days 00:06:28.451457
episodes_test                            24.0
episode_length_test                404.708333
returns_test                       293.703595
return_std_test                     333.76748
average_reward_test                  0.725528
round_time_test        0 days 00:00:11.029814
round_time_total       0 days 00:06:28.452559
loss_total                          840.69287
loss_critic                       1117.754728
loss_actor                        -267.554634
memory_size                       393383.4645 

=== epoch 5/10 ===== round 24/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:30<00:00,  5.13it/s]
episodes                                   79
episode_length                     114.924051
returns                            -46.424702
return_std                         104.037631
average_reward                      -0.400133
round_time             0 days 00:06:30.815367
episodes_test                            17.0
episode_length_test                583.529412
returns_test                       467.092915
return_std_test                    386.738152
average_reward_test                  0.795952
round_time_test        0 days 00:00:10.760037
round_time_total       0 days 00:06:30.816788
loss_total                         854.688336
loss_critic                       1135.338577
loss_actor                        -267.912712
memory_size                        394995.843 

=== epoch 5/10 ===== round 25/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:32<00:00,  5.10it/s]
episodes                                   77
episode_length                     118.987013
returns                            -49.798308
return_std                         108.295315
average_reward                      -0.411028
round_time             0 days 00:06:33.098444
episodes_test                            27.0
episode_length_test                 334.62963
returns_test                       232.917713
return_std_test                    316.099104
average_reward_test                   0.70707
round_time_test        0 days 00:00:10.767808
round_time_total       0 days 00:06:33.099551
loss_total                         855.535169
loss_critic                       1136.421104
loss_actor                        -268.008649
memory_size                       396803.4425 

=== epoch 5/10 ===== round 26/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:31<00:00,  5.11it/s]
episodes                                   83
episode_length                     113.963855
returns                            -46.693402
return_std                          105.15664
average_reward                      -0.409724
round_time             0 days 00:06:31.899614
episodes_test                            25.0
episode_length_test                    392.76
returns_test                       274.518969
return_std_test                    292.589251
average_reward_test                  0.700042
round_time_test        0 days 00:00:10.741654
round_time_total       0 days 00:06:31.900983
loss_total                         871.425148
loss_critic                       1156.362442
loss_actor                        -268.324097
memory_size                        398592.722 

=== epoch 5/10 ===== round 27/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:33<00:00,  5.08it/s]
episodes                                   95
episode_length                     104.052632
returns                            -43.791377
return_std                          98.213947
average_reward                      -0.419418
round_time             0 days 00:06:33.981512
episodes_test                            25.0
episode_length_test                     383.0
returns_test                       267.655135
return_std_test                    308.143097
average_reward_test                  0.706506
round_time_test        0 days 00:00:10.691025
round_time_total       0 days 00:06:33.982668
loss_total                         873.845972
loss_critic                       1159.213016
loss_actor                        -267.622281
memory_size                        400252.905 

=== epoch 5/10 ===== round 28/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:33<00:00,  5.08it/s]
episodes                                   82
episode_length                     116.682927
returns                            -47.391959
return_std                         100.435943
average_reward                      -0.408246
round_time             0 days 00:06:34.106541
episodes_test                            18.0
episode_length_test                534.888889
returns_test                       387.674164
return_std_test                    323.611425
average_reward_test                  0.722876
round_time_test        0 days 00:00:10.881809
round_time_total       0 days 00:06:34.107857
loss_total                         883.286253
loss_critic                        1171.08917
loss_actor                        -267.925494
memory_size                       401811.5525 

=== epoch 5/10 ===== round 29/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:34<00:00,  5.07it/s]
episodes                                   78
episode_length                     122.192308
returns                            -51.750242
return_std                         105.974584
average_reward                      -0.426193
round_time             0 days 00:06:34.995120
episodes_test                            32.0
episode_length_test                   312.375
returns_test                       234.967881
return_std_test                    285.100613
average_reward_test                   0.75186
round_time_test        0 days 00:00:10.871048
round_time_total       0 days 00:06:34.996256
loss_total                         860.952534
loss_critic                        1143.16596
loss_actor                        -267.901245
memory_size                        403695.192 

=== epoch 5/10 ===== round 30/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:36<00:00,  5.04it/s]
episodes                                   75
episode_length                     125.746667
returns                            -51.931844
return_std                         106.024903
average_reward                      -0.411065
round_time             0 days 00:06:37.430583
episodes_test                            14.0
episode_length_test                675.642857
returns_test                       536.269105
return_std_test                    374.898817
average_reward_test                  0.792953
round_time_test        0 days 00:00:10.986976
round_time_total       0 days 00:06:37.431688
loss_total                         855.009761
loss_critic                       1135.918716
loss_actor                        -268.626133
memory_size                       405542.8185 

=== epoch 5/10 ===== round 31/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:35<00:00,  5.05it/s]
episodes                                   64
episode_length                      140.71875
returns                            -59.986929
return_std                         112.389544
average_reward                      -0.419826
round_time             0 days 00:06:36.352366
episodes_test                            15.0
episode_length_test                659.266667
returns_test                         554.4905
return_std_test                    375.844884
average_reward_test                  0.840856
round_time_test        0 days 00:00:10.986938
round_time_total       0 days 00:06:36.353478
loss_total                         867.288982
loss_critic                       1151.337684
loss_actor                        -268.905902
memory_size                       407412.2795 

=== epoch 5/10 ===== round 32/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:36<00:00,  5.04it/s]
episodes                                   56
episode_length                     176.303571
returns                            -70.413819
return_std                         127.093904
average_reward                      -0.398776
round_time             0 days 00:06:37.485115
episodes_test                            20.0
episode_length_test                    469.45
returns_test                       360.689611
return_std_test                    358.903113
average_reward_test                  0.757186
round_time_test        0 days 00:00:11.145084
round_time_total       0 days 00:06:37.486403
loss_total                         876.763438
loss_critic                       1163.329395
loss_actor                         -269.50047
memory_size                       409154.5045 

=== epoch 5/10 ===== round 33/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:38<00:00,  5.02it/s]
episodes                                   54
episode_length                     165.777778
returns                            -66.911676
return_std                         118.354469
average_reward                      -0.407042
round_time             0 days 00:06:38.904440
episodes_test                            20.0
episode_length_test                    455.95
returns_test                       333.946298
return_std_test                    361.612012
average_reward_test                  0.707723
round_time_test        0 days 00:00:10.804375
round_time_total       0 days 00:06:38.905538
loss_total                         874.855418
loss_critic                       1160.860864
loss_actor                        -269.166455
memory_size                        410923.113 

=== epoch 5/10 ===== round 34/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:38<00:00,  5.02it/s]
episodes                                   60
episode_length                          151.1
returns                            -60.571768
return_std                         117.563202
average_reward                      -0.395016
round_time             0 days 00:06:39.231953
episodes_test                            24.0
episode_length_test                388.833333
returns_test                       277.322229
return_std_test                    311.970177
average_reward_test                  0.691295
round_time_test        0 days 00:00:10.751233
round_time_total       0 days 00:06:39.233031
loss_total                         869.009921
loss_critic                       1153.482447
loss_actor                         -268.88026
memory_size                        412712.204 

=== epoch 5/10 ===== round 35/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.01it/s]
episodes                                   65
episode_length                     136.769231
returns                             -52.81096
return_std                         112.358959
average_reward                       -0.38854
round_time             0 days 00:06:40.118519
episodes_test                            25.0
episode_length_test                     399.2
returns_test                       308.404585
return_std_test                    362.495542
average_reward_test                  0.772495
round_time_test        0 days 00:00:10.939586
round_time_total       0 days 00:06:40.119910
loss_total                         870.368297
loss_critic                       1155.146304
loss_actor                        -268.743814
memory_size                        414407.194 

=== epoch 5/10 ===== round 36/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.00it/s]
episodes                                   70
episode_length                     141.485714
returns                            -50.541378
return_std                         110.302828
average_reward                      -0.359318
round_time             0 days 00:06:40.260582
episodes_test                            16.0
episode_length_test                   615.125
returns_test                       401.618782
return_std_test                    308.359691
average_reward_test                  0.653677
round_time_test        0 days 00:00:10.822378
round_time_total       0 days 00:06:40.261676
loss_total                          866.39191
loss_critic                       1150.379696
loss_actor                        -269.559311
memory_size                       416288.7735 

=== epoch 5/10 ===== round 37/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:40<00:00,  4.99it/s]
episodes                                   76
episode_length                     113.381579
returns                            -39.321317
return_std                          88.912989
average_reward                      -0.352887
round_time             0 days 00:06:41.175536
episodes_test                            28.0
episode_length_test                     346.0
returns_test                       270.683509
return_std_test                    306.006499
average_reward_test                  0.771984
round_time_test        0 days 00:00:10.742968
round_time_total       0 days 00:06:41.176705
loss_total                         882.204524
loss_critic                       1169.769043
loss_actor                        -268.053637
memory_size                        417868.319 

=== epoch 5/10 ===== round 38/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.00it/s]
episodes                                   79
episode_length                     108.696203
returns                            -35.892714
return_std                          85.375662
average_reward                      -0.347324
round_time             0 days 00:06:40.333885
episodes_test                            18.0
episode_length_test                502.333333
returns_test                       393.457581
return_std_test                    369.200534
average_reward_test                  0.796788
round_time_test        0 days 00:00:10.985137
round_time_total       0 days 00:06:40.335145
loss_total                         876.807647
loss_critic                       1163.118203
loss_actor                        -268.434656
memory_size                        419606.663 

=== epoch 5/10 ===== round 39/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.00it/s]
episodes                                   74
episode_length                     128.324324
returns                            -42.519724
return_std                          98.770576
average_reward                      -0.332826
round_time             0 days 00:06:40.287625
episodes_test                            16.0
episode_length_test                  575.0625
returns_test                       373.691237
return_std_test                    335.987102
average_reward_test                  0.669203
round_time_test        0 days 00:00:10.988516
round_time_total       0 days 00:06:40.288930
loss_total                         870.880832
loss_critic                       1155.917645
loss_actor                        -269.266502
memory_size                       421396.6595 

=== epoch 5/10 ===== round 40/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:40<00:00,  4.99it/s]
episodes                                   62
episode_length                     145.435484
returns                            -51.130871
return_std                          111.43691
average_reward                      -0.357783
round_time             0 days 00:06:41.087466
episodes_test                            18.0
episode_length_test                541.833333
returns_test                       430.423788
return_std_test                    394.892027
average_reward_test                  0.793446
round_time_test        0 days 00:00:10.883829
round_time_total       0 days 00:06:41.088560
loss_total                         858.291617
loss_critic                        1140.44286
loss_actor                         -270.31343
memory_size                        423331.707 

=== epoch 5/10 ===== round 41/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:41<00:00,  4.98it/s]
episodes                                   77
episode_length                     128.428571
returns                            -55.209787
return_std                         107.231008
average_reward                      -0.426258
round_time             0 days 00:06:41.978724
episodes_test                            23.0
episode_length_test                407.217391
returns_test                       306.543384
return_std_test                    352.828999
average_reward_test                  0.753525
round_time_test        0 days 00:00:10.931875
round_time_total       0 days 00:06:41.979828
loss_total                         866.046122
loss_critic                       1150.099776
loss_actor                        -270.168583
memory_size                        425135.629 

=== epoch 5/10 ===== round 42/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:42<00:00,  4.97it/s]
episodes                                   59
episode_length                      146.20339
returns                            -64.733501
return_std                         119.710914
average_reward                      -0.436814
round_time             0 days 00:06:42.950586
episodes_test                            22.0
episode_length_test                425.590909
returns_test                       302.388875
return_std_test                    334.916226
average_reward_test                  0.713421
round_time_test        0 days 00:00:10.702199
round_time_total       0 days 00:06:42.951720
loss_total                         890.289779
loss_critic                       1180.488113
loss_actor                        -270.503627
memory_size                       426730.1945 

=== epoch 5/10 ===== round 43/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.00it/s]
episodes                                   68
episode_length                     131.691176
returns                            -56.471007
return_std                         109.778311
average_reward                      -0.426762
round_time             0 days 00:06:40.344363
episodes_test                            23.0
episode_length_test                418.695652
returns_test                       297.575374
return_std_test                    342.611143
average_reward_test                  0.699357
round_time_test        0 days 00:00:10.689877
round_time_total       0 days 00:06:40.345471
loss_total                         874.969185
loss_critic                       1161.487939
loss_actor                        -271.105907
memory_size                        428471.391 

=== epoch 5/10 ===== round 44/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:44<00:00,  4.95it/s]
episodes                                   73
episode_length                     117.794521
returns                            -53.143861
return_std                         100.757132
average_reward                      -0.449948
round_time             0 days 00:06:45.032192
episodes_test                            27.0
episode_length_test                368.925926
returns_test                       257.389182
return_std_test                    315.629453
average_reward_test                  0.697656
round_time_test        0 days 00:00:10.974502
round_time_total       0 days 00:06:45.033320
loss_total                         871.712762
loss_critic                       1157.498911
loss_actor                        -271.431914
memory_size                       430189.4615 

=== epoch 5/10 ===== round 45/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:42<00:00,  4.97it/s]
episodes                                   75
episode_length                         115.44
returns                             -51.98449
return_std                          99.280081
average_reward                      -0.443864
round_time             0 days 00:06:43.112097
episodes_test                            15.0
episode_length_test                647.866667
returns_test                       466.027484
return_std_test                      376.7785
average_reward_test                  0.708466
round_time_test        0 days 00:00:10.891365
round_time_total       0 days 00:06:43.113354
loss_total                          885.98077
loss_critic                       1175.318813
loss_actor                        -271.371483
memory_size                        432066.398 

=== epoch 5/10 ===== round 46/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:45<00:00,  4.93it/s]
episodes                                   66
episode_length                     150.666667
returns                             -59.30519
return_std                         121.136517
average_reward                      -0.399639
round_time             0 days 00:06:46.190423
episodes_test                            24.0
episode_length_test                383.666667
returns_test                       322.730821
return_std_test                    386.963676
average_reward_test                  0.847047
round_time_test        0 days 00:00:10.784146
round_time_total       0 days 00:06:46.191519
loss_total                         880.166882
loss_critic                       1168.019237
loss_actor                        -271.242617
memory_size                       433897.3385 

=== epoch 5/10 ===== round 47/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:46<00:00,  4.92it/s]
episodes                                   66
episode_length                     139.030303
returns                             -55.96603
return_std                         116.352372
average_reward                      -0.401053
round_time             0 days 00:06:47.094966
episodes_test                            30.0
episode_length_test                318.666667
returns_test                        185.50299
return_std_test                    240.417112
average_reward_test                  0.591934
round_time_test        0 days 00:00:10.890893
round_time_total       0 days 00:06:47.096477
loss_total                         885.520049
loss_critic                        1174.73359
loss_actor                        -271.334195
memory_size                        435650.206 

=== epoch 5/10 ===== round 48/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:45<00:00,  4.93it/s]
episodes                                   60
episode_length                          164.3
returns                            -67.259166
return_std                         126.928259
average_reward                      -0.409472
round_time             0 days 00:06:46.249407
episodes_test                            16.0
episode_length_test                  617.6875
returns_test                       450.336107
return_std_test                    361.389424
average_reward_test                  0.727498
round_time_test        0 days 00:00:10.866687
round_time_total       0 days 00:06:46.250920
loss_total                         874.204413
loss_critic                       1160.644799
loss_actor                        -271.557216
memory_size                        437552.749 

=== epoch 5/10 ===== round 49/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:47<00:00,  4.91it/s]
episodes                                   58
episode_length                     169.241379
returns                            -67.113513
return_std                         125.264118
average_reward                      -0.395481
round_time             0 days 00:06:47.720288
episodes_test                            20.0
episode_length_test                     458.3
returns_test                       366.163953
return_std_test                    368.584391
average_reward_test                  0.802921
round_time_test        0 days 00:00:10.950077
round_time_total       0 days 00:06:47.721399
loss_total                          874.10105
loss_critic                       1160.354121
loss_actor                        -270.911316
memory_size                        439282.462 

=== epoch 5/10 ===== round 50/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:45<00:00,  4.93it/s]
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
episodes                                   62
episode_length                     144.435484
returns                            -57.016413
return_std                         115.671501
average_reward                      -0.392988
round_time             0 days 00:06:46.364988
episodes_test                            15.0
episode_length_test                     666.4
returns_test                       536.891324
return_std_test                    374.074208
average_reward_test                  0.805264
round_time_test        0 days 00:00:10.635742
round_time_total       0 days 00:06:46.366279
loss_total                         888.275465
loss_critic                       1178.162475
loss_actor                        -271.272657
memory_size                         441076.54 


<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
=== epoch 6/10 ===== round 1/50 ======================================
100%|██████████| 2000/2000 [06:03<00:00,  5.50it/s]
episodes                                   17
episode_length                     114.176471
returns                            -40.157967
return_std                         106.378548
average_reward                      -0.357106
round_time             0 days 00:06:03.617126
episodes_test                            17.0
episode_length_test                553.294118
returns_test                       422.508772
return_std_test                    346.511852
average_reward_test                  0.772079
round_time_test        0 days 00:00:11.052286
round_time_total       0 days 00:06:03.618317
loss_total                         870.247088
loss_critic                       1155.713255
loss_actor                        -271.617664
memory_size                        442857.711 

=== epoch 6/10 ===== round 2/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:04<00:00,  5.49it/s]
episodes                                   31
episode_length                     115.483871
returns                            -41.612297
return_std                         101.626115
average_reward                      -0.361247
round_time             0 days 00:06:05.081294
episodes_test                            18.0
episode_length_test                     555.0
returns_test                       420.098589
return_std_test                    378.328478
average_reward_test                  0.756416
round_time_test        0 days 00:00:10.954580
round_time_total       0 days 00:06:05.082618
loss_total                         882.306607
loss_critic                       1170.801517
loss_actor                        -271.673113
memory_size                       444630.0185 

=== epoch 6/10 ===== round 3/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:05<00:00,  5.47it/s]
episodes                                   44
episode_length                     130.272727
returns                            -47.990013
return_std                         108.305663
average_reward                      -0.371654
round_time             0 days 00:06:06.360316
episodes_test                            18.0
episode_length_test                549.388889
returns_test                        344.93065
return_std_test                    313.203745
average_reward_test                  0.628909
round_time_test        0 days 00:00:10.849368
round_time_total       0 days 00:06:06.361599
loss_total                         882.075684
loss_critic                       1170.679473
loss_actor                        -272.339556
memory_size                        446398.604 

=== epoch 6/10 ===== round 4/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:11<00:00,  5.38it/s]
episodes                                   50
episode_length                         145.38
returns                            -55.123258
return_std                         112.925695
average_reward                      -0.376309
round_time             0 days 00:06:12.125668
episodes_test                            34.0
episode_length_test                291.323529
returns_test                        193.70927
return_std_test                    240.308493
average_reward_test                  0.666322
round_time_test        0 days 00:00:10.930884
round_time_total       0 days 00:06:12.126786
loss_total                         886.252482
loss_critic                       1175.740375
loss_actor                        -271.699173
memory_size                       448212.5865 

=== epoch 6/10 ===== round 5/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:11<00:00,  5.38it/s]
episodes                                   67
episode_length                     148.134328
returns                            -56.115798
return_std                         116.137026
average_reward                      -0.383251
round_time             0 days 00:06:12.311139
episodes_test                            22.0
episode_length_test                432.090909
returns_test                       338.188366
return_std_test                    344.348642
average_reward_test                   0.78335
round_time_test        0 days 00:00:10.903050
round_time_total       0 days 00:06:12.312254
loss_total                          879.90668
loss_critic                        1168.03377
loss_actor                        -272.601761
memory_size                        450009.596 

=== epoch 6/10 ===== round 6/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:10<00:00,  5.39it/s]
episodes                                   59
episode_length                     142.508475
returns                            -55.491733
return_std                         113.848777
average_reward                      -0.389906
round_time             0 days 00:06:11.552609
episodes_test                            15.0
episode_length_test                609.733333
returns_test                       460.470595
return_std_test                    349.806996
average_reward_test                  0.755027
round_time_test        0 days 00:00:10.810688
round_time_total       0 days 00:06:11.554216
loss_total                         891.316245
loss_critic                       1182.305573
loss_actor                        -272.641148
memory_size                        451790.887 

=== epoch 6/10 ===== round 7/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:14<00:00,  5.35it/s]
episodes                                   55
episode_length                     157.090909
returns                            -58.473775
return_std                         115.623469
average_reward                      -0.382445
round_time             0 days 00:06:14.552517
episodes_test                            25.0
episode_length_test                    371.12
returns_test                       250.969525
return_std_test                    293.728314
average_reward_test                  0.691821
round_time_test        0 days 00:00:10.752122
round_time_total       0 days 00:06:14.553775
loss_total                         886.852665
loss_critic                       1176.474433
loss_actor                        -271.634489
memory_size                        453645.218 

=== epoch 6/10 ===== round 8/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:18<00:00,  5.28it/s]
episodes                                   56
episode_length                     177.964286
returns                            -66.479694
return_std                         131.786751
average_reward                      -0.375642
round_time             0 days 00:06:19.363616
episodes_test                            19.0
episode_length_test                523.842105
returns_test                       410.719458
return_std_test                    369.032348
average_reward_test                  0.782425
round_time_test        0 days 00:00:11.083488
round_time_total       0 days 00:06:19.364724
loss_total                         906.307091
loss_critic                       1200.796465
loss_actor                        -271.650485
memory_size                       455487.1865 

=== epoch 6/10 ===== round 9/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:16<00:00,  5.31it/s]
episodes                                   89
episode_length                     109.044944
returns                            -40.189878
return_std                          96.333531
average_reward                      -0.370392
round_time             0 days 00:06:17.008627
episodes_test                            18.0
episode_length_test                524.555556
returns_test                       370.929909
return_std_test                    296.700159
average_reward_test                  0.695258
round_time_test        0 days 00:00:10.843737
round_time_total       0 days 00:06:17.009738
loss_total                         908.551588
loss_critic                       1203.350038
loss_actor                         -270.64229
memory_size                        457044.111 

=== epoch 6/10 ===== round 10/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:18<00:00,  5.28it/s]
episodes                                   93
episode_length                      106.16129
returns                            -36.003142
return_std                          90.771682
average_reward                      -0.337467
round_time             0 days 00:06:19.234144
episodes_test                            14.0
episode_length_test                663.071429
returns_test                       543.101683
return_std_test                    349.732268
average_reward_test                   0.81381
round_time_test        0 days 00:00:10.638702
round_time_total       0 days 00:06:19.235426
loss_total                           915.2343
loss_critic                       1211.427641
loss_actor                        -269.539145
memory_size                       458486.3825 

=== epoch 6/10 ===== round 11/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:18<00:00,  5.28it/s]
episodes                                   98
episode_length                      96.428571
returns                            -31.531648
return_std                          74.695465
average_reward                      -0.335908
round_time             0 days 00:06:19.482578
episodes_test                            16.0
episode_length_test                  574.9375
returns_test                       440.076986
return_std_test                    329.843841
average_reward_test                  0.757142
round_time_test        0 days 00:00:10.954958
round_time_total       0 days 00:06:19.483726
loss_total                         917.981828
loss_critic                       1214.920329
loss_actor                        -269.772263
memory_size                        460271.768 

=== epoch 6/10 ===== round 12/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:18<00:00,  5.29it/s]
episodes                                   98
episode_length                      91.387755
returns                            -32.306222
return_std                          76.824493
average_reward                      -0.366165
round_time             0 days 00:06:18.553884
episodes_test                            14.0
episode_length_test                664.571429
returns_test                       513.830728
return_std_test                    341.669132
average_reward_test                  0.777856
round_time_test        0 days 00:00:11.042077
round_time_total       0 days 00:06:18.555266
loss_total                         906.643591
loss_critic                       1200.736016
loss_actor                        -269.726187
memory_size                       462024.3695 

=== epoch 6/10 ===== round 13/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:22<00:00,  5.23it/s]
episodes                                   94
episode_length                     101.393617
returns                            -37.507137
return_std                          87.363908
average_reward                        -0.3692
round_time             0 days 00:06:22.711928
episodes_test                            18.0
episode_length_test                552.722222
returns_test                       398.241856
return_std_test                    336.938962
average_reward_test                  0.722091
round_time_test        0 days 00:00:11.094881
round_time_total       0 days 00:06:22.713163
loss_total                         885.404507
loss_critic                       1174.299509
loss_actor                        -270.175583
memory_size                        463848.605 

=== epoch 6/10 ===== round 14/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:22<00:00,  5.23it/s]
episodes                                   66
episode_length                     139.984848
returns                             -50.98655
return_std                         108.726998
average_reward                      -0.367394
round_time             0 days 00:06:22.902714
episodes_test                            23.0
episode_length_test                430.695652
returns_test                       329.826387
return_std_test                    333.265242
average_reward_test                  0.762796
round_time_test        0 days 00:00:10.905618
round_time_total       0 days 00:06:22.903825
loss_total                          915.88786
loss_critic                       1212.346139
loss_actor                        -269.945339
memory_size                       465666.7535 

=== epoch 6/10 ===== round 15/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:24,  5.17it/s]
100%|██████████| 2000/2000 [06:22<00:00,  5.23it/s]
episodes                                   62
episode_length                      160.33871
returns                            -62.417348
return_std                         122.464238
average_reward                      -0.389262
round_time             0 days 00:06:22.646254
episodes_test                            19.0
episode_length_test                480.947368
returns_test                       328.633871
return_std_test                     326.18373
average_reward_test                  0.676055
round_time_test        0 days 00:00:10.752525
round_time_total       0 days 00:06:22.647540
loss_total                         921.405471
loss_critic                         1219.1179
loss_actor                        -269.444325
memory_size                        467501.307 

=== epoch 6/10 ===== round 16/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:54,  4.81it/s]
100%|██████████| 2000/2000 [06:22<00:00,  5.23it/s]
episodes                                   67
episode_length                     134.208955
returns                             -50.28908
return_std                         108.738924
average_reward                      -0.378165
round_time             0 days 00:06:23.294413
episodes_test                            22.0
episode_length_test                452.590909
returns_test                       336.097689
return_std_test                     332.52641
average_reward_test                  0.742488
round_time_test        0 days 00:00:10.684207
round_time_total       0 days 00:06:23.295500
loss_total                         909.622206
loss_critic                       1204.533674
loss_actor                         -270.02375
memory_size                        469036.474 

=== epoch 6/10 ===== round 17/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:09,  4.64it/s]
100%|██████████| 2000/2000 [06:25<00:00,  5.19it/s]
episodes                                   78
episode_length                     125.307692
returns                            -44.171449
return_std                          98.928256
average_reward                      -0.354098
round_time             0 days 00:06:26.074124
episodes_test                            14.0
episode_length_test                689.571429
returns_test                         436.7818
return_std_test                    328.144535
average_reward_test                  0.642127
round_time_test        0 days 00:00:11.003852
round_time_total       0 days 00:06:26.075221
loss_total                         903.095006
loss_critic                       1196.527017
loss_actor                        -270.633118
memory_size                        470863.396 

=== epoch 6/10 ===== round 18/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:01,  4.72it/s]
100%|██████████| 2000/2000 [06:22<00:00,  5.22it/s]
episodes                                   83
episode_length                     108.180723
returns                            -37.564309
return_std                          91.284229
average_reward                      -0.351528
round_time             0 days 00:06:23.507527
episodes_test                            12.0
episode_length_test                    780.75
returns_test                       467.545883
return_std_test                    342.182709
average_reward_test                  0.591896
round_time_test        0 days 00:00:11.020351
round_time_total       0 days 00:06:23.508622
loss_total                         919.739974
loss_critic                       1217.245455
loss_actor                        -270.282033
memory_size                        472411.541 

=== epoch 6/10 ===== round 19/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:34,  5.06it/s]
100%|██████████| 2000/2000 [06:28<00:00,  5.15it/s]
episodes                                   84
episode_length                      115.52381
returns                            -40.725419
return_std                          98.540393
average_reward                      -0.353705
round_time             0 days 00:06:28.695848
episodes_test                            18.0
episode_length_test                552.666667
returns_test                       419.969359
return_std_test                    367.049829
average_reward_test                  0.758127
round_time_test        0 days 00:00:10.860638
round_time_total       0 days 00:06:28.697075
loss_total                         897.033522
loss_critic                       1189.013179
loss_actor                        -270.885187
memory_size                        474291.746 

=== epoch 6/10 ===== round 20/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:07,  4.66it/s]
100%|██████████| 2000/2000 [06:27<00:00,  5.16it/s]
episodes                                   83
episode_length                     117.313253
returns                             -42.07306
return_std                          98.992069
average_reward                       -0.35702
round_time             0 days 00:06:28.165172
episodes_test                            18.0
episode_length_test                530.277778
returns_test                       386.821048
return_std_test                    349.834102
average_reward_test                  0.721964
round_time_test        0 days 00:00:11.008530
round_time_total       0 days 00:06:28.166294
loss_total                         909.669947
loss_critic                       1204.834904
loss_actor                        -270.989967
memory_size                       476035.0305 

=== epoch 6/10 ===== round 21/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:52,  4.82it/s]
100%|██████████| 2000/2000 [06:27<00:00,  5.16it/s]
episodes                                   70
episode_length                          117.9
returns                            -41.914247
return_std                         101.924512
average_reward                      -0.364121
round_time             0 days 00:06:28.070009
episodes_test                            14.0
episode_length_test                649.928571
returns_test                       382.456678
return_std_test                    343.940923
average_reward_test                  0.614516
round_time_test        0 days 00:00:10.922821
round_time_total       0 days 00:06:28.071101
loss_total                         907.783757
loss_critic                       1202.411421
loss_actor                        -270.726985
memory_size                        477841.583 

=== epoch 6/10 ===== round 22/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:47,  4.88it/s]
100%|██████████| 2000/2000 [06:28<00:00,  5.15it/s]
episodes                                   68
episode_length                          144.0
returns                            -50.127435
return_std                         117.415824
average_reward                      -0.348541
round_time             0 days 00:06:29.181601
episodes_test                            18.0
episode_length_test                535.555556
returns_test                        401.51064
return_std_test                    351.946695
average_reward_test                  0.755618
round_time_test        0 days 00:00:10.725528
round_time_total       0 days 00:06:29.182717
loss_total                         906.933747
loss_critic                       1201.340747
loss_actor                        -270.694338
memory_size                        479585.202 

=== epoch 6/10 ===== round 23/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:02,  4.72it/s]
100%|██████████| 2000/2000 [06:28<00:00,  5.14it/s]
episodes                                   60
episode_length                     142.133333
returns                            -48.881022
return_std                         114.140359
average_reward                      -0.352319
round_time             0 days 00:06:29.443302
episodes_test                            19.0
episode_length_test                519.052632
returns_test                       394.248244
return_std_test                    339.260947
average_reward_test                  0.759367
round_time_test        0 days 00:00:11.037608
round_time_total       0 days 00:06:29.444508
loss_total                         905.999394
loss_critic                       1200.410661
loss_actor                         -271.64576
memory_size                       481418.1325 

=== epoch 6/10 ===== round 24/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<06:54,  4.81it/s]
100%|██████████| 2000/2000 [06:32<00:00,  5.09it/s]
episodes                                   64
episode_length                     155.140625
returns                            -54.053828
return_std                         116.676727
average_reward                      -0.346543
round_time             0 days 00:06:33.231734
episodes_test                            18.0
episode_length_test                515.055556
returns_test                       411.562079
return_std_test                    356.751997
average_reward_test                  0.783946
round_time_test        0 days 00:00:10.812030
round_time_total       0 days 00:06:33.232873
loss_total                         906.146827
loss_critic                       1200.659226
loss_actor                        -271.902857
memory_size                        483213.568 

=== epoch 6/10 ===== round 25/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:10,  4.63it/s]
100%|██████████| 2000/2000 [06:30<00:00,  5.12it/s]
episodes                                   65
episode_length                     138.923077
returns                            -47.048973
return_std                         107.892184
average_reward                      -0.344663
round_time             0 days 00:06:31.266212
episodes_test                            19.0
episode_length_test                522.736842
returns_test                       401.434468
return_std_test                    373.843408
average_reward_test                  0.767633
round_time_test        0 days 00:00:10.985673
round_time_total       0 days 00:06:31.267350
loss_total                         896.949338
loss_critic                        1189.18406
loss_actor                        -271.989631
memory_size                        484914.427 

=== epoch 6/10 ===== round 26/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 9/2000 [00:01<06:55,  4.80it/s]
100%|██████████| 2000/2000 [06:31<00:00,  5.11it/s]
episodes                                   87
episode_length                     114.413793
returns                            -40.412735
return_std                          92.940695
average_reward                      -0.353186
round_time             0 days 00:06:32.017191
episodes_test                            20.0
episode_length_test                     483.8
returns_test                        375.63236
return_std_test                    308.383776
average_reward_test                  0.768855
round_time_test        0 days 00:00:10.727042
round_time_total       0 days 00:06:32.018711
loss_total                         931.173948
loss_critic                       1232.151543
loss_actor                        -272.736515
memory_size                        486677.524 

=== epoch 6/10 ===== round 27/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<07:09,  4.64it/s]
100%|██████████| 2000/2000 [06:35<00:00,  5.06it/s]
episodes                                   85
episode_length                     105.352941
returns                            -38.794887
return_std                          86.468027
average_reward                      -0.373164
round_time             0 days 00:06:35.727797
episodes_test                            21.0
episode_length_test                 458.47619
returns_test                        317.52697
return_std_test                    282.348699
average_reward_test                  0.694984
round_time_test        0 days 00:00:10.798337
round_time_total       0 days 00:06:35.728888
loss_total                         904.002517
loss_critic                       1198.009658
loss_actor                        -272.026132
memory_size                        488215.955 

=== epoch 6/10 ===== round 28/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:01,  4.72it/s]
100%|██████████| 2000/2000 [06:33<00:00,  5.08it/s]
episodes                                   81
episode_length                     111.469136
returns                            -41.644674
return_std                          90.151949
average_reward                      -0.377953
round_time             0 days 00:06:34.301613
episodes_test                            29.0
episode_length_test                337.517241
returns_test                       247.285656
return_std_test                    304.720069
average_reward_test                  0.734835
round_time_test        0 days 00:00:10.809792
round_time_total       0 days 00:06:34.302705
loss_total                          915.68973
loss_critic                       1212.617384
loss_actor                        -272.020965
memory_size                        490014.609 

=== epoch 6/10 ===== round 29/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:22,  5.20it/s]
100%|██████████| 2000/2000 [06:36<00:00,  5.04it/s]
episodes                                   85
episode_length                     108.788235
returns                            -41.025229
return_std                          91.611585
average_reward                      -0.375414
round_time             0 days 00:06:37.092537
episodes_test                            14.0
episode_length_test                649.142857
returns_test                       515.988247
return_std_test                    352.809638
average_reward_test                  0.805767
round_time_test        0 days 00:00:10.905418
round_time_total       0 days 00:06:37.093639
loss_total                         893.593721
loss_critic                       1185.152941
loss_actor                        -272.643235
memory_size                       491753.5355 

=== epoch 6/10 ===== round 30/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:51,  4.85it/s]
100%|██████████| 2000/2000 [06:34<00:00,  5.07it/s]
episodes                                   80
episode_length                       114.8875
returns                             -43.61597
return_std                          93.592403
average_reward                      -0.380419
round_time             0 days 00:06:34.698337
episodes_test                            16.0
episode_length_test                  566.1875
returns_test                       445.557642
return_std_test                    369.964632
average_reward_test                  0.796998
round_time_test        0 days 00:00:11.119650
round_time_total       0 days 00:06:34.699435
loss_total                         901.697197
loss_critic                       1195.343387
loss_actor                        -272.887646
memory_size                        493554.671 

=== epoch 6/10 ===== round 31/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:58,  4.76it/s]
100%|██████████| 2000/2000 [06:35<00:00,  5.05it/s]
episodes                                   68
episode_length                     143.132353
returns                            -51.429591
return_std                         110.480621
average_reward                      -0.362156
round_time             0 days 00:06:36.523931
episodes_test                            18.0
episode_length_test                540.611111
returns_test                       438.154659
return_std_test                    343.033778
average_reward_test                  0.813228
round_time_test        0 days 00:00:10.832972
round_time_total       0 days 00:06:36.525330
loss_total                         908.957329
loss_critic                       1204.487497
loss_actor                         -273.16343
memory_size                       495259.1875 

=== epoch 6/10 ===== round 32/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:17,  4.55it/s]
100%|██████████| 2000/2000 [06:36<00:00,  5.05it/s]
episodes                                   64
episode_length                      138.15625
returns                            -49.788473
return_std                         106.973334
average_reward                      -0.365522
round_time             0 days 00:06:36.865930
episodes_test                            14.0
episode_length_test                678.071429
returns_test                       500.512465
return_std_test                    315.386842
average_reward_test                  0.749112
round_time_test        0 days 00:00:10.885957
round_time_total       0 days 00:06:36.867035
loss_total                          913.86883
loss_critic                       1210.855944
loss_actor                        -274.079707
memory_size                        497096.329 

=== epoch 6/10 ===== round 33/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:35,  5.04it/s]
100%|██████████| 2000/2000 [06:39<00:00,  5.01it/s]
episodes                                   76
episode_length                     124.618421
returns                            -43.809587
return_std                         100.652166
average_reward                      -0.353252
round_time             0 days 00:06:39.979484
episodes_test                            16.0
episode_length_test                  610.6875
returns_test                       481.693439
return_std_test                    382.669614
average_reward_test                  0.790294
round_time_test        0 days 00:00:10.926761
round_time_total       0 days 00:06:39.980614
loss_total                         927.846639
loss_critic                       1228.250564
loss_actor                        -273.769146
memory_size                       498857.3265 

=== epoch 6/10 ===== round 34/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:49,  4.86it/s]
100%|██████████| 2000/2000 [06:38<00:00,  5.02it/s]
episodes                                   81
episode_length                     116.666667
returns                            -41.598813
return_std                          99.529008
average_reward                      -0.359684
round_time             0 days 00:06:38.969199
episodes_test                            26.0
episode_length_test                367.346154
returns_test                         299.4029
return_std_test                    301.901331
average_reward_test                  0.819381
round_time_test        0 days 00:00:10.786064
round_time_total       0 days 00:06:38.970332
loss_total                         908.068807
loss_critic                       1203.678404
loss_actor                        -274.369667
memory_size                       500551.2385 

=== epoch 6/10 ===== round 35/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:35,  4.37it/s]
100%|██████████| 2000/2000 [06:40<00:00,  4.99it/s]
episodes                                   91
episode_length                     107.802198
returns                             -36.35117
return_std                          94.360338
average_reward                      -0.340128
round_time             0 days 00:06:41.111517
episodes_test                            13.0
episode_length_test                730.615385
returns_test                        611.41677
return_std_test                    354.234092
average_reward_test                  0.832554
round_time_test        0 days 00:00:10.975017
round_time_total       0 days 00:06:41.112754
loss_total                         911.605191
loss_critic                       1207.881279
loss_actor                        -273.499243
memory_size                       502191.2315 

=== epoch 6/10 ===== round 36/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<07:23,  4.49it/s]
100%|██████████| 2000/2000 [06:39<00:00,  5.01it/s]
episodes                                   80
episode_length                       121.6875
returns                            -43.765891
return_std                         105.127692
average_reward                      -0.360027
round_time             0 days 00:06:40.129080
episodes_test                            13.0
episode_length_test                717.230769
returns_test                       540.384747
return_std_test                      284.6146
average_reward_test                  0.757211
round_time_test        0 days 00:00:10.699289
round_time_total       0 days 00:06:40.130291
loss_total                         921.554791
loss_critic                       1220.408047
loss_actor                        -273.858314
memory_size                        503928.944 

=== epoch 6/10 ===== round 37/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<07:12,  4.61it/s]
100%|██████████| 2000/2000 [06:41<00:00,  4.98it/s]
episodes                                   69
episode_length                     135.985507
returns                            -48.598515
return_std                         111.664674
average_reward                      -0.362697
round_time             0 days 00:06:41.962247
episodes_test                            19.0
episode_length_test                511.368421
returns_test                       363.618914
return_std_test                    312.172864
average_reward_test                  0.717002
round_time_test        0 days 00:00:10.562856
round_time_total       0 days 00:06:41.963377
loss_total                         906.398294
loss_critic                       1201.432021
loss_actor                        -273.736698
memory_size                        505840.473 

=== epoch 6/10 ===== round 38/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 9/2000 [00:01<07:44,  4.29it/s]
100%|██████████| 2000/2000 [06:41<00:00,  4.98it/s]
episodes                                   68
episode_length                     136.132353
returns                            -49.132898
return_std                         112.000171
average_reward                      -0.366581
round_time             0 days 00:06:42.317436
episodes_test                            20.0
episode_length_test                    488.65
returns_test                       328.864112
return_std_test                    287.279734
average_reward_test                  0.673962
round_time_test        0 days 00:00:10.731673
round_time_total       0 days 00:06:42.318588
loss_total                         896.332981
loss_critic                       1188.777956
loss_actor                        -273.447002
memory_size                        507656.311 

=== epoch 6/10 ===== round 39/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:43<00:00,  4.96it/s]
episodes                                   59
episode_length                     150.915254
returns                            -56.257337
return_std                         120.011525
average_reward                      -0.370745
round_time             0 days 00:06:43.935792
episodes_test                            12.0
episode_length_test                825.916667
returns_test                       593.533745
return_std_test                    271.059621
average_reward_test                  0.723329
round_time_test        0 days 00:00:10.970590
round_time_total       0 days 00:06:43.936877
loss_total                         905.524348
loss_critic                        1200.54043
loss_actor                        -274.540064
memory_size                        509469.454 

=== epoch 6/10 ===== round 40/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:41<00:00,  4.98it/s]
episodes                                   50
episode_length                         172.78
returns                            -70.486315
return_std                         131.786052
average_reward                      -0.405515
round_time             0 days 00:06:42.411340
episodes_test                            15.0
episode_length_test                645.933333
returns_test                       453.840399
return_std_test                    322.656409
average_reward_test                  0.700517
round_time_test        0 days 00:00:10.850704
round_time_total       0 days 00:06:42.412442
loss_total                         915.032181
loss_critic                       1212.685974
loss_actor                        -275.583067
memory_size                        511190.197 

=== epoch 6/10 ===== round 41/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:42<00:00,  4.97it/s]
episodes                                   47
episode_length                     181.978723
returns                            -71.912925
return_std                         132.796728
average_reward                      -0.393626
round_time             0 days 00:06:43.043449
episodes_test                            15.0
episode_length_test                     639.4
returns_test                       441.525439
return_std_test                     304.07539
average_reward_test                  0.688348
round_time_test        0 days 00:00:10.827638
round_time_total       0 days 00:06:43.044969
loss_total                         908.957673
loss_critic                       1205.142295
loss_actor                          -275.7809
memory_size                        513077.719 

=== epoch 6/10 ===== round 42/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:41<00:00,  4.98it/s]
episodes                                   54
episode_length                     184.240741
returns                            -71.337784
return_std                         135.035897
average_reward                      -0.388837
round_time             0 days 00:06:42.384850
episodes_test                            16.0
episode_length_test                  568.3125
returns_test                       460.657017
return_std_test                    341.641895
average_reward_test                  0.820751
round_time_test        0 days 00:00:10.895311
round_time_total       0 days 00:06:42.386159
loss_total                          898.03429
loss_critic                       1191.357886
loss_actor                        -275.260184
memory_size                        514972.362 

=== epoch 6/10 ===== round 43/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:43<00:00,  4.95it/s]
episodes                                   52
episode_length                     168.153846
returns                             -65.23174
return_std                         131.538988
average_reward                      -0.388092
round_time             0 days 00:06:44.503533
episodes_test                            18.0
episode_length_test                539.833333
returns_test                       363.773465
return_std_test                    312.370303
average_reward_test                  0.675687
round_time_test        0 days 00:00:10.709874
round_time_total       0 days 00:06:44.504630
loss_total                         902.218567
loss_critic                       1197.015306
loss_actor                        -276.968471
memory_size                       516821.5385 

=== epoch 6/10 ===== round 44/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:43<00:00,  4.96it/s]
episodes                                   46
episode_length                     201.847826
returns                              -79.6223
return_std                         145.231001
average_reward                      -0.396603
round_time             0 days 00:06:43.947076
episodes_test                            18.0
episode_length_test                534.222222
returns_test                       427.751076
return_std_test                    367.560262
average_reward_test                  0.802198
round_time_test        0 days 00:00:10.940283
round_time_total       0 days 00:06:43.948169
loss_total                          912.89415
loss_critic                       1210.406866
loss_actor                        -277.156787
memory_size                        518662.631 

=== epoch 6/10 ===== round 45/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:43<00:00,  4.96it/s]
episodes                                   39
episode_length                     222.230769
returns                            -82.325824
return_std                         148.895177
average_reward                      -0.377236
round_time             0 days 00:06:43.974328
episodes_test                            21.0
episode_length_test                436.142857
returns_test                       355.351428
return_std_test                     342.16701
average_reward_test                  0.809208
round_time_test        0 days 00:00:10.970231
round_time_total       0 days 00:06:43.975449
loss_total                         901.245113
loss_critic                       1196.148941
loss_actor                        -278.370278
memory_size                       520473.4965 

=== epoch 6/10 ===== round 46/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:44<00:00,  4.94it/s]
episodes                                   44
episode_length                     200.863636
returns                            -77.837181
return_std                         146.471412
average_reward                      -0.393242
round_time             0 days 00:06:45.370286
episodes_test                            14.0
episode_length_test                712.928571
returns_test                       521.415469
return_std_test                    327.483394
average_reward_test                  0.732397
round_time_test        0 days 00:00:10.824370
round_time_total       0 days 00:06:45.371386
loss_total                         890.074649
loss_critic                       1182.335257
loss_actor                        -278.967872
memory_size                        522323.865 

=== epoch 6/10 ===== round 47/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:45<00:00,  4.93it/s]
episodes                                   48
episode_length                        179.875
returns                            -69.265284
return_std                         131.961309
average_reward                      -0.393601
round_time             0 days 00:06:45.974409
episodes_test                            16.0
episode_length_test                  604.8125
returns_test                       452.795702
return_std_test                    353.428314
average_reward_test                  0.751877
round_time_test        0 days 00:00:11.055122
round_time_total       0 days 00:06:45.975533
loss_total                         891.378549
loss_critic                       1183.924742
loss_actor                        -278.806307
memory_size                       524167.3435 

=== epoch 6/10 ===== round 48/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:47<00:00,  4.91it/s]
episodes                                   53
episode_length                     166.471698
returns                            -63.704019
return_std                         127.062866
average_reward                      -0.387759
round_time             0 days 00:06:47.903042
episodes_test                            10.0
episode_length_test                     921.5
returns_test                       619.217597
return_std_test                    244.757817
average_reward_test                  0.662056
round_time_test        0 days 00:00:11.002948
round_time_total       0 days 00:06:47.904177
loss_total                         879.355081
loss_critic                       1169.052516
loss_actor                        -279.434746
memory_size                       525925.1355 

=== epoch 6/10 ===== round 49/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:45<00:00,  4.93it/s]
episodes                                   55
episode_length                     163.363636
returns                            -60.721526
return_std                         128.474478
average_reward                      -0.366805
round_time             0 days 00:06:46.190673
episodes_test                            15.0
episode_length_test                604.866667
returns_test                       448.212935
return_std_test                    334.040291
average_reward_test                  0.756433
round_time_test        0 days 00:00:10.949455
round_time_total       0 days 00:06:46.191784
loss_total                         903.590412
loss_critic                       1199.343473
loss_actor                        -279.421924
memory_size                        527658.911 

=== epoch 6/10 ===== round 50/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:45<00:00,  4.93it/s]
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
episodes                                   63
episode_length                     150.174603
returns                            -54.779423
return_std                         119.151892
average_reward                      -0.368356
round_time             0 days 00:06:46.409942
episodes_test                            20.0
episode_length_test                     477.4
returns_test                       323.743615
return_std_test                     275.56118
average_reward_test                  0.674095
round_time_test        0 days 00:00:10.870130
round_time_total       0 days 00:06:46.411042
loss_total                         903.272652
loss_critic                       1199.034762
loss_actor                        -279.775871
memory_size                       529434.4325 


<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
=== epoch 7/10 ===== round 1/50 ======================================
100%|██████████| 2000/2000 [06:04<00:00,  5.49it/s]
episodes                                   14
episode_length                     124.428571
returns                            -58.452841
return_std                          94.437538
average_reward                      -0.460102
round_time             0 days 00:06:04.712258
episodes_test                            18.0
episode_length_test                537.388889
returns_test                       365.437688
return_std_test                      265.2186
average_reward_test                  0.673701
round_time_test        0 days 00:00:10.675447
round_time_total       0 days 00:06:04.713379
loss_total                         900.728126
loss_critic                        1195.74447
loss_actor                        -279.337339
memory_size                       531195.7115 

=== epoch 7/10 ===== round 2/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:04<00:00,  5.48it/s]
episodes                                   20
episode_length                          199.8
returns                            -86.406887
return_std                         134.001035
average_reward                      -0.432361
round_time             0 days 00:06:05.301610
episodes_test                            16.0
episode_length_test                   618.625
returns_test                       493.630623
return_std_test                    347.240623
average_reward_test                  0.800369
round_time_test        0 days 00:00:11.039639
round_time_total       0 days 00:06:05.302706
loss_total                         901.205246
loss_critic                       1196.264985
loss_actor                        -279.033793
memory_size                        533050.164 

=== epoch 7/10 ===== round 3/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:07<00:00,  5.44it/s]
episodes                                   27
episode_length                     199.444444
returns                            -84.260126
return_std                         129.297644
average_reward                      -0.423917
round_time             0 days 00:06:07.971724
episodes_test                            17.0
episode_length_test                581.941176
returns_test                       350.583129
return_std_test                    234.524743
average_reward_test                  0.604942
round_time_test        0 days 00:00:10.652406
round_time_total       0 days 00:06:07.972829
loss_total                         892.087532
loss_critic                       1184.875838
loss_actor                         -279.06577
memory_size                       534936.7145 

=== epoch 7/10 ===== round 4/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:09<00:00,  5.41it/s]
episodes                                   36
episode_length                     218.583333
returns                            -92.854057
return_std                         139.741919
average_reward                      -0.416471
round_time             0 days 00:06:10.514673
episodes_test                            10.0
episode_length_test                     901.2
returns_test                       690.235194
return_std_test                    252.448562
average_reward_test                  0.762997
round_time_test        0 days 00:00:10.827404
round_time_total       0 days 00:06:10.515913
loss_total                         901.132669
loss_critic                       1196.394936
loss_actor                        -279.916481
memory_size                         536835.39 

=== epoch 7/10 ===== round 5/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:11<00:00,  5.38it/s]
episodes                                   40
episode_length                        228.825
returns                            -89.373746
return_std                         137.863665
average_reward                      -0.390411
round_time             0 days 00:06:12.495952
episodes_test                            16.0
episode_length_test                   592.875
returns_test                       364.837131
return_std_test                    304.631892
average_reward_test                  0.611795
round_time_test        0 days 00:00:10.803709
round_time_total       0 days 00:06:12.497051
loss_total                         897.885148
loss_critic                       1192.374174
loss_actor                         -280.07103
memory_size                       538684.4755 

=== epoch 7/10 ===== round 6/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:14<00:00,  5.34it/s]
episodes                                   31
episode_length                          276.0
returns                           -100.218604
return_std                         149.208011
average_reward                      -0.376563
round_time             0 days 00:06:14.867320
episodes_test                            15.0
episode_length_test                     655.4
returns_test                       446.086191
return_std_test                    300.204834
average_reward_test                  0.686511
round_time_test        0 days 00:00:10.901798
round_time_total       0 days 00:06:14.868410
loss_total                         889.568771
loss_critic                       1181.922374
loss_actor                        -279.845721
memory_size                        540576.693 

=== epoch 7/10 ===== round 7/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:13<00:00,  5.35it/s]
episodes                                   43
episode_length                     211.790698
returns                            -79.450392
return_std                         136.227857
average_reward                      -0.375515
round_time             0 days 00:06:14.168822
episodes_test                            12.0
episode_length_test                768.333333
returns_test                       530.096876
return_std_test                    264.475135
average_reward_test                  0.704161
round_time_test        0 days 00:00:10.789437
round_time_total       0 days 00:06:14.169912
loss_total                         893.437773
loss_critic                       1186.804132
loss_actor                        -280.027748
memory_size                       542348.0325 

=== epoch 7/10 ===== round 8/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:12<00:00,  5.37it/s]
episodes                                   61
episode_length                     151.786885
returns                            -51.900782
return_std                         110.953661
average_reward                      -0.349708
round_time             0 days 00:06:13.241065
episodes_test                            14.0
episode_length_test                696.571429
returns_test                       446.689005
return_std_test                    258.431478
average_reward_test                  0.638438
round_time_test        0 days 00:00:10.842140
round_time_total       0 days 00:06:13.242231
loss_total                         898.771896
loss_critic                       1193.642728
loss_actor                        -280.711514
memory_size                       543966.9505 

=== epoch 7/10 ===== round 9/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:16<00:00,  5.31it/s]
episodes                                   60
episode_length                          151.6
returns                             -53.00621
return_std                         121.757782
average_reward                      -0.342783
round_time             0 days 00:06:16.916915
episodes_test                            10.0
episode_length_test                     908.3
returns_test                       677.182121
return_std_test                    236.808775
average_reward_test                  0.733174
round_time_test        0 days 00:00:11.026635
round_time_total       0 days 00:06:16.918010
loss_total                         892.644647
loss_critic                       1186.097649
loss_actor                        -281.167449
memory_size                       545758.9815 

=== epoch 7/10 ===== round 10/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:18<00:00,  5.28it/s]
episodes                                   58
episode_length                     169.172414
returns                            -62.628581
return_std                         132.188478
average_reward                      -0.369813
round_time             0 days 00:06:19.162012
episodes_test                            12.0
episode_length_test                     754.0
returns_test                       577.681423
return_std_test                    308.383503
average_reward_test                  0.749153
round_time_test        0 days 00:00:10.959605
round_time_total       0 days 00:06:19.163241
loss_total                         904.373714
loss_critic                       1200.865276
loss_actor                        -281.592617
memory_size                        547675.023 

=== epoch 7/10 ===== round 11/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:18<00:00,  5.28it/s]
episodes                                   68
episode_length                     140.985294
returns                            -49.891502
return_std                         113.290906
average_reward                      -0.359945
round_time             0 days 00:06:19.434512
episodes_test                            12.0
episode_length_test                779.916667
returns_test                       606.274478
return_std_test                    271.709126
average_reward_test                  0.769561
round_time_test        0 days 00:00:10.682317
round_time_total       0 days 00:06:19.435683
loss_total                         892.743057
loss_critic                       1186.173079
loss_actor                         -280.97712
memory_size                       549453.1465 

=== epoch 7/10 ===== round 12/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:22<00:00,  5.23it/s]
episodes                                   54
episode_length                     168.203704
returns                             -60.57079
return_std                          125.70955
average_reward                      -0.362504
round_time             0 days 00:06:22.939770
episodes_test                            12.0
episode_length_test                755.333333
returns_test                       541.992403
return_std_test                     342.81701
average_reward_test                   0.70189
round_time_test        0 days 00:00:10.865478
round_time_total       0 days 00:06:22.940870
loss_total                         903.182935
loss_critic                       1199.212557
loss_actor                        -280.935635
memory_size                       551330.1695 

=== epoch 7/10 ===== round 13/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:03,  4.70it/s]
100%|██████████| 2000/2000 [06:21<00:00,  5.24it/s]
episodes                                   54
episode_length                      165.37037
returns                            -61.231864
return_std                         125.043496
average_reward                      -0.370469
round_time             0 days 00:06:22.227355
episodes_test                            16.0
episode_length_test                  615.3125
returns_test                       453.483374
return_std_test                    338.814055
average_reward_test                  0.739914
round_time_test        0 days 00:00:11.014221
round_time_total       0 days 00:06:22.228460
loss_total                         910.575397
loss_critic                       1208.386388
loss_actor                        -280.668647
memory_size                        553082.726 

=== epoch 7/10 ===== round 14/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 9/2000 [00:01<06:52,  4.82it/s]
100%|██████████| 2000/2000 [06:22<00:00,  5.23it/s]
episodes                                   60
episode_length                         139.95
returns                            -50.348551
return_std                         101.542761
average_reward                      -0.366408
round_time             0 days 00:06:22.801232
episodes_test                            17.0
episode_length_test                576.588235
returns_test                       375.794189
return_std_test                    294.810083
average_reward_test                  0.653884
round_time_test        0 days 00:00:10.997333
round_time_total       0 days 00:06:22.802363
loss_total                         904.721694
loss_critic                         1201.0214
loss_actor                        -280.477212
memory_size                        554783.866 

=== epoch 7/10 ===== round 15/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<07:33,  4.40it/s]
100%|██████████| 2000/2000 [06:23<00:00,  5.22it/s]
episodes                                   70
episode_length                     139.985714
returns                            -48.237029
return_std                          99.187905
average_reward                      -0.339099
round_time             0 days 00:06:23.836942
episodes_test                            18.0
episode_length_test                545.888889
returns_test                       383.159993
return_std_test                    313.253384
average_reward_test                  0.697113
round_time_test        0 days 00:00:10.964888
round_time_total       0 days 00:06:23.838052
loss_total                         900.028135
loss_critic                       1195.289801
loss_actor                        -281.018609
memory_size                        556589.904 

=== epoch 7/10 ===== round 16/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:47,  4.88it/s]
100%|██████████| 2000/2000 [06:25<00:00,  5.19it/s]
episodes                                   66
episode_length                     146.606061
returns                            -50.535213
return_std                         108.385819
average_reward                      -0.346803
round_time             0 days 00:06:26.225875
episodes_test                            19.0
episode_length_test                524.210526
returns_test                       398.702061
return_std_test                    345.930279
average_reward_test                  0.758146
round_time_test        0 days 00:00:10.763045
round_time_total       0 days 00:06:26.227001
loss_total                         898.026451
loss_critic                       1192.697482
loss_actor                         -280.65775
memory_size                        558369.605 

=== epoch 7/10 ===== round 17/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:23,  4.49it/s]
100%|██████████| 2000/2000 [06:25<00:00,  5.19it/s]
episodes                                   67
episode_length                      144.58209
returns                            -47.788135
return_std                         103.143321
average_reward                      -0.330548
round_time             0 days 00:06:25.691150
episodes_test                            11.0
episode_length_test                     845.0
returns_test                       576.276254
return_std_test                     282.36509
average_reward_test                  0.680726
round_time_test        0 days 00:00:10.991621
round_time_total       0 days 00:06:25.692283
loss_total                         910.292005
loss_critic                       1207.946222
loss_actor                         -280.32494
memory_size                       560231.3665 

=== epoch 7/10 ===== round 18/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:41,  4.96it/s]
100%|██████████| 2000/2000 [06:23<00:00,  5.22it/s]
episodes                                   54
episode_length                     161.722222
returns                            -54.204613
return_std                         113.970983
average_reward                       -0.32808
round_time             0 days 00:06:23.677900
episodes_test                            11.0
episode_length_test                836.636364
returns_test                       573.682807
return_std_test                    255.363113
average_reward_test                  0.680819
round_time_test        0 days 00:00:10.832924
round_time_total       0 days 00:06:23.679188
loss_total                         918.529897
loss_critic                       1218.393207
loss_actor                        -280.923429
memory_size                       562050.1965 

=== epoch 7/10 ===== round 19/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:25,  5.16it/s]
100%|██████████| 2000/2000 [06:26<00:00,  5.17it/s]
episodes                                   65
episode_length                          137.6
returns                            -41.982585
return_std                         104.268442
average_reward                      -0.319847
round_time             0 days 00:06:27.317866
episodes_test                            15.0
episode_length_test                621.466667
returns_test                       365.733647
return_std_test                    263.750887
average_reward_test                  0.587983
round_time_test        0 days 00:00:10.934121
round_time_total       0 days 00:06:27.318982
loss_total                          911.58683
loss_critic                       1209.685507
loss_actor                        -280.807957
memory_size                        563739.362 

=== epoch 7/10 ===== round 20/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<07:30,  4.43it/s]
100%|██████████| 2000/2000 [06:27<00:00,  5.16it/s]
episodes                                   61
episode_length                     156.918033
returns                            -53.809295
return_std                         124.481873
average_reward                      -0.350211
round_time             0 days 00:06:28.343622
episodes_test                            17.0
episode_length_test                551.176471
returns_test                       402.088067
return_std_test                    328.689834
average_reward_test                  0.721889
round_time_test        0 days 00:00:10.930543
round_time_total       0 days 00:06:28.344725
loss_total                           896.2225
loss_critic                       1190.676175
loss_actor                        -281.592281
memory_size                       565495.4695 

=== epoch 7/10 ===== round 21/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:38,  4.34it/s]
100%|██████████| 2000/2000 [06:29<00:00,  5.13it/s]
episodes                                   69
episode_length                     131.405797
returns                            -45.818828
return_std                         111.695598
average_reward                      -0.347795
round_time             0 days 00:06:30.365022
episodes_test                            15.0
episode_length_test                606.533333
returns_test                       407.262593
return_std_test                    249.496284
average_reward_test                  0.655599
round_time_test        0 days 00:00:10.879006
round_time_total       0 days 00:06:30.366536
loss_total                         895.678229
loss_critic                       1190.094809
loss_actor                        -281.988167
memory_size                        567304.768 

=== epoch 7/10 ===== round 22/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:29,  5.11it/s]
100%|██████████| 2000/2000 [06:30<00:00,  5.12it/s]
episodes                                   70
episode_length                     141.771429
returns                            -49.463298
return_std                         122.203713
average_reward                      -0.347621
round_time             0 days 00:06:31.311008
episodes_test                            18.0
episode_length_test                     551.5
returns_test                       380.072632
return_std_test                    322.469951
average_reward_test                  0.687118
round_time_test        0 days 00:00:10.648750
round_time_total       0 days 00:06:31.312114
loss_total                         903.464639
loss_critic                       1199.944106
loss_actor                        -282.453308
memory_size                        569085.275 

=== epoch 7/10 ===== round 23/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<07:01,  4.73it/s]
100%|██████████| 2000/2000 [06:30<00:00,  5.12it/s]
episodes                                   76
episode_length                     124.828947
returns                            -45.525283
return_std                         115.949124
average_reward                      -0.367265
round_time             0 days 00:06:31.521901
episodes_test                            10.0
episode_length_test                     901.4
returns_test                       630.731857
return_std_test                     265.09738
average_reward_test                  0.695322
round_time_test        0 days 00:00:10.858316
round_time_total       0 days 00:06:31.523019
loss_total                         916.512963
loss_critic                       1216.279784
loss_actor                        -282.554403
memory_size                       570799.1635 

=== epoch 7/10 ===== round 24/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<06:38,  5.00it/s]
100%|██████████| 2000/2000 [06:30<00:00,  5.12it/s]
episodes                                   59
episode_length                     164.254237
returns                            -60.452613
return_std                         129.585076
average_reward                      -0.370367
round_time             0 days 00:06:31.407742
episodes_test                            19.0
episode_length_test                510.473684
returns_test                       367.705254
return_std_test                    328.107749
average_reward_test                  0.724403
round_time_test        0 days 00:00:10.674349
round_time_total       0 days 00:06:31.408967
loss_total                         907.184154
loss_critic                       1204.574343
loss_actor                        -282.376681
memory_size                       572661.4425 

=== epoch 7/10 ===== round 25/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<07:02,  4.72it/s]
100%|██████████| 2000/2000 [06:32<00:00,  5.10it/s]
episodes                                   61
episode_length                     145.065574
returns                            -51.736791
return_std                         116.979916
average_reward                      -0.362702
round_time             0 days 00:06:32.634320
episodes_test                            12.0
episode_length_test                810.166667
returns_test                       692.383388
return_std_test                    313.339807
average_reward_test                   0.85557
round_time_test        0 days 00:00:10.816965
round_time_total       0 days 00:06:32.635403
loss_total                         904.291142
loss_critic                       1200.845671
loss_actor                        -281.927052
memory_size                        574495.146 

=== epoch 7/10 ===== round 26/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:52,  4.83it/s]
100%|██████████| 2000/2000 [06:34<00:00,  5.08it/s]
episodes                                   58
episode_length                     145.068966
returns                              -49.3078
return_std                         119.910925
average_reward                      -0.343249
round_time             0 days 00:06:34.559924
episodes_test                            17.0
episode_length_test                565.823529
returns_test                       380.318558
return_std_test                    302.155435
average_reward_test                  0.668423
round_time_test        0 days 00:00:10.841582
round_time_total       0 days 00:06:34.561239
loss_total                         908.685678
loss_critic                       1206.395177
loss_actor                          -282.1524
memory_size                       576230.0205 

=== epoch 7/10 ===== round 27/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<07:17,  4.56it/s]
100%|██████████| 2000/2000 [06:32<00:00,  5.09it/s]
episodes                                   59
episode_length                     157.050847
returns                            -54.351794
return_std                         117.744922
average_reward                      -0.343958
round_time             0 days 00:06:33.582906
episodes_test                            14.0
episode_length_test                650.642857
returns_test                       496.487099
return_std_test                    325.302485
average_reward_test                  0.775326
round_time_test        0 days 00:00:10.798633
round_time_total       0 days 00:06:33.584009
loss_total                         896.758913
loss_critic                       1191.636003
loss_actor                        -282.749533
memory_size                       578056.7035 

=== epoch 7/10 ===== round 28/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:36,  5.03it/s]
100%|██████████| 2000/2000 [06:33<00:00,  5.09it/s]
episodes                                   50
episode_length                         180.94
returns                            -62.636303
return_std                         129.829792
average_reward                      -0.347484
round_time             0 days 00:06:33.799211
episodes_test                            15.0
episode_length_test                     658.8
returns_test                       459.010658
return_std_test                    303.150475
average_reward_test                  0.692757
round_time_test        0 days 00:00:11.031429
round_time_total       0 days 00:06:33.800674
loss_total                         891.195943
loss_critic                       1184.922576
loss_actor                        -283.710667
memory_size                        579922.912 

=== epoch 7/10 ===== round 29/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<07:17,  4.55it/s]
100%|██████████| 2000/2000 [06:36<00:00,  5.04it/s]
episodes                                   60
episode_length                     164.683333
returns                            -56.249908
return_std                         122.617429
average_reward                      -0.339828
round_time             0 days 00:06:37.376632
episodes_test                            15.0
episode_length_test                     666.0
returns_test                        462.71299
return_std_test                    303.340468
average_reward_test                  0.694031
round_time_test        0 days 00:00:10.692958
round_time_total       0 days 00:06:37.377763
loss_total                         903.172954
loss_critic                       1199.972435
loss_actor                        -284.025048
memory_size                        581724.732 

=== epoch 7/10 ===== round 30/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:34,  4.38it/s]
100%|██████████| 2000/2000 [06:35<00:00,  5.06it/s]
episodes                                   57
episode_length                     157.350877
returns                            -51.611805
return_std                         115.538782
average_reward                      -0.331407
round_time             0 days 00:06:36.019194
episodes_test                            14.0
episode_length_test                697.428571
returns_test                       573.360231
return_std_test                    347.778531
average_reward_test                  0.812301
round_time_test        0 days 00:00:10.981802
round_time_total       0 days 00:06:36.020419
loss_total                          912.08013
loss_critic                       1210.975969
loss_actor                        -283.503312
memory_size                        583483.811 

=== epoch 7/10 ===== round 31/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:13,  4.60it/s]
100%|██████████| 2000/2000 [06:39<00:00,  5.01it/s]
episodes                                   55
episode_length                     173.818182
returns                            -60.053606
return_std                         119.244543
average_reward                      -0.349089
round_time             0 days 00:06:39.680201
episodes_test                            19.0
episode_length_test                502.105263
returns_test                       427.590189
return_std_test                     328.07731
average_reward_test                  0.842875
round_time_test        0 days 00:00:10.778296
round_time_total       0 days 00:06:39.681293
loss_total                         905.120831
loss_critic                       1202.497926
loss_actor                         -284.38763
memory_size                       585291.2205 

=== epoch 7/10 ===== round 32/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:45,  4.91it/s]
100%|██████████| 2000/2000 [06:36<00:00,  5.05it/s]
episodes                                   66
episode_length                     139.666667
returns                            -49.583792
return_std                         109.691183
average_reward                      -0.360101
round_time             0 days 00:06:36.918578
episodes_test                            10.0
episode_length_test                     910.1
returns_test                       790.636383
return_std_test                     195.28615
average_reward_test                  0.859618
round_time_test        0 days 00:00:10.685038
round_time_total       0 days 00:06:36.919659
loss_total                         887.629782
loss_critic                        1180.57324
loss_actor                        -284.144131
memory_size                        587062.662 

=== epoch 7/10 ===== round 33/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:48,  4.88it/s]
100%|██████████| 2000/2000 [06:38<00:00,  5.03it/s]
episodes                                   65
episode_length                     150.938462
returns                            -51.545965
return_std                         114.004196
average_reward                       -0.33905
round_time             0 days 00:06:38.511899
episodes_test                            17.0
episode_length_test                566.411765
returns_test                       338.162909
return_std_test                    291.767824
average_reward_test                  0.608906
round_time_test        0 days 00:00:10.830697
round_time_total       0 days 00:06:38.513202
loss_total                         918.579993
loss_critic                       1219.309607
loss_actor                        -284.338542
memory_size                        588834.483 

=== epoch 7/10 ===== round 34/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:02,  4.72it/s]
100%|██████████| 2000/2000 [06:38<00:00,  5.02it/s]
episodes                                   67
episode_length                     137.656716
returns                            -50.183167
return_std                         104.811342
average_reward                      -0.371802
round_time             0 days 00:06:39.019873
episodes_test                            11.0
episode_length_test                     846.0
returns_test                       613.226404
return_std_test                    245.279013
average_reward_test                  0.723174
round_time_test        0 days 00:00:11.119938
round_time_total       0 days 00:06:39.020991
loss_total                         904.998725
loss_critic                       1202.314067
loss_actor                        -284.262724
memory_size                       590528.0725 

=== epoch 7/10 ===== round 35/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 9/2000 [00:01<06:49,  4.86it/s]
100%|██████████| 2000/2000 [06:36<00:00,  5.04it/s]
episodes                                   67
episode_length                     135.283582
returns                            -49.220687
return_std                         104.989237
average_reward                       -0.36901
round_time             0 days 00:06:37.247248
episodes_test                            17.0
episode_length_test                584.352941
returns_test                       426.955086
return_std_test                    321.662829
average_reward_test                  0.729239
round_time_test        0 days 00:00:10.801598
round_time_total       0 days 00:06:37.248647
loss_total                          913.36515
loss_critic                       1212.548949
loss_actor                        -283.370127
memory_size                         592333.41 

=== epoch 7/10 ===== round 36/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:58,  4.76it/s]
100%|██████████| 2000/2000 [06:39<00:00,  5.00it/s]
episodes                                   61
episode_length                     150.770492
returns                            -52.511508
return_std                         112.812014
average_reward                      -0.357821
round_time             0 days 00:06:40.300069
episodes_test                            10.0
episode_length_test                     908.7
returns_test                       650.334871
return_std_test                    197.501843
average_reward_test                  0.700442
round_time_test        0 days 00:00:10.652294
round_time_total       0 days 00:06:40.301227
loss_total                         931.872055
loss_critic                       1235.737711
loss_actor                        -283.590643
memory_size                       594208.5965 

=== epoch 7/10 ===== round 37/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.01it/s]
episodes                                   50
episode_length                         176.54
returns                            -59.328749
return_std                         118.352266
average_reward                      -0.343327
round_time             0 days 00:06:40.140999
episodes_test                            15.0
episode_length_test                     638.0
returns_test                       458.630771
return_std_test                    325.906483
average_reward_test                  0.718674
round_time_test        0 days 00:00:10.786534
round_time_total       0 days 00:06:40.142088
loss_total                         911.549665
loss_critic                       1210.340327
loss_actor                        -283.613066
memory_size                        596057.361 

=== epoch 7/10 ===== round 38/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.01it/s]
episodes                                   51
episode_length                     190.921569
returns                            -65.775162
return_std                         127.547177
average_reward                      -0.347211
round_time             0 days 00:06:40.138083
episodes_test                            13.0
episode_length_test                696.923077
returns_test                       456.607625
return_std_test                    225.490972
average_reward_test                  0.650949
round_time_test        0 days 00:00:10.797409
round_time_total       0 days 00:06:40.139314
loss_total                         914.912343
loss_critic                       1214.802439
loss_actor                        -284.648123
memory_size                       597933.3045 

=== epoch 7/10 ===== round 39/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:42<00:00,  4.97it/s]
episodes                                   50
episode_length                         192.98
returns                            -64.636446
return_std                         128.427106
average_reward                      -0.332537
round_time             0 days 00:06:42.576478
episodes_test                            14.0
episode_length_test                707.214286
returns_test                       517.559973
return_std_test                    270.632668
average_reward_test                  0.732958
round_time_test        0 days 00:00:10.811104
round_time_total       0 days 00:06:42.577550
loss_total                         913.473316
loss_critic                       1213.128113
loss_actor                        -285.145944
memory_size                        599764.849 

=== epoch 7/10 ===== round 40/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:44<00:00,  4.94it/s]
episodes                                   53
episode_length                     171.207547
returns                            -56.344897
return_std                         116.862262
average_reward                      -0.338765
round_time             0 days 00:06:45.475715
episodes_test                            14.0
episode_length_test                681.571429
returns_test                       465.203611
return_std_test                    304.512391
average_reward_test                  0.671267
round_time_test        0 days 00:00:10.961746
round_time_total       0 days 00:06:45.477014
loss_total                         914.574762
loss_critic                       1214.498308
loss_actor                        -285.119506
memory_size                        601467.898 

=== epoch 7/10 ===== round 41/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:42<00:00,  4.97it/s]
episodes                                   50
episode_length                         196.58
returns                            -66.161815
return_std                         130.759403
average_reward                      -0.337742
round_time             0 days 00:06:42.798321
episodes_test                            14.0
episode_length_test                     687.0
returns_test                       496.623129
return_std_test                     308.84248
average_reward_test                  0.732503
round_time_test        0 days 00:00:10.906444
round_time_total       0 days 00:06:42.799493
loss_total                         912.972207
loss_critic                       1212.730281
loss_actor                        -286.060177
memory_size                       603355.4135 

=== epoch 7/10 ===== round 42/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:44<00:00,  4.95it/s]
episodes                                   61
episode_length                     142.606557
returns                            -48.411802
return_std                         106.403715
average_reward                      -0.346936
round_time             0 days 00:06:44.736095
episodes_test                            20.0
episode_length_test                     481.7
returns_test                       315.233563
return_std_test                    294.441101
average_reward_test                  0.645113
round_time_test        0 days 00:00:11.040488
round_time_total       0 days 00:06:44.737373
loss_total                         926.368889
loss_critic                       1229.356707
loss_actor                        -285.582462
memory_size                       605063.2495 

=== epoch 7/10 ===== round 43/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:42<00:00,  4.97it/s]
episodes                                   67
episode_length                     136.208955
returns                            -47.207784
return_std                         102.823644
average_reward                      -0.349113
round_time             0 days 00:06:43.202547
episodes_test                            16.0
episode_length_test                  570.6875
returns_test                       441.827969
return_std_test                    320.470285
average_reward_test                  0.786049
round_time_test        0 days 00:00:10.973808
round_time_total       0 days 00:06:43.203698
loss_total                          923.75095
loss_critic                       1225.996462
loss_actor                        -285.231184
memory_size                       606826.5505 

=== epoch 7/10 ===== round 44/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:44<00:00,  4.95it/s]
episodes                                   54
episode_length                     171.777778
returns                            -62.093841
return_std                         127.129617
average_reward                      -0.357646
round_time             0 days 00:06:44.796154
episodes_test                            16.0
episode_length_test                   598.625
returns_test                       336.297313
return_std_test                    248.517687
average_reward_test                  0.573614
round_time_test        0 days 00:00:11.153353
round_time_total       0 days 00:06:44.797482
loss_total                         912.858017
loss_critic                       1212.559817
loss_actor                        -285.949272
memory_size                         608683.83 

=== epoch 7/10 ===== round 45/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:44<00:00,  4.94it/s]
episodes                                   58
episode_length                     162.275862
returns                            -55.514084
return_std                         123.551986
average_reward                      -0.342518
round_time             0 days 00:06:45.315160
episodes_test                            12.0
episode_length_test                782.416667
returns_test                       471.919296
return_std_test                    244.174815
average_reward_test                  0.604671
round_time_test        0 days 00:00:11.056748
round_time_total       0 days 00:06:45.316259
loss_total                         915.222637
loss_critic                       1215.601928
loss_actor                        -286.294611
memory_size                       610529.1845 

=== epoch 7/10 ===== round 46/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:44<00:00,  4.94it/s]
episodes                                   73
episode_length                     136.041096
returns                            -46.906523
return_std                         112.412398
average_reward                      -0.344673
round_time             0 days 00:06:45.460837
episodes_test                            16.0
episode_length_test                  617.0625
returns_test                        499.64558
return_std_test                    290.551125
average_reward_test                  0.804561
round_time_test        0 days 00:00:10.824389
round_time_total       0 days 00:06:45.462114
loss_total                         919.522636
loss_critic                       1221.023608
loss_actor                        -286.481335
memory_size                        612207.653 

=== epoch 7/10 ===== round 47/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:46<00:00,  4.91it/s]
episodes                                   85
episode_length                     110.458824
returns                            -36.886725
return_std                          96.976784
average_reward                       -0.34013
round_time             0 days 00:06:47.470745
episodes_test                            15.0
episode_length_test                634.533333
returns_test                       440.561375
return_std_test                    318.388908
average_reward_test                  0.691795
round_time_test        0 days 00:00:10.915690
round_time_total       0 days 00:06:47.471895
loss_total                          910.73538
loss_critic                       1209.968066
loss_actor                        -286.195445
memory_size                         613733.03 

=== epoch 7/10 ===== round 48/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:45<00:00,  4.93it/s]
episodes                                   89
episode_length                     104.235955
returns                             -35.17889
return_std                          96.289223
average_reward                      -0.335625
round_time             0 days 00:06:45.936811
episodes_test                            16.0
episode_length_test                  607.8125
returns_test                        373.95671
return_std_test                    251.797593
average_reward_test                  0.622432
round_time_test        0 days 00:00:11.108438
round_time_total       0 days 00:06:45.937985
loss_total                         916.369495
loss_critic                       1217.138282
loss_actor                        -286.705739
memory_size                        615394.963 

=== epoch 7/10 ===== round 49/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:44<00:00,  4.95it/s]
episodes                                   96
episode_length                      93.145833
returns                            -27.020902
return_std                          78.852649
average_reward                       -0.29459
round_time             0 days 00:06:44.907429
episodes_test                            12.0
episode_length_test                769.083333
returns_test                       592.181999
return_std_test                    270.546124
average_reward_test                  0.750092
round_time_test        0 days 00:00:10.823455
round_time_total       0 days 00:06:44.908796
loss_total                         902.190553
loss_critic                       1199.466666
loss_actor                        -286.913982
memory_size                       617100.2425 

=== epoch 7/10 ===== round 50/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:45<00:00,  4.93it/s]
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
episodes                                   95
episode_length                     102.336842
returns                            -31.295053
return_std                          82.087362
average_reward                      -0.302048
round_time             0 days 00:06:45.882972
episodes_test                            11.0
episode_length_test                849.636364
returns_test                       619.505281
return_std_test                    242.714974
average_reward_test                  0.730955
round_time_test        0 days 00:00:10.715370
round_time_total       0 days 00:06:45.884122
loss_total                         928.727022
loss_critic                       1232.410298
loss_actor                        -286.006164
memory_size                         618869.92 


<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
=== epoch 8/10 ===== round 1/50 ======================================
100%|██████████| 2000/2000 [06:03<00:00,  5.50it/s]
episodes                                   19
episode_length                     104.157895
returns                            -36.264423
return_std                          83.569383
average_reward                      -0.347058
round_time             0 days 00:06:03.566877
episodes_test                            13.0
episode_length_test                733.461538
returns_test                       539.532378
return_std_test                    265.381687
average_reward_test                  0.739632
round_time_test        0 days 00:00:10.840846
round_time_total       0 days 00:06:03.568185
loss_total                         920.961247
loss_critic                       1222.875426
loss_actor                        -286.695553
memory_size                        620675.046 

=== epoch 8/10 ===== round 2/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:06<00:00,  5.46it/s]
episodes                                   35
episode_length                     114.228571
returns                             -39.73068
return_std                          99.836072
average_reward                      -0.347428
round_time             0 days 00:06:06.530133
episodes_test                            14.0
episode_length_test                702.928571
returns_test                       503.176308
return_std_test                     304.88074
average_reward_test                  0.713545
round_time_test        0 days 00:00:10.773964
round_time_total       0 days 00:06:06.531289
loss_total                         928.173665
loss_critic                       1231.854502
loss_actor                        -286.549762
memory_size                       622344.3635 

=== epoch 8/10 ===== round 3/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:06<00:00,  5.46it/s]
episodes                                   53
episode_length                      97.226415
returns                            -32.161673
return_std                           83.19935
average_reward                      -0.347577
round_time             0 days 00:06:07.203927
episodes_test                            12.0
episode_length_test                     824.0
returns_test                       680.356835
return_std_test                     283.14396
average_reward_test                  0.826058
round_time_test        0 days 00:00:10.779254
round_time_total       0 days 00:06:07.205266
loss_total                         923.910104
loss_critic                       1226.456139
loss_actor                        -286.274125
memory_size                        624003.378 

=== epoch 8/10 ===== round 4/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:12<00:00,  5.37it/s]
episodes                                   70
episode_length                     114.228571
returns                            -40.813271
return_std                           96.21895
average_reward                      -0.357582
round_time             0 days 00:06:13.069284
episodes_test                            12.0
episode_length_test                    771.75
returns_test                       691.441544
return_std_test                    367.630829
average_reward_test                  0.916077
round_time_test        0 days 00:00:11.085598
round_time_total       0 days 00:06:13.070395
loss_total                         924.419368
loss_critic                       1226.862841
loss_actor                        -285.354615
memory_size                       625733.9605 

=== epoch 8/10 ===== round 5/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:13<00:00,  5.36it/s]
episodes                                   81
episode_length                     119.197531
returns                            -42.681508
return_std                         100.913468
average_reward                      -0.354821
round_time             0 days 00:06:13.876346
episodes_test                            16.0
episode_length_test                    618.25
returns_test                       501.937379
return_std_test                    316.840473
average_reward_test                  0.812076
round_time_test        0 days 00:00:10.781407
round_time_total       0 days 00:06:13.877429
loss_total                         928.115488
loss_critic                       1231.594827
loss_actor                        -285.801956
memory_size                        627546.945 

=== epoch 8/10 ===== round 6/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:14<00:00,  5.35it/s]
episodes                                   64
episode_length                          151.5
returns                            -54.220639
return_std                         116.503886
average_reward                      -0.355292
round_time             0 days 00:06:14.653888
episodes_test                            16.0
episode_length_test                  593.0625
returns_test                        417.72693
return_std_test                    328.355684
average_reward_test                  0.693992
round_time_test        0 days 00:00:10.758237
round_time_total       0 days 00:06:14.655299
loss_total                         947.015645
loss_critic                       1255.281006
loss_actor                        -286.045891
memory_size                       629434.2075 

=== epoch 8/10 ===== round 7/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:14<00:00,  5.34it/s]
episodes                                   54
episode_length                     183.740741
returns                            -65.995127
return_std                         124.921585
average_reward                       -0.35775
round_time             0 days 00:06:15.039082
episodes_test                            14.0
episode_length_test                     645.5
returns_test                       511.751273
return_std_test                    300.120367
average_reward_test                   0.80648
round_time_test        0 days 00:00:10.865703
round_time_total       0 days 00:06:15.040186
loss_total                         944.210333
loss_critic                       1251.821672
loss_actor                        -286.235118
memory_size                        631363.298 

=== epoch 8/10 ===== round 8/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:17<00:00,  5.29it/s]
episodes                                   50
episode_length                         185.18
returns                            -65.892189
return_std                         130.751302
average_reward                      -0.350811
round_time             0 days 00:06:18.456897
episodes_test                            12.0
episode_length_test                     778.0
returns_test                       590.774214
return_std_test                    262.670568
average_reward_test                  0.775959
round_time_test        0 days 00:00:10.743545
round_time_total       0 days 00:06:18.457992
loss_total                         932.233495
loss_critic                       1236.893703
loss_actor                        -286.407423
memory_size                       633197.6035 

=== epoch 8/10 ===== round 9/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:21<00:00,  5.25it/s]
episodes                                   44
episode_length                     222.522727
returns                            -78.593153
return_std                         140.933674
average_reward                      -0.350612
round_time             0 days 00:06:21.601656
episodes_test                            14.0
episode_length_test                703.285714
returns_test                       494.635298
return_std_test                    280.461776
average_reward_test                  0.708885
round_time_test        0 days 00:00:10.805629
round_time_total       0 days 00:06:21.602748
loss_total                          928.62698
loss_critic                       1232.291306
loss_actor                        -286.030413
memory_size                        634973.723 

=== epoch 8/10 ===== round 10/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:18<00:00,  5.28it/s]
episodes                                   39
episode_length                     219.794872
returns                             -76.74331
return_std                         136.495162
average_reward                      -0.354121
round_time             0 days 00:06:19.610358
episodes_test                            14.0
episode_length_test                684.285714
returns_test                       403.061426
return_std_test                     211.42748
average_reward_test                  0.596895
round_time_test        0 days 00:00:10.787435
round_time_total       0 days 00:06:19.611841
loss_total                          915.89679
loss_critic                         1216.4961
loss_actor                        -286.500539
memory_size                        636827.703 

=== epoch 8/10 ===== round 11/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:21<00:00,  5.24it/s]
episodes                                   59
episode_length                     155.694915
returns                             -52.52514
return_std                         108.323376
average_reward                      -0.340863
round_time             0 days 00:06:22.466442
episodes_test                            13.0
episode_length_test                733.692308
returns_test                       449.629666
return_std_test                    259.616265
average_reward_test                  0.613841
round_time_test        0 days 00:00:10.972001
round_time_total       0 days 00:06:22.467922
loss_total                         906.593553
loss_critic                       1204.947145
loss_actor                        -286.820889
memory_size                        638564.105 

=== epoch 8/10 ===== round 12/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:21<00:00,  5.24it/s]
episodes                                   65
episode_length                     152.584615
returns                            -50.217157
return_std                         105.883214
average_reward                      -0.328842
round_time             0 days 00:06:22.337269
episodes_test                            15.0
episode_length_test                624.466667
returns_test                       459.981237
return_std_test                    335.460833
average_reward_test                  0.731634
round_time_test        0 days 00:00:10.987277
round_time_total       0 days 00:06:22.338444
loss_total                          921.16284
loss_critic                       1223.267655
loss_actor                        -287.256498
memory_size                        640294.705 

=== epoch 8/10 ===== round 13/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:24<00:00,  5.21it/s]
episodes                                   72
episode_length                     131.805556
returns                            -41.217251
return_std                          91.387003
average_reward                      -0.318084
round_time             0 days 00:06:24.723037
episodes_test                            12.0
episode_length_test                787.916667
returns_test                       524.293162
return_std_test                    268.269625
average_reward_test                  0.675332
round_time_test        0 days 00:00:10.784902
round_time_total       0 days 00:06:24.724123
loss_total                         929.617778
loss_critic                       1233.787463
loss_actor                        -287.061041
memory_size                       641971.3885 

=== epoch 8/10 ===== round 14/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:26<00:00,  5.18it/s]
episodes                                   87
episode_length                     104.264368
returns                            -29.692402
return_std                          78.720128
average_reward                      -0.292596
round_time             0 days 00:06:26.586765
episodes_test                            16.0
episode_length_test                  580.6875
returns_test                       437.667727
return_std_test                    322.923551
average_reward_test                  0.752123
round_time_test        0 days 00:00:11.089526
round_time_total       0 days 00:06:26.587862
loss_total                         933.060312
loss_critic                       1238.177161
loss_actor                        -287.407156
memory_size                       643729.8755 

=== epoch 8/10 ===== round 15/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:26<00:00,  5.18it/s]
episodes                                   84
episode_length                      105.02381
returns                            -29.927319
return_std                          80.212456
average_reward                      -0.291054
round_time             0 days 00:06:26.875692
episodes_test                            14.0
episode_length_test                690.214286
returns_test                       453.583969
return_std_test                      316.4721
average_reward_test                  0.654652
round_time_test        0 days 00:00:11.033230
round_time_total       0 days 00:06:26.876845
loss_total                         928.589648
loss_critic                       1232.593718
loss_actor                        -287.426714
memory_size                        645402.428 

=== epoch 8/10 ===== round 16/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:25<00:00,  5.19it/s]
episodes                                   72
episode_length                     135.902778
returns                            -40.111347
return_std                         100.353034
average_reward                      -0.295166
round_time             0 days 00:06:25.858184
episodes_test                            15.0
episode_length_test                     660.4
returns_test                       485.295996
return_std_test                    307.285909
average_reward_test                  0.735183
round_time_test        0 days 00:00:10.835946
round_time_total       0 days 00:06:25.859299
loss_total                         938.202568
loss_critic                       1244.661632
loss_actor                        -287.633772
memory_size                       647279.4185 

=== epoch 8/10 ===== round 17/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:29<00:00,  5.14it/s]
episodes                                   70
episode_length                     135.614286
returns                            -40.583583
return_std                         102.900836
average_reward                      -0.298787
round_time             0 days 00:06:29.815502
episodes_test                            13.0
episode_length_test                725.230769
returns_test                       560.654442
return_std_test                    318.040411
average_reward_test                  0.768406
round_time_test        0 days 00:00:11.002978
round_time_total       0 days 00:06:29.816605
loss_total                         932.441355
loss_critic                       1237.436438
loss_actor                        -287.539057
memory_size                        649122.244 

=== epoch 8/10 ===== round 18/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:27<00:00,  5.17it/s]
episodes                                   52
episode_length                     166.192308
returns                            -52.654893
return_std                         116.585281
average_reward                      -0.321444
round_time             0 days 00:06:27.734418
episodes_test                            13.0
episode_length_test                754.538462
returns_test                       516.633585
return_std_test                     286.46316
average_reward_test                  0.680361
round_time_test        0 days 00:00:10.881046
round_time_total       0 days 00:06:27.735533
loss_total                          923.73362
loss_critic                       1226.568817
loss_actor                         -287.60726
memory_size                       650986.9565 

=== epoch 8/10 ===== round 19/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:30<00:00,  5.12it/s]
episodes                                   37
episode_length                     242.810811
returns                            -81.646107
return_std                         142.022701
average_reward                      -0.343273
round_time             0 days 00:06:30.808549
episodes_test                            11.0
episode_length_test                865.818182
returns_test                       590.710513
return_std_test                    213.038142
average_reward_test                  0.672477
round_time_test        0 days 00:00:10.920081
round_time_total       0 days 00:06:30.809664
loss_total                          936.06041
loss_critic                       1242.080146
loss_actor                        -288.018624
memory_size                        652920.158 

=== epoch 8/10 ===== round 20/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:25<00:00,  5.19it/s]
episodes                                   38
episode_length                     260.605263
returns                            -90.687012
return_std                         147.475391
average_reward                      -0.343964
round_time             0 days 00:06:26.211426
episodes_test                            12.0
episode_length_test                795.416667
returns_test                       539.125862
return_std_test                    214.175019
average_reward_test                  0.678737
round_time_test        0 days 00:00:10.892181
round_time_total       0 days 00:06:26.212527
loss_total                         940.566547
loss_critic                       1247.891772
loss_actor                        -288.734438
memory_size                       654766.2375 

=== epoch 8/10 ===== round 21/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:27<00:00,  5.16it/s]
episodes                                   51
episode_length                     178.862745
returns                             -58.88808
return_std                         125.270563
average_reward                      -0.334641
round_time             0 days 00:06:28.535125
episodes_test                            12.0
episode_length_test                791.166667
returns_test                       601.537385
return_std_test                    285.635254
average_reward_test                   0.74837
round_time_test        0 days 00:00:10.806568
round_time_total       0 days 00:06:28.536266
loss_total                         918.903375
loss_critic                       1220.774887
loss_actor                        -288.582751
memory_size                        656468.444 

=== epoch 8/10 ===== round 22/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:31<00:00,  5.11it/s]
episodes                                   61
episode_length                     139.098361
returns                            -45.401133
return_std                          111.23605
average_reward                      -0.334505
round_time             0 days 00:06:32.243639
episodes_test                            15.0
episode_length_test                645.533333
returns_test                       524.906503
return_std_test                    362.909515
average_reward_test                  0.815933
round_time_test        0 days 00:00:10.856909
round_time_total       0 days 00:06:32.244753
loss_total                         918.698276
loss_critic                       1220.548733
loss_actor                        -288.703627
memory_size                       658056.1745 

=== epoch 8/10 ===== round 23/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:31<00:00,  5.11it/s]
episodes                                   68
episode_length                     133.985294
returns                            -43.990698
return_std                         103.866658
average_reward                      -0.338143
round_time             0 days 00:06:32.008866
episodes_test                            14.0
episode_length_test                     688.5
returns_test                       499.098986
return_std_test                    324.616939
average_reward_test                   0.71927
round_time_test        0 days 00:00:10.924138
round_time_total       0 days 00:06:32.010194
loss_total                         925.931869
loss_critic                       1229.476286
loss_actor                        -288.245887
memory_size                       659893.5405 

=== epoch 8/10 ===== round 24/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:33<00:00,  5.08it/s]
episodes                                   67
episode_length                     135.940299
returns                            -45.400284
return_std                          108.41269
average_reward                      -0.338589
round_time             0 days 00:06:34.167471
episodes_test                            13.0
episode_length_test                736.384615
returns_test                       592.176647
return_std_test                    323.645963
average_reward_test                  0.804011
round_time_test        0 days 00:00:10.824187
round_time_total       0 days 00:06:34.168572
loss_total                         921.939852
loss_critic                       1224.787119
loss_actor                        -289.449302
memory_size                        661741.552 

=== epoch 8/10 ===== round 25/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:32<00:00,  5.09it/s]
episodes                                   66
episode_length                     150.560606
returns                            -52.805655
return_std                         121.070633
average_reward                      -0.348193
round_time             0 days 00:06:33.379151
episodes_test                            10.0
episode_length_test                     901.7
returns_test                       576.248567
return_std_test                     195.57611
average_reward_test                  0.637018
round_time_test        0 days 00:00:10.715671
round_time_total       0 days 00:06:33.380260
loss_total                         911.757539
loss_critic                       1212.076462
loss_actor                        -289.518237
memory_size                        663637.455 

=== epoch 8/10 ===== round 26/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:34<00:00,  5.07it/s]
episodes                                   54
episode_length                     184.166667
returns                            -62.665731
return_std                         131.833096
average_reward                      -0.337437
round_time             0 days 00:06:35.203801
episodes_test                            14.0
episode_length_test                695.785714
returns_test                       530.837874
return_std_test                    331.845913
average_reward_test                  0.760154
round_time_test        0 days 00:00:11.094201
round_time_total       0 days 00:06:35.205158
loss_total                         916.604166
loss_critic                       1218.070104
loss_actor                        -289.259665
memory_size                       665516.4335 

=== epoch 8/10 ===== round 27/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:38<00:00,  5.02it/s]
episodes                                   44
episode_length                     213.477273
returns                            -73.501336
return_std                         144.892234
average_reward                      -0.346944
round_time             0 days 00:06:38.663993
episodes_test                            12.0
episode_length_test                816.916667
returns_test                       584.640101
return_std_test                    297.112231
average_reward_test                  0.716874
round_time_test        0 days 00:00:11.013535
round_time_total       0 days 00:06:38.665115
loss_total                         907.468198
loss_critic                       1206.695991
loss_actor                        -289.443056
memory_size                        667293.019 

=== epoch 8/10 ===== round 28/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:38<00:00,  5.02it/s]
episodes                                   44
episode_length                     206.659091
returns                            -68.426942
return_std                         143.159657
average_reward                      -0.334249
round_time             0 days 00:06:38.699828
episodes_test                            14.0
episode_length_test                709.428571
returns_test                       534.131648
return_std_test                    328.702076
average_reward_test                  0.753463
round_time_test        0 days 00:00:10.838076
round_time_total       0 days 00:06:38.700950
loss_total                         919.873864
loss_critic                       1222.709506
loss_actor                        -291.468786
memory_size                        669154.409 

=== epoch 8/10 ===== round 29/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:36<00:00,  5.04it/s]
episodes                                   51
episode_length                     169.411765
returns                            -54.226658
return_std                         121.690822
average_reward                      -0.328282
round_time             0 days 00:06:37.534863
episodes_test                            13.0
episode_length_test                754.384615
returns_test                       513.875009
return_std_test                    286.341004
average_reward_test                  0.683725
round_time_test        0 days 00:00:10.890672
round_time_total       0 days 00:06:37.535977
loss_total                         925.846611
loss_critic                        1230.13922
loss_actor                         -291.32391
memory_size                       670954.1055 

=== epoch 8/10 ===== round 30/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:37<00:00,  5.03it/s]
episodes                                   55
episode_length                     172.654545
returns                            -57.168462
return_std                         122.839975
average_reward                       -0.32049
round_time             0 days 00:06:38.114997
episodes_test                            11.0
episode_length_test                848.181818
returns_test                       616.418627
return_std_test                    284.050854
average_reward_test                  0.738633
round_time_test        0 days 00:00:10.977928
round_time_total       0 days 00:06:38.116496
loss_total                         925.002001
loss_critic                       1229.165053
loss_actor                        -291.650301
memory_size                       672741.9325 

=== epoch 8/10 ===== round 31/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.01it/s]
episodes                                   46
episode_length                      216.23913
returns                            -76.407129
return_std                         135.225567
average_reward                      -0.353825
round_time             0 days 00:06:40.132215
episodes_test                            13.0
episode_length_test                716.076923
returns_test                       484.221418
return_std_test                    285.368324
average_reward_test                  0.676802
round_time_test        0 days 00:00:10.651109
round_time_total       0 days 00:06:40.133319
loss_total                          916.41592
loss_critic                       1218.520764
loss_actor                        -292.003529
memory_size                       674659.3265 

=== epoch 8/10 ===== round 32/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:41<00:00,  4.98it/s]
episodes                                   49
episode_length                     190.408163
returns                            -64.798648
return_std                         119.528352
average_reward                      -0.344027
round_time             0 days 00:06:41.814294
episodes_test                            13.0
episode_length_test                765.769231
returns_test                        562.33141
return_std_test                    267.177221
average_reward_test                  0.735518
round_time_test        0 days 00:00:10.906523
round_time_total       0 days 00:06:41.815397
loss_total                         908.243552
loss_critic                       1208.474789
loss_actor                        -292.681479
memory_size                        676545.068 

=== epoch 8/10 ===== round 33/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:38<00:00,  5.01it/s]
episodes                                   45
episode_length                     203.822222
returns                            -69.189713
return_std                         127.402758
average_reward                      -0.345212
round_time             0 days 00:06:39.483913
episodes_test                            15.0
episode_length_test                652.133333
returns_test                       459.250752
return_std_test                    262.163459
average_reward_test                  0.709494
round_time_test        0 days 00:00:10.846557
round_time_total       0 days 00:06:39.485123
loss_total                         927.051106
loss_critic                       1231.832199
loss_actor                        -292.073357
memory_size                        678389.035 

=== epoch 8/10 ===== round 34/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:40<00:00,  4.99it/s]
episodes                                   36
episode_length                     243.944444
returns                            -82.403422
return_std                         138.975379
average_reward                      -0.347517
round_time             0 days 00:06:41.092591
episodes_test                            10.0
episode_length_test                     927.6
returns_test                        580.68021
return_std_test                     235.84878
average_reward_test                  0.631245
round_time_test        0 days 00:00:10.898354
round_time_total       0 days 00:06:41.093672
loss_total                         898.860352
loss_critic                        1196.61904
loss_actor                        -292.174484
memory_size                        680195.634 

=== epoch 8/10 ===== round 35/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<08:34,  3.88it/s]
100%|██████████| 2000/2000 [06:40<00:00,  4.99it/s]
episodes                                   35
episode_length                     251.885714
returns                            -87.366435
return_std                         149.039579
average_reward                      -0.353613
round_time             0 days 00:06:41.177988
episodes_test                            12.0
episode_length_test                800.166667
returns_test                       608.783893
return_std_test                    289.164422
average_reward_test                  0.753581
round_time_test        0 days 00:00:10.808614
round_time_total       0 days 00:06:41.179095
loss_total                         886.560007
loss_critic                       1181.109011
loss_actor                          -291.6361
memory_size                       682127.8765 

=== epoch 8/10 ===== round 36/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:45,  4.91it/s]
100%|██████████| 2000/2000 [06:40<00:00,  4.99it/s]
episodes                                   37
episode_length                     262.108108
returns                            -91.045616
return_std                         157.492896
average_reward                      -0.342454
round_time             0 days 00:06:41.181307
episodes_test                            17.0
episode_length_test                547.235294
returns_test                       353.532409
return_std_test                    281.794334
average_reward_test                  0.653709
round_time_test        0 days 00:00:10.559388
round_time_total       0 days 00:06:41.182430
loss_total                         922.477155
loss_critic                       1226.094091
loss_actor                        -291.990675
memory_size                       684036.1155 

=== epoch 8/10 ===== round 37/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:47,  4.89it/s]
100%|██████████| 2000/2000 [06:41<00:00,  4.98it/s]
episodes                                   37
episode_length                     237.432432
returns                            -80.772088
return_std                         146.337411
average_reward                      -0.345066
round_time             0 days 00:06:41.900553
episodes_test                            12.0
episode_length_test                812.833333
returns_test                        571.22123
return_std_test                    273.535743
average_reward_test                  0.709056
round_time_test        0 days 00:00:10.954317
round_time_total       0 days 00:06:41.901670
loss_total                         928.701221
loss_critic                       1233.849338
loss_actor                        -291.891335
memory_size                        685888.641 

=== epoch 8/10 ===== round 38/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:14,  4.59it/s]
100%|██████████| 2000/2000 [06:39<00:00,  5.01it/s]
episodes                                   46
episode_length                     214.956522
returns                            -77.872468
return_std                           142.6319
average_reward                      -0.364015
round_time             0 days 00:06:39.896859
episodes_test                            16.0
episode_length_test                   582.875
returns_test                       389.393842
return_std_test                    310.743233
average_reward_test                  0.662721
round_time_test        0 days 00:00:10.741671
round_time_total       0 days 00:06:39.898142
loss_total                         920.073122
loss_critic                       1223.308715
loss_actor                        -292.869336
memory_size                       687666.3535 

=== epoch 8/10 ===== round 39/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<07:27,  4.46it/s]
100%|██████████| 2000/2000 [06:44<00:00,  4.95it/s]
episodes                                   51
episode_length                      183.45098
returns                            -64.582534
return_std                           131.9437
average_reward                      -0.353597
round_time             0 days 00:06:44.927509
episodes_test                            13.0
episode_length_test                721.153846
returns_test                       431.050856
return_std_test                    251.880117
average_reward_test                   0.60678
round_time_test        0 days 00:00:10.879730
round_time_total       0 days 00:06:44.928609
loss_total                         915.447853
loss_critic                       1217.220076
loss_actor                        -291.641122
memory_size                       689352.9555 

=== epoch 8/10 ===== round 40/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:49,  4.86it/s]
100%|██████████| 2000/2000 [06:44<00:00,  4.94it/s]
episodes                                   51
episode_length                     164.960784
returns                            -54.424805
return_std                         117.607988
average_reward                      -0.338625
round_time             0 days 00:06:45.296662
episodes_test                            14.0
episode_length_test                674.214286
returns_test                       480.972263
return_std_test                    280.846675
average_reward_test                  0.715149
round_time_test        0 days 00:00:10.711783
round_time_total       0 days 00:06:45.297785
loss_total                         916.019692
loss_critic                       1218.190771
loss_actor                        -292.664711
memory_size                       691246.2755 

=== epoch 8/10 ===== round 41/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:26,  4.46it/s]
100%|██████████| 2000/2000 [06:43<00:00,  4.96it/s]
episodes                                   61
episode_length                     145.737705
returns                            -49.378338
return_std                         115.399847
average_reward                      -0.339771
round_time             0 days 00:06:43.985122
episodes_test                            13.0
episode_length_test                764.769231
returns_test                       544.679151
return_std_test                    229.354402
average_reward_test                  0.713164
round_time_test        0 days 00:00:10.855065
round_time_total       0 days 00:06:43.986235
loss_total                         904.342805
loss_critic                       1203.694624
loss_actor                        -293.064553
memory_size                       693014.4315 

=== epoch 8/10 ===== round 42/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:30,  5.10it/s]
100%|██████████| 2000/2000 [06:42<00:00,  4.97it/s]
episodes                                   52
episode_length                     177.173077
returns                             -57.81269
return_std                         125.610803
average_reward                      -0.336661
round_time             0 days 00:06:43.062910
episodes_test                            10.0
episode_length_test                     902.7
returns_test                       577.503324
return_std_test                    193.787528
average_reward_test                  0.634126
round_time_test        0 days 00:00:11.149950
round_time_total       0 days 00:06:43.064017
loss_total                         906.897936
loss_critic                       1207.160593
loss_actor                        -294.152778
memory_size                        694918.762 

=== epoch 8/10 ===== round 43/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:23,  4.49it/s]
100%|██████████| 2000/2000 [06:45<00:00,  4.93it/s]
episodes                                   39
episode_length                     248.974359
returns                              -83.8439
return_std                         157.952814
average_reward                        -0.3412
round_time             0 days 00:06:45.913563
episodes_test                            13.0
episode_length_test                754.923077
returns_test                        539.97179
return_std_test                    274.637526
average_reward_test                  0.716455
round_time_test        0 days 00:00:10.654939
round_time_total       0 days 00:06:45.915030
loss_total                         912.551592
loss_critic                       1214.352049
loss_actor                        -294.650316
memory_size                        696862.775 

=== epoch 8/10 ===== round 44/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<07:39,  4.34it/s]
100%|██████████| 2000/2000 [06:45<00:00,  4.93it/s]
episodes                                   40
episode_length                         218.85
returns                            -75.950823
return_std                         148.626611
average_reward                      -0.345269
round_time             0 days 00:06:46.334089
episodes_test                            12.0
episode_length_test                755.166667
returns_test                       472.337627
return_std_test                    236.223363
average_reward_test                  0.615552
round_time_test        0 days 00:00:10.712126
round_time_total       0 days 00:06:46.335189
loss_total                         909.275975
loss_critic                       1210.459966
loss_actor                        -295.460072
memory_size                        698744.145 

=== epoch 8/10 ===== round 45/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<07:24,  4.48it/s]
100%|██████████| 2000/2000 [06:45<00:00,  4.94it/s]
episodes                                   44
episode_length                     226.477273
returns                            -78.974563
return_std                         145.920052
average_reward                      -0.346721
round_time             0 days 00:06:45.759298
episodes_test                            13.0
episode_length_test                765.923077
returns_test                       534.056259
return_std_test                    282.467277
average_reward_test                  0.697762
round_time_test        0 days 00:00:10.846842
round_time_total       0 days 00:06:45.760727
loss_total                         912.074022
loss_critic                       1214.019039
loss_actor                        -295.706127
memory_size                       700519.6265 

=== epoch 8/10 ===== round 46/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:09,  4.64it/s]
100%|██████████| 2000/2000 [06:47<00:00,  4.91it/s]
episodes                                   45
episode_length                     206.288889
returns                            -71.848629
return_std                         134.253139
average_reward                      -0.346819
round_time             0 days 00:06:47.617454
episodes_test                            12.0
episode_length_test                759.583333
returns_test                       527.147135
return_std_test                    286.605728
average_reward_test                  0.669311
round_time_test        0 days 00:00:10.963315
round_time_total       0 days 00:06:47.618932
loss_total                         899.306399
loss_critic                       1197.914497
loss_actor                        -295.126079
memory_size                       702284.6305 

=== epoch 8/10 ===== round 47/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:29,  5.12it/s]
100%|██████████| 2000/2000 [06:49<00:00,  4.89it/s]
episodes                                   56
episode_length                     161.571429
returns                            -57.533426
return_std                         118.400243
average_reward                      -0.355258
round_time             0 days 00:06:49.615152
episodes_test                            15.0
episode_length_test                     622.0
returns_test                       493.192974
return_std_test                    317.001709
average_reward_test                  0.783806
round_time_test        0 days 00:00:10.825570
round_time_total       0 days 00:06:49.616404
loss_total                         902.673713
loss_critic                       1202.149531
loss_actor                        -295.229637
memory_size                        703971.422 

=== epoch 8/10 ===== round 48/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<07:52,  4.22it/s]
100%|██████████| 2000/2000 [06:46<00:00,  4.92it/s]
episodes                                   62
episode_length                     149.225806
returns                            -48.973152
return_std                         103.057646
average_reward                      -0.333265
round_time             0 days 00:06:46.998107
episodes_test                            10.0
episode_length_test                     946.8
returns_test                       755.967724
return_std_test                    121.337189
average_reward_test                  0.803696
round_time_test        0 days 00:00:10.872348
round_time_total       0 days 00:06:46.999206
loss_total                         903.626214
loss_critic                       1203.463611
loss_actor                        -295.723458
memory_size                        705848.517 

=== epoch 8/10 ===== round 49/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<07:22,  4.50it/s]
100%|██████████| 2000/2000 [06:47<00:00,  4.91it/s]
episodes                                   51
episode_length                     167.313725
returns                            -57.424501
return_std                         114.019686
average_reward                      -0.347098
round_time             0 days 00:06:47.790169
episodes_test                            13.0
episode_length_test                750.461538
returns_test                       502.666834
return_std_test                    273.706103
average_reward_test                  0.671604
round_time_test        0 days 00:00:10.888800
round_time_total       0 days 00:06:47.791639
loss_total                         905.970001
loss_critic                       1206.685874
loss_actor                        -296.893574
memory_size                        707731.759 

=== epoch 8/10 ===== round 50/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 7/2000 [00:01<07:47,  4.27it/s]
100%|██████████| 2000/2000 [06:47<00:00,  4.91it/s]
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
episodes                                   57
episode_length                     173.473684
returns                            -58.159237
return_std                         116.858077
average_reward                      -0.333196
round_time             0 days 00:06:48.114831
episodes_test                            11.0
episode_length_test                846.272727
returns_test                       523.087634
return_std_test                     218.40314
average_reward_test                  0.616099
round_time_test        0 days 00:00:10.822508
round_time_total       0 days 00:06:48.115933
loss_total                         916.083518
loss_critic                       1219.410232
loss_actor                        -297.223422
memory_size                        709640.772 


<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
=== epoch 9/10 ===== round 1/50 ======================================
  1%|          | 12/2000 [00:02<06:16,  5.28it/s]
100%|██████████| 2000/2000 [06:04<00:00,  5.49it/s]
episodes                                   30
episode_length                      50.333333
returns                             -4.149707
return_std                          22.154411
average_reward                      -0.143382
round_time             0 days 00:06:04.556134
episodes_test                            12.0
episode_length_test                824.666667
returns_test                       548.273913
return_std_test                    258.365051
average_reward_test                  0.669771
round_time_test        0 days 00:00:11.050199
round_time_total       0 days 00:06:04.557238
loss_total                         916.554922
loss_critic                       1219.928593
loss_actor                        -296.939852
memory_size                       711151.4805 

=== epoch 9/10 ===== round 2/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 9/2000 [00:01<06:19,  5.25it/s]
100%|██████████| 2000/2000 [06:02<00:00,  5.51it/s]
episodes                                   40
episode_length                         99.975
returns                            -22.320458
return_std                           78.47732
average_reward                      -0.223291
round_time             0 days 00:06:03.332554
episodes_test                            11.0
episode_length_test                837.909091
returns_test                        591.74554
return_std_test                    218.066361
average_reward_test                  0.704053
round_time_test        0 days 00:00:10.838157
round_time_total       0 days 00:06:03.333650
loss_total                         906.654265
loss_critic                       1207.387948
loss_actor                        -296.280551
memory_size                        712917.124 

=== epoch 9/10 ===== round 3/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 9/2000 [00:01<05:58,  5.55it/s]
100%|██████████| 2000/2000 [06:08<00:00,  5.42it/s]
episodes                                   44
episode_length                     119.431818
returns                            -32.350865
return_std                          96.396568
average_reward                      -0.282435
round_time             0 days 00:06:09.385884
episodes_test                            13.0
episode_length_test                746.307692
returns_test                       429.457442
return_std_test                     236.72382
average_reward_test                  0.585272
round_time_test        0 days 00:00:10.918879
round_time_total       0 days 00:06:09.387365
loss_total                         924.763338
loss_critic                       1229.782433
loss_actor                        -295.313124
memory_size                       714738.2415 

=== epoch 9/10 ===== round 4/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 9/2000 [00:01<06:46,  4.90it/s]
100%|██████████| 2000/2000 [06:12<00:00,  5.37it/s]
episodes                                   48
episode_length                         153.75
returns                            -46.144495
return_std                         115.823497
average_reward                      -0.295899
round_time             0 days 00:06:12.824796
episodes_test                            13.0
episode_length_test                745.384615
returns_test                       502.382336
return_std_test                    279.655073
average_reward_test                  0.677274
round_time_test        0 days 00:00:10.972469
round_time_total       0 days 00:06:12.825878
loss_total                         911.747732
loss_critic                       1213.690934
loss_actor                        -296.025163
memory_size                       716680.1335 

=== epoch 9/10 ===== round 5/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:21,  5.23it/s]
100%|██████████| 2000/2000 [06:10<00:00,  5.39it/s]
episodes                                   68
episode_length                     141.014706
returns                            -42.685171
return_std                         102.959969
average_reward                       -0.30321
round_time             0 days 00:06:11.498617
episodes_test                            13.0
episode_length_test                753.615385
returns_test                       514.771129
return_std_test                    233.486236
average_reward_test                  0.688493
round_time_test        0 days 00:00:10.959306
round_time_total       0 days 00:06:11.499728
loss_total                         906.341245
loss_critic                       1206.951991
loss_actor                         -296.10182
memory_size                       718478.0255 

=== epoch 9/10 ===== round 6/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:19,  5.25it/s]
100%|██████████| 2000/2000 [06:14<00:00,  5.35it/s]
episodes                                   45
episode_length                     208.933333
returns                             -69.24239
return_std                          127.51511
average_reward                       -0.32649
round_time             0 days 00:06:14.636341
episodes_test                            14.0
episode_length_test                690.214286
returns_test                       380.908681
return_std_test                     239.54427
average_reward_test                   0.55918
round_time_test        0 days 00:00:10.624314
round_time_total       0 days 00:06:14.637578
loss_total                         914.915301
loss_critic                         1217.6796
loss_actor                        -296.141968
memory_size                        720245.564 

=== epoch 9/10 ===== round 7/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:49,  4.86it/s]
100%|██████████| 2000/2000 [06:15<00:00,  5.33it/s]
episodes                                   57
episode_length                     171.877193
returns                             -56.84679
return_std                         120.488765
average_reward                      -0.328637
round_time             0 days 00:06:15.861682
episodes_test                            18.0
episode_length_test                537.666667
returns_test                       377.038326
return_std_test                    263.245777
average_reward_test                  0.708188
round_time_test        0 days 00:00:10.887121
round_time_total       0 days 00:06:15.862775
loss_total                         906.409986
loss_critic                       1206.978408
loss_actor                        -295.863788
memory_size                       722038.0975 

=== epoch 9/10 ===== round 8/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
  0%|          | 8/2000 [00:01<06:32,  5.08it/s]
100%|██████████| 2000/2000 [06:16<00:00,  5.31it/s]
episodes                                   59
episode_length                     152.084746
returns                            -48.859258
return_std                          110.63314
average_reward                      -0.326631
round_time             0 days 00:06:17.447767
episodes_test                            16.0
episode_length_test                  576.4375
returns_test                       442.306261
return_std_test                    296.392206
average_reward_test                  0.758001
round_time_test        0 days 00:00:10.825732
round_time_total       0 days 00:06:17.448861
loss_total                         911.837381
loss_critic                       1213.813562
loss_actor                        -296.067429
memory_size                        723756.301 

=== epoch 9/10 ===== round 9/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:18<00:00,  5.29it/s]
episodes                                   65
episode_length                     143.523077
returns                            -48.011962
return_std                         110.098342
average_reward                      -0.333365
round_time             0 days 00:06:18.536334
episodes_test                            14.0
episode_length_test                667.428571
returns_test                       433.633531
return_std_test                    216.979285
average_reward_test                  0.653429
round_time_test        0 days 00:00:10.849513
round_time_total       0 days 00:06:18.537469
loss_total                         921.983977
loss_critic                       1226.440112
loss_actor                        -295.840653
memory_size                        725618.198 

=== epoch 9/10 ===== round 10/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:18<00:00,  5.28it/s]
episodes                                   50
episode_length                          185.8
returns                             -63.27583
return_std                         134.115096
average_reward                      -0.342333
round_time             0 days 00:06:19.517356
episodes_test                            17.0
episode_length_test                582.294118
returns_test                       460.829864
return_std_test                    308.246683
average_reward_test                   0.79048
round_time_test        0 days 00:00:10.704346
round_time_total       0 days 00:06:19.518637
loss_total                          913.15632
loss_critic                       1215.379327
loss_actor                        -295.735789
memory_size                        727494.028 

=== epoch 9/10 ===== round 11/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:20<00:00,  5.26it/s]
episodes                                   69
episode_length                     144.289855
returns                            -45.906521
return_std                         112.761126
average_reward                      -0.316433
round_time             0 days 00:06:21.076030
episodes_test                            17.0
episode_length_test                585.058824
returns_test                       386.216233
return_std_test                    283.970604
average_reward_test                  0.661348
round_time_test        0 days 00:00:10.786057
round_time_total       0 days 00:06:21.077162
loss_total                         929.035739
loss_critic                       1235.297746
loss_actor                        -296.012372
memory_size                        729250.722 

=== epoch 9/10 ===== round 12/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:22<00:00,  5.23it/s]
episodes                                   64
episode_length                      152.21875
returns                            -48.124512
return_std                         109.977341
average_reward                      -0.313017
round_time             0 days 00:06:23.179206
episodes_test                            10.0
episode_length_test                     935.7
returns_test                        598.70979
return_std_test                    126.115839
average_reward_test                  0.664276
round_time_test        0 days 00:00:10.650426
round_time_total       0 days 00:06:23.180344
loss_total                         917.648867
loss_critic                       1221.120261
loss_actor                          -296.2368
memory_size                        730885.412 

=== epoch 9/10 ===== round 13/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:21<00:00,  5.25it/s]
episodes                                   66
episode_length                     134.742424
returns                             -40.01058
return_std                          98.260432
average_reward                      -0.305428
round_time             0 days 00:06:21.903760
episodes_test                            12.0
episode_length_test                833.083333
returns_test                       557.277223
return_std_test                    227.876384
average_reward_test                  0.668591
round_time_test        0 days 00:00:10.727363
round_time_total       0 days 00:06:21.905004
loss_total                         907.473185
loss_critic                       1208.521161
loss_actor                        -296.718804
memory_size                       732636.0895 

=== epoch 9/10 ===== round 14/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:25<00:00,  5.19it/s]
episodes                                   63
episode_length                     136.539683
returns                            -41.393643
return_std                         101.703555
average_reward                      -0.306689
round_time             0 days 00:06:26.167438
episodes_test                            14.0
episode_length_test                684.785714
returns_test                       558.727744
return_std_test                     268.96062
average_reward_test                  0.820096
round_time_test        0 days 00:00:10.872377
round_time_total       0 days 00:06:26.168595
loss_total                         905.750902
loss_critic                       1206.485443
loss_actor                        -297.187355
memory_size                        734515.276 

=== epoch 9/10 ===== round 15/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:26<00:00,  5.17it/s]
episodes                                   76
episode_length                     123.092105
returns                            -33.459322
return_std                          87.745938
average_reward                      -0.279052
round_time             0 days 00:06:27.320675
episodes_test                            17.0
episode_length_test                566.823529
returns_test                       408.656909
return_std_test                    309.120131
average_reward_test                  0.731055
round_time_test        0 days 00:00:10.972635
round_time_total       0 days 00:06:27.321782
loss_total                         915.217896
loss_critic                       1218.248654
loss_actor                        -296.905231
memory_size                        736288.871 

=== epoch 9/10 ===== round 16/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:26<00:00,  5.17it/s]
episodes                                   54
episode_length                     183.425926
returns                            -57.785134
return_std                         113.530212
average_reward                      -0.307607
round_time             0 days 00:06:27.159298
episodes_test                            11.0
episode_length_test                852.272727
returns_test                       679.851903
return_std_test                    248.656291
average_reward_test                  0.792793
round_time_test        0 days 00:00:10.812160
round_time_total       0 days 00:06:27.160386
loss_total                         906.797207
loss_critic                       1207.757292
loss_actor                        -297.043208
memory_size                       738119.8005 

=== epoch 9/10 ===== round 17/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:27<00:00,  5.17it/s]
episodes                                   40
episode_length                        230.325
returns                            -71.538103
return_std                         127.551843
average_reward                      -0.314546
round_time             0 days 00:06:27.633911
episodes_test                            14.0
episode_length_test                691.285714
returns_test                       486.541883
return_std_test                    294.404291
average_reward_test                  0.709248
round_time_test        0 days 00:00:10.791325
round_time_total       0 days 00:06:27.635017
loss_total                         907.113142
loss_critic                       1208.163335
loss_actor                        -297.087714
memory_size                        740041.952 

=== epoch 9/10 ===== round 18/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:26<00:00,  5.17it/s]
episodes                                   51
episode_length                     193.607843
returns                            -55.255524
return_std                         113.432545
average_reward                      -0.287456
round_time             0 days 00:06:27.300475
episodes_test                            13.0
episode_length_test                742.461538
returns_test                       522.349243
return_std_test                    277.848367
average_reward_test                  0.706409
round_time_test        0 days 00:00:10.914604
round_time_total       0 days 00:06:27.301888
loss_total                         915.909761
loss_critic                       1219.171189
loss_actor                        -297.136035
memory_size                        741842.935 

=== epoch 9/10 ===== round 19/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:27<00:00,  5.16it/s]
episodes                                   54
episode_length                     170.462963
returns                            -45.594396
return_std                         101.950202
average_reward                      -0.272509
round_time             0 days 00:06:28.122086
episodes_test                            12.0
episode_length_test                767.833333
returns_test                       535.815559
return_std_test                    247.589197
average_reward_test                  0.714869
round_time_test        0 days 00:00:10.999768
round_time_total       0 days 00:06:28.123365
loss_total                         936.675712
loss_critic                       1244.937065
loss_actor                         -296.36979
memory_size                        743582.466 

=== epoch 9/10 ===== round 20/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:29<00:00,  5.13it/s]
episodes                                   40
episode_length                          227.4
returns                            -66.660767
return_std                         131.063877
average_reward                      -0.296454
round_time             0 days 00:06:30.466482
episodes_test                            14.0
episode_length_test                     700.0
returns_test                       436.804459
return_std_test                    235.323338
average_reward_test                  0.628777
round_time_test        0 days 00:00:10.906944
round_time_total       0 days 00:06:30.467576
loss_total                         932.273834
loss_critic                        1239.69623
loss_actor                        -297.415847
memory_size                        745486.245 

=== epoch 9/10 ===== round 21/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:30<00:00,  5.13it/s]
episodes                                   47
episode_length                     192.021277
returns                            -57.455895
return_std                         127.659954
average_reward                      -0.306494
round_time             0 days 00:06:30.736850
episodes_test                            12.0
episode_length_test                756.083333
returns_test                       655.656635
return_std_test                    299.396703
average_reward_test                  0.857541
round_time_test        0 days 00:00:10.895768
round_time_total       0 days 00:06:30.737973
loss_total                         933.754164
loss_critic                       1241.686248
loss_actor                        -297.974257
memory_size                        747343.681 

=== epoch 9/10 ===== round 22/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:31<00:00,  5.11it/s]
episodes                                   49
episode_length                     185.938776
returns                            -56.027854
return_std                         126.659684
average_reward                      -0.303495
round_time             0 days 00:06:32.124756
episodes_test                            14.0
episode_length_test                708.785714
returns_test                       478.695783
return_std_test                     210.00588
average_reward_test                  0.679963
round_time_test        0 days 00:00:10.680068
round_time_total       0 days 00:06:32.126101
loss_total                         917.230348
loss_critic                       1221.236274
loss_actor                        -298.793439
memory_size                       749218.8885 

=== epoch 9/10 ===== round 23/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:30<00:00,  5.13it/s]
episodes                                   55
episode_length                     179.109091
returns                            -54.056192
return_std                         121.606389
average_reward                      -0.304904
round_time             0 days 00:06:30.602547
episodes_test                            14.0
episode_length_test                704.785714
returns_test                       594.802631
return_std_test                    349.229524
average_reward_test                  0.847261
round_time_test        0 days 00:00:10.810052
round_time_total       0 days 00:06:30.603665
loss_total                         931.053296
loss_critic                       1238.539826
loss_actor                        -298.892918
memory_size                        750979.997 

=== epoch 9/10 ===== round 24/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:33<00:00,  5.09it/s]
episodes                                   50
episode_length                         174.56
returns                            -52.798846
return_std                         121.922162
average_reward                      -0.308912
round_time             0 days 00:06:33.697269
episodes_test                            14.0
episode_length_test                662.214286
returns_test                       465.767487
return_std_test                     310.12538
average_reward_test                  0.702963
round_time_test        0 days 00:00:10.698031
round_time_total       0 days 00:06:33.698343
loss_total                         936.104948
loss_critic                       1244.769678
loss_actor                        -298.554049
memory_size                       752711.3025 

=== epoch 9/10 ===== round 25/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:33<00:00,  5.08it/s]
episodes                                   51
episode_length                     171.901961
returns                            -51.171791
return_std                         116.488075
average_reward                      -0.299407
round_time             0 days 00:06:34.290709
episodes_test                            12.0
episode_length_test                786.916667
returns_test                       570.801657
return_std_test                    226.735337
average_reward_test                  0.724527
round_time_test        0 days 00:00:10.629150
round_time_total       0 days 00:06:34.291842
loss_total                         931.674843
loss_critic                       1239.148542
loss_actor                        -298.220037
memory_size                       754625.2125 

=== epoch 9/10 ===== round 26/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:31<00:00,  5.11it/s]
episodes                                   51
episode_length                     158.764706
returns                            -46.384184
return_std                         105.816763
average_reward                      -0.305189
round_time             0 days 00:06:32.095249
episodes_test                            14.0
episode_length_test                     657.0
returns_test                       492.779541
return_std_test                     308.99829
average_reward_test                  0.748768
round_time_test        0 days 00:00:10.866506
round_time_total       0 days 00:06:32.096716
loss_total                         935.070417
loss_critic                       1243.505602
loss_actor                        -298.670412
memory_size                       756459.0275 

=== epoch 9/10 ===== round 27/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:38<00:00,  5.02it/s]
episodes                                   56
episode_length                     167.535714
returns                            -50.760384
return_std                         111.748876
average_reward                      -0.303478
round_time             0 days 00:06:39.096864
episodes_test                            16.0
episode_length_test                   607.125
returns_test                        431.18727
return_std_test                    294.042918
average_reward_test                   0.71522
round_time_test        0 days 00:00:11.031180
round_time_total       0 days 00:06:39.097977
loss_total                         935.149148
loss_critic                       1243.693593
loss_actor                         -299.02872
memory_size                        758316.012 

=== epoch 9/10 ===== round 28/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:36<00:00,  5.04it/s]
episodes                                   43
episode_length                     216.790698
returns                            -68.250214
return_std                         131.711051
average_reward                      -0.315975
round_time             0 days 00:06:37.615740
episodes_test                            13.0
episode_length_test                     767.0
returns_test                       530.300194
return_std_test                     274.82313
average_reward_test                  0.690682
round_time_test        0 days 00:00:10.617840
round_time_total       0 days 00:06:37.616842
loss_total                         922.389481
loss_critic                       1227.777559
loss_actor                         -299.16291
memory_size                        760104.073 

=== epoch 9/10 ===== round 29/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:35<00:00,  5.05it/s]
episodes                                   48
episode_length                     200.833333
returns                            -62.471541
return_std                          120.89443
average_reward                      -0.317807
round_time             0 days 00:06:36.447436
episodes_test                            13.0
episode_length_test                759.846154
returns_test                       496.322737
return_std_test                    254.762774
average_reward_test                  0.657981
round_time_test        0 days 00:00:10.762029
round_time_total       0 days 00:06:36.448558
loss_total                         927.179625
loss_critic                        1233.55386
loss_actor                        -298.317395
memory_size                       761903.5795 

=== epoch 9/10 ===== round 30/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:37<00:00,  5.03it/s]
episodes                                   47
episode_length                          204.0
returns                            -67.004687
return_std                           126.6475
average_reward                      -0.331312
round_time             0 days 00:06:37.891733
episodes_test                            10.0
episode_length_test                     926.0
returns_test                       647.005776
return_std_test                    250.694083
average_reward_test                  0.717735
round_time_test        0 days 00:00:11.104566
round_time_total       0 days 00:06:37.892891
loss_total                         927.038271
loss_critic                       1233.609673
loss_actor                        -299.247412
memory_size                       763816.6905 

=== epoch 9/10 ===== round 31/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:33<00:00,  5.08it/s]
episodes                                   52
episode_length                     190.826923
returns                            -60.815422
return_std                         119.996945
average_reward                      -0.318791
round_time             0 days 00:06:34.132978
episodes_test                            17.0
episode_length_test                539.529412
returns_test                       447.222076
return_std_test                    326.077665
average_reward_test                  0.813746
round_time_test        0 days 00:00:10.901502
round_time_total       0 days 00:06:34.134106
loss_total                         924.499796
loss_critic                       1230.550141
loss_actor                        -299.701668
memory_size                       765576.0695 

=== epoch 9/10 ===== round 32/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:35<00:00,  5.06it/s]
episodes                                   57
episode_length                     160.298246
returns                            -48.704351
return_std                         108.540956
average_reward                      -0.309007
round_time             0 days 00:06:35.882450
episodes_test                            14.0
episode_length_test                689.357143
returns_test                       506.499235
return_std_test                    310.614168
average_reward_test                  0.735647
round_time_test        0 days 00:00:10.932481
round_time_total       0 days 00:06:35.883544
loss_total                         936.714738
loss_critic                       1245.770473
loss_actor                        -299.508283
memory_size                        767337.499 

=== epoch 9/10 ===== round 33/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:38<00:00,  5.02it/s]
episodes                                   55
episode_length                          163.2
returns                            -49.242756
return_std                         111.745637
average_reward                      -0.306069
round_time             0 days 00:06:38.966219
episodes_test                            21.0
episode_length_test                 446.47619
returns_test                       339.021551
return_std_test                    328.183422
average_reward_test                   0.74702
round_time_test        0 days 00:00:10.723957
round_time_total       0 days 00:06:38.967395
loss_total                         935.821233
loss_critic                       1245.023724
loss_actor                        -300.988818
memory_size                        769160.739 

=== epoch 9/10 ===== round 34/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:37<00:00,  5.03it/s]
episodes                                   65
episode_length                     139.338462
returns                             -40.67016
return_std                         103.864752
average_reward                      -0.300716
round_time             0 days 00:06:38.233351
episodes_test                            15.0
episode_length_test                     619.2
returns_test                       444.334523
return_std_test                    342.838892
average_reward_test                  0.732206
round_time_test        0 days 00:00:10.486945
round_time_total       0 days 00:06:38.234496
loss_total                         930.614451
loss_critic                       1238.293891
loss_actor                         -300.10339
memory_size                        770939.916 

=== epoch 9/10 ===== round 35/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:34<00:00,  5.07it/s]
episodes                                   66
episode_length                     139.045455
returns                            -40.434705
return_std                         102.805867
average_reward                      -0.300622
round_time             0 days 00:06:34.824178
episodes_test                            13.0
episode_length_test                738.923077
returns_test                       536.465023
return_std_test                    308.621425
average_reward_test                  0.746805
round_time_test        0 days 00:00:10.983910
round_time_total       0 days 00:06:34.825300
loss_total                         920.271097
loss_critic                        1225.25185
loss_actor                        -299.651999
memory_size                        772707.222 

=== epoch 9/10 ===== round 36/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:36<00:00,  5.05it/s]
episodes                                   67
episode_length                     138.940299
returns                            -39.713553
return_std                         104.962193
average_reward                      -0.291004
round_time             0 days 00:06:36.796314
episodes_test                            22.0
episode_length_test                443.409091
returns_test                       306.856399
return_std_test                    307.866715
average_reward_test                  0.690379
round_time_test        0 days 00:00:10.742385
round_time_total       0 days 00:06:36.797419
loss_total                          925.09482
loss_critic                       1231.396524
loss_actor                         -300.11208
memory_size                       774472.0785 

=== epoch 9/10 ===== round 37/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:37<00:00,  5.04it/s]
episodes                                   70
episode_length                     123.842857
returns                            -35.640433
return_std                          98.962364
average_reward                      -0.293695
round_time             0 days 00:06:37.735404
episodes_test                            11.0
episode_length_test                861.181818
returns_test                       702.530146
return_std_test                    248.870171
average_reward_test                  0.803876
round_time_test        0 days 00:00:10.758907
round_time_total       0 days 00:06:37.736568
loss_total                         909.572636
loss_critic                       1212.023849
loss_actor                        -300.232298
memory_size                        776191.186 

=== epoch 9/10 ===== round 38/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:38<00:00,  5.02it/s]
episodes                                   66
episode_length                     140.121212
returns                            -43.444805
return_std                         106.283414
average_reward                       -0.32072
round_time             0 days 00:06:38.790345
episodes_test                            16.0
episode_length_test                   615.625
returns_test                       436.901862
return_std_test                    301.383025
average_reward_test                  0.715545
round_time_test        0 days 00:00:10.780327
round_time_total       0 days 00:06:38.791943
loss_total                         923.361117
loss_critic                       1229.185653
loss_actor                        -299.937114
memory_size                       777994.5865 

=== epoch 9/10 ===== round 39/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.00it/s]
episodes                                   51
episode_length                     174.196078
returns                            -55.692577
return_std                         119.339209
average_reward                      -0.337568
round_time             0 days 00:06:40.504957
episodes_test                            12.0
episode_length_test                795.833333
returns_test                       649.059988
return_std_test                    317.174901
average_reward_test                  0.817513
round_time_test        0 days 00:00:10.846880
round_time_total       0 days 00:06:40.506283
loss_total                         936.818279
loss_critic                       1246.042453
loss_actor                        -300.078499
memory_size                       779929.4575 

=== epoch 9/10 ===== round 40/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:40<00:00,  5.00it/s]
episodes                                   60
episode_length                     163.116667
returns                            -53.669266
return_std                         114.496145
average_reward                      -0.329522
round_time             0 days 00:06:40.587785
episodes_test                            16.0
episode_length_test                  572.1875
returns_test                       465.067229
return_std_test                    335.692743
average_reward_test                  0.826631
round_time_test        0 days 00:00:10.706866
round_time_total       0 days 00:06:40.589148
loss_total                         920.916347
loss_critic                       1226.107011
loss_actor                        -299.846397
memory_size                       781718.1865 

=== epoch 9/10 ===== round 41/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:38<00:00,  5.01it/s]
episodes                                   57
episode_length                     153.070175
returns                            -51.885448
return_std                         107.038603
average_reward                      -0.346391
round_time             0 days 00:06:39.444914
episodes_test                            12.0
episode_length_test                    827.25
returns_test                       581.749083
return_std_test                    227.154036
average_reward_test                  0.703931
round_time_test        0 days 00:00:10.875523
round_time_total       0 days 00:06:39.446226
loss_total                         926.494547
loss_critic                       1233.032254
loss_actor                        -299.656368
memory_size                       783445.8995 

=== epoch 9/10 ===== round 42/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:41<00:00,  4.98it/s]
episodes                                   48
episode_length                     188.270833
returns                             -67.78092
return_std                         123.801643
average_reward                      -0.363767
round_time             0 days 00:06:42.424726
episodes_test                            14.0
episode_length_test                656.857143
returns_test                       458.997633
return_std_test                    329.552492
average_reward_test                  0.706552
round_time_test        0 days 00:00:10.959306
round_time_total       0 days 00:06:42.426028
loss_total                         935.391329
loss_critic                       1244.081565
loss_actor                        -299.369704
memory_size                        785253.279 

=== epoch 9/10 ===== round 43/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:38<00:00,  5.02it/s]
episodes                                   49
episode_length                     186.816327
returns                             -63.01036
return_std                          121.85449
average_reward                      -0.336391
round_time             0 days 00:06:38.923652
episodes_test                            15.0
episode_length_test                653.266667
returns_test                       495.978939
return_std_test                    300.221956
average_reward_test                  0.766001
round_time_test        0 days 00:00:10.687925
round_time_total       0 days 00:06:38.924913
loss_total                          927.88976
loss_critic                       1234.774722
loss_actor                        -299.650174
memory_size                       787153.9285 

=== epoch 9/10 ===== round 44/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.00it/s]
episodes                                   57
episode_length                     166.140351
returns                             -50.30132
return_std                         114.212655
average_reward                      -0.313015
round_time             0 days 00:06:40.516104
episodes_test                            13.0
episode_length_test                704.384615
returns_test                       460.626537
return_std_test                    274.353495
average_reward_test                  0.675536
round_time_test        0 days 00:00:10.759647
round_time_total       0 days 00:06:40.517221
loss_total                         923.032895
loss_critic                       1228.966147
loss_actor                        -300.700197
memory_size                        789000.002 

=== epoch 9/10 ===== round 45/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:40<00:00,  4.99it/s]
episodes                                   53
episode_length                     175.509434
returns                            -54.915478
return_std                         117.161197
average_reward                      -0.319602
round_time             0 days 00:06:41.498304
episodes_test                            12.0
episode_length_test                784.416667
returns_test                       684.174775
return_std_test                    309.281942
average_reward_test                  0.869004
round_time_test        0 days 00:00:10.797640
round_time_total       0 days 00:06:41.499397
loss_total                         914.523265
loss_critic                       1218.319561
loss_actor                        -300.662001
memory_size                        790789.525 

=== epoch 9/10 ===== round 46/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.01it/s]
episodes                                   42
episode_length                     225.738095
returns                            -71.748355
return_std                         138.264309
average_reward                      -0.321045
round_time             0 days 00:06:39.637048
episodes_test                            14.0
episode_length_test                710.571429
returns_test                        482.49065
return_std_test                    290.634186
average_reward_test                  0.677898
round_time_test        0 days 00:00:10.487132
round_time_total       0 days 00:06:39.638146
loss_total                         933.912342
loss_critic                       1242.419829
loss_actor                        -300.117693
memory_size                       792666.2385 

=== epoch 9/10 ===== round 47/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:42<00:00,  4.97it/s]
episodes                                   36
episode_length                         254.25
returns                            -81.279752
return_std                         146.080453
average_reward                      -0.320474
round_time             0 days 00:06:42.748767
episodes_test                            13.0
episode_length_test                     709.0
returns_test                       584.475059
return_std_test                     365.44251
average_reward_test                   0.81729
round_time_test        0 days 00:00:10.761488
round_time_total       0 days 00:06:42.749865
loss_total                          920.49448
loss_critic                       1225.768762
loss_actor                         -300.60273
memory_size                        794611.443 

=== epoch 9/10 ===== round 48/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:43<00:00,  4.96it/s]
episodes                                   33
episode_length                     270.454545
returns                            -87.484429
return_std                         148.222896
average_reward                       -0.32616
round_time             0 days 00:06:43.993417
episodes_test                            16.0
episode_length_test                  615.6875
returns_test                       466.247133
return_std_test                    269.276284
average_reward_test                   0.76334
round_time_test        0 days 00:00:10.676427
round_time_total       0 days 00:06:43.994534
loss_total                         940.743897
loss_critic                       1251.129362
loss_actor                        -300.798055
memory_size                       796557.1335 

=== epoch 9/10 ===== round 49/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:43<00:00,  4.96it/s]
episodes                                   26
episode_length                     327.730769
returns                           -113.901972
return_std                         152.315189
average_reward                      -0.345128
round_time             0 days 00:06:44.022529
episodes_test                            18.0
episode_length_test                553.555556
returns_test                       459.398263
return_std_test                    359.351658
average_reward_test                  0.829283
round_time_test        0 days 00:00:10.831927
round_time_total       0 days 00:06:44.023846
loss_total                         945.427198
loss_critic                       1256.965767
loss_actor                        -300.727162
memory_size                        798457.776 

=== epoch 9/10 ===== round 50/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:41<00:00,  4.98it/s]
episodes                                   30
episode_length                     308.866667
returns                            -99.922755
return_std                         149.907792
average_reward                      -0.322537
round_time             0 days 00:06:42.475657
episodes_test                            20.0
episode_length_test                    454.65
returns_test                        388.26676
return_std_test                    386.278689
average_reward_test                  0.839537
round_time_test        0 days 00:00:10.586461
round_time_total       0 days 00:06:42.476764
loss_total                         929.258649
loss_critic                       1236.893303
loss_actor                        -301.280055
memory_size                       800290.0805 


<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
=== epoch 10/10 ==== round 1/50 ======================================
100%|██████████| 2000/2000 [05:59<00:00,  5.57it/s]
episodes                                    3
episode_length                     351.333333
returns                           -142.475718
return_std                         189.417187
average_reward                      -0.418558
round_time             0 days 00:05:59.145661
episodes_test                            14.0
episode_length_test                702.071429
returns_test                       522.237585
return_std_test                    288.274021
average_reward_test                  0.742272
round_time_test        0 days 00:00:10.745561
round_time_total       0 days 00:05:59.146777
loss_total                         909.495972
loss_critic                       1212.206249
loss_actor                        -301.345214
memory_size                       802137.1675 

=== epoch 10/10 ==== round 2/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:02<00:00,  5.52it/s]
episodes                                   16
episode_length                         221.75
returns                            -78.885693
return_std                         157.262915
average_reward                      -0.365406
round_time             0 days 00:06:02.634429
episodes_test                            16.0
episode_length_test                     619.0
returns_test                       533.789075
return_std_test                    333.958245
average_reward_test                  0.865912
round_time_test        0 days 00:00:10.766354
round_time_total       0 days 00:06:02.635722
loss_total                         921.116877
loss_critic                       1226.771066
loss_actor                        -301.499962
memory_size                         803965.03 

=== epoch 10/10 ==== round 3/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:04<00:00,  5.49it/s]
episodes                                   34
episode_length                     175.823529
returns                            -60.452239
return_std                         129.810554
average_reward                       -0.34296
round_time             0 days 00:06:04.635735
episodes_test                            17.0
episode_length_test                583.882353
returns_test                       400.269503
return_std_test                    231.436315
average_reward_test                  0.687228
round_time_test        0 days 00:00:10.684918
round_time_total       0 days 00:06:04.637019
loss_total                         923.985254
loss_critic                       1230.378995
loss_actor                        -301.589793
memory_size                        805795.665 

=== epoch 10/10 ==== round 4/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:06<00:00,  5.46it/s]
episodes                                   51
episode_length                     145.392157
returns                            -42.991147
return_std                         109.721109
average_reward                      -0.307406
round_time             0 days 00:06:06.872082
episodes_test                            12.0
episode_length_test                801.416667
returns_test                       607.187595
return_std_test                     260.30638
average_reward_test                  0.760906
round_time_test        0 days 00:00:10.895317
round_time_total       0 days 00:06:06.873391
loss_total                         916.449673
loss_critic                       1220.961856
loss_actor                        -301.599137
memory_size                        807420.894 

=== epoch 10/10 ==== round 5/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:06<00:00,  5.46it/s]
episodes                                   64
episode_length                     142.828125
returns                            -42.813671
return_std                          110.34212
average_reward                      -0.298724
round_time             0 days 00:06:07.073914
episodes_test                            15.0
episode_length_test                631.933333
returns_test                       568.292216
return_std_test                    337.912879
average_reward_test                  0.900541
round_time_test        0 days 00:00:10.605611
round_time_total       0 days 00:06:07.075047
loss_total                          923.77519
loss_critic                       1230.032455
loss_actor                        -301.253958
memory_size                       809186.5105 

=== epoch 10/10 ==== round 6/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:06<00:00,  5.46it/s]
episodes                                   74
episode_length                     134.189189
returns                            -33.417954
return_std                          92.466501
average_reward                      -0.249704
round_time             0 days 00:06:06.910139
episodes_test                            10.0
episode_length_test                     960.1
returns_test                       690.357343
return_std_test                    117.161573
average_reward_test                  0.721007
round_time_test        0 days 00:00:10.706483
round_time_total       0 days 00:06:06.911278
loss_total                         912.148006
loss_critic                       1215.394398
loss_actor                        -300.837645
memory_size                       811027.7635 

=== epoch 10/10 ==== round 7/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:08<00:00,  5.43it/s]
episodes                                   72
episode_length                     128.277778
returns                            -28.536518
return_std                          73.906437
average_reward                      -0.241743
round_time             0 days 00:06:09.059807
episodes_test                            14.0
episode_length_test                662.142857
returns_test                       497.975689
return_std_test                    283.881242
average_reward_test                  0.765568
round_time_test        0 days 00:00:10.485957
round_time_total       0 days 00:06:09.060897
loss_total                         924.173347
loss_critic                        1230.49604
loss_actor                        -301.117511
memory_size                       812767.2035 

=== epoch 10/10 ==== round 8/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:08<00:00,  5.43it/s]
episodes                                   60
episode_length                          165.3
returns                            -42.933125
return_std                         103.206807
average_reward                       -0.25451
round_time             0 days 00:06:08.662350
episodes_test                            15.0
episode_length_test                     663.0
returns_test                       512.917129
return_std_test                    323.132251
average_reward_test                  0.774132
round_time_test        0 days 00:00:10.693323
round_time_total       0 days 00:06:08.663659
loss_total                          915.75415
loss_critic                       1220.015007
loss_actor                        -301.289362
memory_size                        814647.593 

=== epoch 10/10 ==== round 9/50 ======================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:10<00:00,  5.40it/s]
episodes                                   51
episode_length                     177.176471
returns                            -48.315917
return_std                         104.704353
average_reward                      -0.286554
round_time             0 days 00:06:10.599650
episodes_test                            13.0
episode_length_test                716.153846
returns_test                       543.573277
return_std_test                    288.522594
average_reward_test                  0.758796
round_time_test        0 days 00:00:10.484513
round_time_total       0 days 00:06:10.600738
loss_total                         917.394833
loss_critic                       1222.139871
loss_actor                        -301.585399
memory_size                        816477.695 

=== epoch 10/10 ==== round 10/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:12<00:00,  5.37it/s]
episodes                                   48
episode_length                     203.270833
returns                            -59.269874
return_std                         121.244205
average_reward                      -0.290964
round_time             0 days 00:06:12.907807
episodes_test                            11.0
episode_length_test                867.818182
returns_test                       682.429827
return_std_test                    260.150471
average_reward_test                   0.78725
round_time_test        0 days 00:00:10.805055
round_time_total       0 days 00:06:12.908893
loss_total                         918.297313
loss_critic                       1223.058934
loss_actor                        -300.749254
memory_size                       818329.2055 

=== epoch 10/10 ==== round 11/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:10<00:00,  5.40it/s]
episodes                                   48
episode_length                     199.166667
returns                            -62.423427
return_std                         121.073583
average_reward                      -0.313697
round_time             0 days 00:06:10.642955
episodes_test                            10.0
episode_length_test                     901.2
returns_test                       694.511672
return_std_test                    233.181493
average_reward_test                  0.755356
round_time_test        0 days 00:00:10.747958
round_time_total       0 days 00:06:10.644051
loss_total                         919.907326
loss_critic                       1225.289022
loss_actor                        -301.619536
memory_size                       820117.6515 

=== epoch 10/10 ==== round 12/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:16<00:00,  5.32it/s]
episodes                                   46
episode_length                     200.282609
returns                            -62.736498
return_std                         129.670029
average_reward                      -0.314453
round_time             0 days 00:06:16.572741
episodes_test                            15.0
episode_length_test                652.866667
returns_test                       493.700398
return_std_test                    299.101695
average_reward_test                  0.763462
round_time_test        0 days 00:00:10.295042
round_time_total       0 days 00:06:16.574014
loss_total                         910.038248
loss_critic                       1213.038687
loss_actor                        -301.963585
memory_size                       821976.8375 

=== epoch 10/10 ==== round 13/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:13<00:00,  5.35it/s]
episodes                                   56
episode_length                     172.160714
returns                            -51.150126
return_std                         119.342901
average_reward                      -0.294901
round_time             0 days 00:06:14.089296
episodes_test                            12.0
episode_length_test                824.583333
returns_test                       701.385806
return_std_test                    292.287882
average_reward_test                    0.8551
round_time_test        0 days 00:00:10.496797
round_time_total       0 days 00:06:14.090598
loss_total                         927.268625
loss_critic                       1234.721169
loss_actor                         -302.54164
memory_size                       823701.6755 

=== epoch 10/10 ==== round 14/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:16<00:00,  5.32it/s]
episodes                                   55
episode_length                     173.454545
returns                            -49.472483
return_std                         117.189898
average_reward                      -0.288551
round_time             0 days 00:06:16.602759
episodes_test                            11.0
episode_length_test                882.181818
returns_test                       706.445507
return_std_test                    249.102326
average_reward_test                  0.807069
round_time_test        0 days 00:00:10.567693
round_time_total       0 days 00:06:16.604085
loss_total                         907.937413
loss_critic                       1210.616292
loss_actor                         -302.77819
memory_size                        825566.542 

=== epoch 10/10 ==== round 15/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:18<00:00,  5.28it/s]
episodes                                   58
episode_length                      165.87931
returns                            -46.893209
return_std                         115.089183
average_reward                      -0.275672
round_time             0 days 00:06:19.568916
episodes_test                            12.0
episode_length_test                    787.25
returns_test                       542.841751
return_std_test                    247.353581
average_reward_test                   0.69535
round_time_test        0 days 00:00:10.718211
round_time_total       0 days 00:06:19.570256
loss_total                         910.185646
loss_critic                       1213.659154
loss_actor                        -303.708467
memory_size                       827419.2405 

=== epoch 10/10 ==== round 16/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:18<00:00,  5.28it/s]
episodes                                   56
episode_length                     167.482143
returns                            -44.003749
return_std                         110.294985
average_reward                      -0.268063
round_time             0 days 00:06:19.523005
episodes_test                            11.0
episode_length_test                843.636364
returns_test                       651.066462
return_std_test                    232.250887
average_reward_test                  0.765693
round_time_test        0 days 00:00:10.488965
round_time_total       0 days 00:06:19.524088
loss_total                         907.683013
loss_critic                       1210.578382
loss_actor                        -303.898546
memory_size                         829203.09 

=== epoch 10/10 ==== round 17/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:18<00:00,  5.28it/s]
episodes                                   60
episode_length                     160.733333
returns                            -44.312665
return_std                         107.525374
average_reward                      -0.277453
round_time             0 days 00:06:19.299126
episodes_test                            11.0
episode_length_test                860.363636
returns_test                        735.82352
return_std_test                    256.277609
average_reward_test                  0.849379
round_time_test        0 days 00:00:10.617597
round_time_total       0 days 00:06:19.300213
loss_total                         919.397256
loss_critic                       1225.123665
loss_actor                        -303.508454
memory_size                       830936.1805 

=== epoch 10/10 ==== round 18/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:17<00:00,  5.30it/s]
episodes                                   51
episode_length                     182.352941
returns                            -52.429303
return_std                         116.548382
average_reward                      -0.293651
round_time             0 days 00:06:18.150527
episodes_test                            13.0
episode_length_test                721.076923
returns_test                       530.722586
return_std_test                    258.027134
average_reward_test                  0.732521
round_time_test        0 days 00:00:10.630570
round_time_total       0 days 00:06:18.151602
loss_total                         924.161527
loss_critic                       1231.069819
loss_actor                        -303.471719
memory_size                        832830.692 

=== epoch 10/10 ==== round 19/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:21<00:00,  5.24it/s]
episodes                                   69
episode_length                     130.565217
returns                            -37.137034
return_std                          98.229678
average_reward                      -0.291954
round_time             0 days 00:06:22.409935
episodes_test                            11.0
episode_length_test                818.545455
returns_test                       638.529965
return_std_test                     273.55914
average_reward_test                  0.768926
round_time_test        0 days 00:00:10.697173
round_time_total       0 days 00:06:22.411220
loss_total                         923.759879
loss_critic                       1230.526899
loss_actor                        -303.308282
memory_size                       834545.2565 

=== epoch 10/10 ==== round 20/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:23<00:00,  5.22it/s]
episodes                                   68
episode_length                     134.926471
returns                            -38.214823
return_std                         100.201246
average_reward                      -0.286294
round_time             0 days 00:06:23.999143
episodes_test                            12.0
episode_length_test                    825.25
returns_test                       635.325152
return_std_test                    252.120773
average_reward_test                   0.77053
round_time_test        0 days 00:00:10.563981
round_time_total       0 days 00:06:24.000242
loss_total                         921.030067
loss_critic                       1227.292894
loss_actor                        -304.021333
memory_size                        836257.192 

=== epoch 10/10 ==== round 21/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:21<00:00,  5.24it/s]
episodes                                   63
episode_length                     143.349206
returns                            -43.834389
return_std                         107.239371
average_reward                       -0.31917
round_time             0 days 00:06:22.104609
episodes_test                            11.0
episode_length_test                839.454545
returns_test                       657.818903
return_std_test                    244.910559
average_reward_test                  0.793798
round_time_test        0 days 00:00:10.628049
round_time_total       0 days 00:06:22.105741
loss_total                         927.692617
loss_critic                        1235.55242
loss_actor                        -303.746679
memory_size                       838061.3795 

=== epoch 10/10 ==== round 22/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:24<00:00,  5.20it/s]
episodes                                   67
episode_length                     132.597015
returns                              -38.3926
return_std                          97.633795
average_reward                      -0.304636
round_time             0 days 00:06:24.985561
episodes_test                            13.0
episode_length_test                     720.0
returns_test                       571.149081
return_std_test                    271.565012
average_reward_test                  0.793795
round_time_test        0 days 00:00:10.463930
round_time_total       0 days 00:06:24.986654
loss_total                         933.348707
loss_critic                        1242.49558
loss_actor                        -303.238874
memory_size                       839848.3245 

=== epoch 10/10 ==== round 23/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:24<00:00,  5.20it/s]
episodes                                   69
episode_length                     144.217391
returns                            -43.218824
return_std                         104.148501
average_reward                      -0.301032
round_time             0 days 00:06:24.878843
episodes_test                            12.0
episode_length_test                807.416667
returns_test                       593.121259
return_std_test                    240.749853
average_reward_test                   0.74057
round_time_test        0 days 00:00:10.562103
round_time_total       0 days 00:06:24.880388
loss_total                         908.429081
loss_critic                       1211.418883
loss_actor                        -303.530209
memory_size                       841681.6535 

=== epoch 10/10 ==== round 24/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:25<00:00,  5.19it/s]
episodes                                   57
episode_length                     167.473684
returns                            -48.025633
return_std                         112.778505
average_reward                      -0.285858
round_time             0 days 00:06:26.093790
episodes_test                            13.0
episode_length_test                704.384615
returns_test                       552.480984
return_std_test                    333.477451
average_reward_test                  0.778736
round_time_test        0 days 00:00:10.586384
round_time_total       0 days 00:06:26.094894
loss_total                          918.32452
loss_critic                       1223.810798
loss_actor                         -303.62067
memory_size                       843462.6655 

=== epoch 10/10 ==== round 25/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:26<00:00,  5.18it/s]
episodes                                   63
episode_length                     143.380952
returns                            -43.334334
return_std                         103.734313
average_reward                       -0.31121
round_time             0 days 00:06:26.658847
episodes_test                            11.0
episode_length_test                848.090909
returns_test                       701.919749
return_std_test                    203.390977
average_reward_test                  0.846349
round_time_test        0 days 00:00:10.617951
round_time_total       0 days 00:06:26.659932
loss_total                          911.80927
loss_critic                       1215.671119
loss_actor                        -303.638211
memory_size                       845244.0925 

=== epoch 10/10 ==== round 26/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:25<00:00,  5.18it/s]
episodes                                   62
episode_length                     159.580645
returns                            -49.074831
return_std                         114.223643
average_reward                      -0.307043
round_time             0 days 00:06:26.554177
episodes_test                            11.0
episode_length_test                859.363636
returns_test                       764.192458
return_std_test                    248.781119
average_reward_test                  0.895542
round_time_test        0 days 00:00:10.520091
round_time_total       0 days 00:06:26.555261
loss_total                         918.310083
loss_critic                         1223.7977
loss_actor                        -303.640466
memory_size                        847005.592 

=== epoch 10/10 ==== round 27/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:29<00:00,  5.14it/s]
episodes                                   63
episode_length                     149.222222
returns                             -44.01401
return_std                         109.582393
average_reward                      -0.296337
round_time             0 days 00:06:29.668350
episodes_test                            13.0
episode_length_test                707.538462
returns_test                       577.161876
return_std_test                    322.920538
average_reward_test                   0.83026
round_time_test        0 days 00:00:10.656319
round_time_total       0 days 00:06:29.669485
loss_total                         925.699624
loss_critic                        1233.08606
loss_actor                          -303.8462
memory_size                        848743.503 

=== epoch 10/10 ==== round 28/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:27<00:00,  5.16it/s]
episodes                                   70
episode_length                     141.714286
returns                             -38.92811
return_std                         104.839316
average_reward                      -0.275252
round_time             0 days 00:06:28.433559
episodes_test                            16.0
episode_length_test                  585.6875
returns_test                       479.495526
return_std_test                    338.069006
average_reward_test                  0.811493
round_time_test        0 days 00:00:10.571110
round_time_total       0 days 00:06:28.434777
loss_total                         933.199467
loss_critic                       1242.513365
loss_actor                         -304.05621
memory_size                        850483.284 

=== epoch 10/10 ==== round 29/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:30<00:00,  5.12it/s]
episodes                                   66
episode_length                     136.333333
returns                            -33.493249
return_std                          99.383018
average_reward                      -0.259402
round_time             0 days 00:06:31.115134
episodes_test                            12.0
episode_length_test                    775.75
returns_test                       676.777852
return_std_test                    289.354802
average_reward_test                  0.874316
round_time_test        0 days 00:00:10.488175
round_time_total       0 days 00:06:31.116230
loss_total                         932.588347
loss_critic                       1241.907359
loss_actor                        -304.687785
memory_size                        852335.911 

=== epoch 10/10 ==== round 30/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:29<00:00,  5.13it/s]
episodes                                   54
episode_length                     166.481481
returns                            -41.300068
return_std                         113.548645
average_reward                      -0.257682
round_time             0 days 00:06:30.260893
episodes_test                            12.0
episode_length_test                     813.0
returns_test                       615.875879
return_std_test                    268.389224
average_reward_test                  0.756348
round_time_test        0 days 00:00:10.530200
round_time_total       0 days 00:06:30.262361
loss_total                         920.282976
loss_critic                       1226.726942
loss_actor                        -305.492974
memory_size                        854185.834 

=== epoch 10/10 ==== round 31/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:31<00:00,  5.11it/s]
episodes                                   52
episode_length                     175.115385
returns                            -39.552432
return_std                          111.22463
average_reward                      -0.232413
round_time             0 days 00:06:32.158945
episodes_test                            15.0
episode_length_test                617.533333
returns_test                       489.790061
return_std_test                    290.655302
average_reward_test                  0.776072
round_time_test        0 days 00:00:10.568896
round_time_total       0 days 00:06:32.160042
loss_total                         933.217368
loss_critic                       1242.822904
loss_actor                        -305.204857
memory_size                       856096.6055 

=== epoch 10/10 ==== round 32/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:34<00:00,  5.06it/s]
episodes                                   51
episode_length                     193.843137
returns                            -45.057895
return_std                         117.712304
average_reward                      -0.235362
round_time             0 days 00:06:35.475268
episodes_test                            10.0
episode_length_test                    1000.0
returns_test                       707.045604
return_std_test                     95.729237
average_reward_test                  0.707046
round_time_test        0 days 00:00:10.662330
round_time_total       0 days 00:06:35.476359
loss_total                         932.086954
loss_critic                       1241.434462
loss_actor                        -305.303168
memory_size                        857970.962 

=== epoch 10/10 ==== round 33/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:30<00:00,  5.12it/s]
episodes                                   37
episode_length                     221.216216
returns                            -54.972828
return_std                         133.503859
average_reward                      -0.271806
round_time             0 days 00:06:31.202760
episodes_test                            13.0
episode_length_test                764.153846
returns_test                       593.274792
return_std_test                     306.43638
average_reward_test                  0.779548
round_time_test        0 days 00:00:10.577261
round_time_total       0 days 00:06:31.203841
loss_total                         910.090698
loss_critic                       1214.217299
loss_actor                        -306.415793
memory_size                        859767.125 

=== epoch 10/10 ==== round 34/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:30<00:00,  5.12it/s]
episodes                                   48
episode_length                       192.0625
returns                            -55.603218
return_std                         126.091313
average_reward                      -0.298304
round_time             0 days 00:06:31.243708
episodes_test                            10.0
episode_length_test                     901.0
returns_test                       739.737531
return_std_test                    136.097748
average_reward_test                  0.822166
round_time_test        0 days 00:00:10.756753
round_time_total       0 days 00:06:31.244813
loss_total                         915.067133
loss_critic                       1220.382184
loss_actor                        -306.193157
memory_size                       861626.7235 

=== epoch 10/10 ==== round 35/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:34<00:00,  5.06it/s]
episodes                                   47
episode_length                     194.531915
returns                            -57.432718
return_std                         129.900094
average_reward                      -0.299715
round_time             0 days 00:06:35.495554
episodes_test                            11.0
episode_length_test                842.727273
returns_test                       642.581461
return_std_test                    238.417066
average_reward_test                  0.755819
round_time_test        0 days 00:00:10.497664
round_time_total       0 days 00:06:35.497061
loss_total                         905.140578
loss_critic                       1208.060566
loss_actor                        -306.539463
memory_size                       863389.3695 

=== epoch 10/10 ==== round 36/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:32<00:00,  5.10it/s]
episodes                                   53
episode_length                     177.528302
returns                            -53.568453
return_std                         126.105193
average_reward                      -0.307685
round_time             0 days 00:06:33.047046
episodes_test                            15.0
episode_length_test                     651.8
returns_test                       546.043261
return_std_test                    348.033652
average_reward_test                  0.842442
round_time_test        0 days 00:00:10.852933
round_time_total       0 days 00:06:33.048146
loss_total                         909.133891
loss_critic                       1213.137465
loss_actor                        -306.880493
memory_size                       865207.0935 

=== epoch 10/10 ==== round 37/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:33<00:00,  5.08it/s]
episodes                                   42
episode_length                     211.119048
returns                            -65.685159
return_std                          130.65001
average_reward                      -0.325301
round_time             0 days 00:06:34.512399
episodes_test                            12.0
episode_length_test                    779.25
returns_test                       633.411522
return_std_test                    243.735156
average_reward_test                   0.81078
round_time_test        0 days 00:00:10.814349
round_time_total       0 days 00:06:34.513498
loss_total                         912.138451
loss_critic                       1216.610935
loss_actor                        -305.751569
memory_size                        867137.352 

=== epoch 10/10 ==== round 38/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:34<00:00,  5.07it/s]
episodes                                   51
episode_length                     185.392157
returns                            -55.214097
return_std                         118.405097
average_reward                      -0.301631
round_time             0 days 00:06:35.291211
episodes_test                            18.0
episode_length_test                545.444444
returns_test                       464.626153
return_std_test                    309.253201
average_reward_test                  0.858551
round_time_test        0 days 00:00:10.558847
round_time_total       0 days 00:06:35.292300
loss_total                         908.779427
loss_critic                       1212.753185
loss_actor                        -307.115683
memory_size                        868982.503 

=== epoch 10/10 ==== round 39/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:34<00:00,  5.07it/s]
episodes                                   38
episode_length                     240.447368
returns                            -71.970676
return_std                          132.15937
average_reward                      -0.301937
round_time             0 days 00:06:34.841831
episodes_test                            11.0
episode_length_test                826.909091
returns_test                       635.726069
return_std_test                     206.55994
average_reward_test                  0.763935
round_time_test        0 days 00:00:10.465839
round_time_total       0 days 00:06:34.843069
loss_total                          901.97873
loss_critic                       1204.132318
loss_actor                        -306.635709
memory_size                        870842.563 

=== epoch 10/10 ==== round 40/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:35<00:00,  5.05it/s]
episodes                                   42
episode_length                     220.357143
returns                            -67.181691
return_std                         126.711452
average_reward                      -0.293988
round_time             0 days 00:06:36.451196
episodes_test                            16.0
episode_length_test                  596.3125
returns_test                       485.315748
return_std_test                    313.649773
average_reward_test                  0.820089
round_time_test        0 days 00:00:10.816120
round_time_total       0 days 00:06:36.452308
loss_total                         901.085746
loss_critic                       1203.105644
loss_actor                        -306.993922
memory_size                       872700.5665 

=== epoch 10/10 ==== round 41/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:36<00:00,  5.04it/s]
episodes                                   48
episode_length                     187.895833
returns                            -53.512815
return_std                          105.09663
average_reward                      -0.291929
round_time             0 days 00:06:37.122561
episodes_test                            12.0
episode_length_test                778.166667
returns_test                        568.57765
return_std_test                    212.092435
average_reward_test                  0.731544
round_time_test        0 days 00:00:10.594076
round_time_total       0 days 00:06:37.123638
loss_total                         912.843896
loss_critic                       1217.963577
loss_actor                        -307.634903
memory_size                       874475.9335 

=== epoch 10/10 ==== round 42/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:38<00:00,  5.01it/s]
episodes                                   60
episode_length                          150.6
returns                            -42.423973
return_std                          94.116477
average_reward                      -0.290019
round_time             0 days 00:06:39.456664
episodes_test                            13.0
episode_length_test                735.307692
returns_test                       588.974396
return_std_test                     307.73749
average_reward_test                  0.800135
round_time_test        0 days 00:00:10.535876
round_time_total       0 days 00:06:39.457961
loss_total                         908.498978
loss_critic                       1212.437029
loss_actor                        -307.253305
memory_size                       876203.7105 

=== epoch 10/10 ==== round 43/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:34<00:00,  5.06it/s]
episodes                                   52
episode_length                     180.134615
returns                            -55.037197
return_std                         110.976345
average_reward                      -0.307445
round_time             0 days 00:06:35.462774
episodes_test                            14.0
episode_length_test                653.285714
returns_test                       512.987786
return_std_test                    293.159683
average_reward_test                  0.801122
round_time_test        0 days 00:00:10.612685
round_time_total       0 days 00:06:35.463855
loss_total                         921.113531
loss_critic                       1228.157504
loss_actor                        -307.062439
memory_size                       878035.2035 

=== epoch 10/10 ==== round 44/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:36<00:00,  5.04it/s]
episodes                                   51
episode_length                     161.823529
returns                            -48.391952
return_std                         108.949064
average_reward                      -0.309373
round_time             0 days 00:06:37.443252
episodes_test                            13.0
episode_length_test                728.461538
returns_test                       545.297697
return_std_test                    294.731052
average_reward_test                  0.741357
round_time_test        0 days 00:00:10.751987
round_time_total       0 days 00:06:37.444356
loss_total                         910.204452
loss_critic                        1214.43578
loss_actor                         -306.72093
memory_size                       879954.4395 

=== epoch 10/10 ==== round 45/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:38<00:00,  5.02it/s]
episodes                                   51
episode_length                     183.568627
returns                            -58.194383
return_std                         120.924819
average_reward                       -0.31915
round_time             0 days 00:06:38.785830
episodes_test                            12.0
episode_length_test                     817.5
returns_test                       674.983554
return_std_test                    294.079473
average_reward_test                  0.826748
round_time_test        0 days 00:00:10.697997
round_time_total       0 days 00:06:38.786929
loss_total                         886.866185
loss_critic                       1185.152286
loss_actor                        -306.278302
memory_size                        881835.796 

=== epoch 10/10 ==== round 46/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:38<00:00,  5.02it/s]
episodes                                   52
episode_length                     180.076923
returns                            -55.813639
return_std                         127.015456
average_reward                      -0.308839
round_time             0 days 00:06:38.792388
episodes_test                            11.0
episode_length_test                831.727273
returns_test                       644.485139
return_std_test                    283.439269
average_reward_test                   0.79282
round_time_test        0 days 00:00:10.524205
round_time_total       0 days 00:06:38.793846
loss_total                          902.00637
loss_critic                       1204.305588
loss_actor                        -307.190584
memory_size                       883600.8895 

=== epoch 10/10 ==== round 47/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:38<00:00,  5.02it/s]
episodes                                   42
episode_length                      218.52381
returns                            -68.845375
return_std                         132.435919
average_reward                      -0.324733
round_time             0 days 00:06:38.771129
episodes_test                            15.0
episode_length_test                648.666667
returns_test                       501.213061
return_std_test                    265.866329
average_reward_test                  0.779374
round_time_test        0 days 00:00:10.708928
round_time_total       0 days 00:06:38.772461
loss_total                         905.765233
loss_critic                       1209.001152
loss_actor                        -307.178525
memory_size                        885380.427 

=== epoch 10/10 ==== round 48/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.01it/s]
episodes                                   49
episode_length                     189.612245
returns                            -59.573734
return_std                         125.468639
average_reward                      -0.322585
round_time             0 days 00:06:39.874009
episodes_test                            11.0
episode_length_test                884.727273
returns_test                       784.965238
return_std_test                     238.05302
average_reward_test                  0.892855
round_time_test        0 days 00:00:10.808184
round_time_total       0 days 00:06:39.875132
loss_total                         911.908872
loss_critic                       1216.879357
loss_actor                        -307.973155
memory_size                       887210.5605 

=== epoch 10/10 ==== round 49/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.01it/s]
episodes                                   58
episode_length                     169.155172
returns                            -52.634519
return_std                         124.583787
average_reward                       -0.31088
round_time             0 days 00:06:40.172548
episodes_test                            10.0
episode_length_test                     903.9
returns_test                       700.716762
return_std_test                    257.176092
average_reward_test                  0.767095
round_time_test        0 days 00:00:10.600587
round_time_total       0 days 00:06:40.173651
loss_total                         896.584329
loss_critic                       1197.732487
loss_actor                        -308.008386
memory_size                       889075.0415 

=== epoch 10/10 ==== round 50/50 =====================================
<MM1_Delay<NormalizeActionWrapper<Float64ToFloat32<CompatWrapper<NoisyActionWrapper<TimeLimit<OrderEnforcing<PassiveEnvChecker<AntEnv<Ant-v4>>>>>>>>>>
100%|██████████| 2000/2000 [06:39<00:00,  5.01it/s]
episodes                                   59
episode_length                     150.338983
returns                            -45.472959
return_std                         116.266297
average_reward                      -0.310481
round_time             0 days 00:06:39.709952
episodes_test                            11.0
episode_length_test                819.454545
returns_test                       613.666733
return_std_test                    267.659036
average_reward_test                  0.741978
round_time_test        0 days 00:00:10.697741
round_time_total       0 days 00:06:39.711048
loss_total                         899.152512
loss_critic                       1200.865332
loss_actor                         -307.69885
memory_size                        890817.066 


