Logging to experiments/direct_ft_bic/
creating model...
input shape change: torch.Size([256, 6, 3, 3])
loading model from: ../models/256_adm_uncond.pt complete.
creating data loader...
training...
-------------------------
| grad_norm  | 0.0222   |
| loss       | 0.0126   |
| loss_q0    | 0.0497   |
| loss_q1    | 0.00511  |
| loss_q2    | 0.000599 |
| loss_q3    | 0.000103 |
| mse        | 0.0124   |
| mse_q0     | 0.0486   |
| mse_q1     | 0.00507  |
| mse_q2     | 0.000593 |
| mse_q3     | 0.000102 |
| param_norm | 1.36e+03 |
| samples    | 256      |
| step       | 0        |
| vb         | 0.000245 |
| vb_q0      | 0.00106  |
| vb_q1      | 3.69e-05 |
| vb_q2      | 5.34e-06 |
| vb_q3      | 1.18e-06 |
-------------------------
saving model 0...
saving model 0.9999...
-------------------------
| grad_norm  | 0.0325   |
| loss       | 0.00988  |
| loss_q0    | 0.0356   |
| loss_q1    | 0.00592  |
| loss_q2    | 0.00122  |
| loss_q3    | 7.59e-05 |
| mse        | 0.00947  |
| mse_q0     | 0.0338   |
| mse_q1     | 0.00588  |
| mse_q2     | 0.00121  |
| mse_q3     | 7.5e-05  |
| param_norm | 1.36e+03 |
| samples    | 2.82e+03 |
| step       | 10       |
| vb         | 0.000415 |
| vb_q0      | 0.00178  |
| vb_q1      | 4.33e-05 |
| vb_q2      | 1.05e-05 |
| vb_q3      | 9e-07    |
-------------------------
