Logging to experiments/direct_ft_bic/
creating model...
input shape change: torch.Size([256, 6, 3, 3])
loading model from: ../models/256_adm_uncond.pt complete.
creating data loader...
training...
-------------------------
| grad_norm  | 0.0222   |
| loss       | 0.00937  |
| loss_q0    | 0.0266   |
| loss_q1    | 0.00293  |
| loss_q2    | 0.00123  |
| loss_q3    | 6.81e-05 |
| mse        | 0.00921  |
| mse_q0     | 0.0261   |
| mse_q1     | 0.00291  |
| mse_q2     | 0.00122  |
| mse_q3     | 6.73e-05 |
| param_norm | 1.36e+03 |
| samples    | 256      |
| step       | 0        |
| vb         | 0.000164 |
| vb_q0      | 0.000497 |
| vb_q1      | 2.17e-05 |
| vb_q2      | 1.08e-05 |
| vb_q3      | 7.92e-07 |
-------------------------
-------------------------
| grad_norm  | 0.0325   |
| loss       | 0.0119   |
| loss_q0    | 0.0423   |
| loss_q1    | 0.00615  |
| loss_q2    | 0.00118  |
| loss_q3    | 8.19e-05 |
| mse        | 0.0116   |
| mse_q0     | 0.0409   |
| mse_q1     | 0.0061   |
| mse_q2     | 0.00117  |
| mse_q3     | 8.09e-05 |
| param_norm | 1.36e+03 |
| samples    | 2.82e+03 |
| step       | 10       |
| vb         | 0.000347 |
| vb_q0      | 0.00143  |
| vb_q1      | 4.51e-05 |
| vb_q2      | 1.03e-05 |
| vb_q3      | 9.6e-07  |
-------------------------
