initial performance: 5792
episode: 0 training return: tensor(1.5431, device='cuda:0', grad_fn=<AddBackward0>)
episode: 1 training return: tensor(0.0994, device='cuda:0', grad_fn=<AddBackward0>)
episode: 2 training return: tensor(2.8677e-07, device='cuda:0', grad_fn=<AddBackward0>)
episode: 3 training return: tensor(9.4382e-13, device='cuda:0', grad_fn=<AddBackward0>)
epoch: 1 test_true_pfm: 162 test_simulate_pfm tensor(1.6746e-10, device='cuda:0', grad_fn=<DivBackward0>)
