[2025-11-30 11:17:33] Namespace(method='FedNewton', training_mode='head', dataset='cifar100', model_name='vit_base_patch14_dinov2.lvd142m', seed=1, rounds=50, num_clients=10, fraction=0.5, local_epochs=1, batch_size=64, lr=0.01, newton_lr=0.01, sophia_lr=0.05, rho=0.04, betas='0.9,0.99', mu=0.01, log_dir='runs/compare_solvers_20251130_090044_vit_base_patch14_dinov2.lvd142m_head/cifar100/logs/exact', optimizer='sgd', hessian_batches=64, damping=1e-05, alpha=0.5, max_norm=0, newton_solver='exact', newton_cg_max_iter=10, lbfgs_m=5)
[2025-11-30 11:17:33] Start: FedNewton | Mode: head | Dataset: cifar100 | Solver: exact
Traceback (most recent call last):
  File "/root/autodl-tmp/main.py", line 956, in <module>
    s_j, _ = clients[idx].compute_newton_step(copy.deepcopy(global_weights), global_grads, lbfgs_state)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/autodl-tmp/main.py", line 596, in compute_newton_step
    if curr_b is not None: H_tuple = torch.autograd.functional.hessian(head_loss_func, inputs)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/torch/autograd/functional.py", line 958, in hessian
    res = jacobian(
          ^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/torch/autograd/functional.py", line 788, in jacobian
    vj = _autograd_grad(
         ^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/torch/autograd/functional.py", line 194, in _autograd_grad
    return torch.autograd.grad(
           ^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/torch/autograd/__init__.py", line 502, in grad
    result = _engine_run_backward(
             ^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/torch/autograd/graph.py", line 824, in _engine_run_backward
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 MiB. GPU 0 has a total capacity of 31.36 GiB of which 3.06 MiB is free. Including non-PyTorch memory, this process has 31.35 GiB memory in use. Of the allocated memory 27.40 GiB is allocated by PyTorch, and 3.35 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
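Post-mortem note: the OOM is triggered by torch.autograd.functional.hessian at main.py:596, which materializes the full dense Hessian of the head parameters. For a linear head on 768-dim features with 100 classes (roughly 76.9k parameters, assuming the usual timm vit_base_patch14_dinov2 embedding width), that Hessian alone is about 76,900^2 fp32 entries, on the order of 22 GiB, consistent with the ~27 GiB PyTorch allocation reported above. The run's own flags (newton_solver='exact' vs. newton_cg_max_iter=10, damping=1e-05) suggest a matrix-free CG path already exists as the workaround. Below is a minimal, hypothetical sketch of such a Hessian-free damped Newton step via double-backward Hessian-vector products; hvp, newton_cg_step, and loss_fn are illustrative names, not the actual code in main.py, and the sketch assumes the head parameters have been flattened into a single tensor.

import torch

def hvp(loss_fn, params, v, damping=1e-5):
    # Hessian-vector product (H + damping*I) @ v via double backward.
    # Never forms H explicitly: memory stays O(P) instead of O(P^2).
    loss = loss_fn(params)
    (g,) = torch.autograd.grad(loss, params, create_graph=True)
    (hv,) = torch.autograd.grad(g, params, grad_outputs=v)
    return hv + damping * v

def newton_cg_step(loss_fn, params, max_iter=10, damping=1e-5, tol=1e-6):
    # Approximately solve (H + damping*I) s = -g with conjugate gradient,
    # mirroring the run's newton_cg_max_iter / damping flags.
    (g,) = torch.autograd.grad(loss_fn(params), params)
    s = torch.zeros_like(g)
    r = -g.clone()            # residual b - A s with s = 0, b = -g
    p = r.clone()
    rs = r.dot(r)
    for _ in range(max_iter):
        Ap = hvp(loss_fn, params, p, damping)
        alpha = rs / p.dot(Ap)
        s = s + alpha * p
        r = r - alpha * Ap
        rs_new = r.dot(r)
        if rs_new.sqrt() < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return s                  # Newton direction ~ -(H + damping*I)^-1 g

# Sanity check on a quadratic: loss = 0.5 * w^T A w has H = A and g = A w,
# so the exact Newton step is -w regardless of A.
w = torch.randn(5, requires_grad=True)
A = torch.diag(torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0]))
step = newton_cg_step(lambda p: 0.5 * p @ A @ p, w)
print(torch.allclose(step, -w, atol=1e-4))  # True (up to damping/tolerance)

Each CG iteration only ever holds a handful of P-sized vectors, so rerunning with newton_solver='cg' (per the argparse flags above) is presumably the intended fix here, trading the O(P^2) dense Hessian for max_iter Hessian-vector products.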
