Abstract: Highlights
• We propose a new cross-modal decoding framework for brain-driven image reconstruction.
• We leverage StyleGAN inversion to produce visually realistic reconstructions.
• We disentangle the face representation to enable hierarchical multi-modal semantic alignment.
• We design a progressive refinement learning scheme that iteratively improves recovery accuracy.
• We introduce a contrastive loss and a multi-layer identity loss to improve identity (ID) consistency.