%%------------------------------------------------SECTION 2 ------------------------------------------------%%

\section{Related Work}
\label{sec:ap1}
Among the first self-supervised learning (SSL) methods for MRI reconstruction was SSDU (Self-supervised learning via data undersampling) \cite{Yaman2020}. Inspired by SSL concepts from deep learning, particularly Noise2Self \cite{batson2019noise2self}, SSDU trained a reconstruction model (a ResNet CNN with a conjugate gradient formulation) by partitioning the undersampled data into two subsets. One subset served as the input and the other as the target, with the loss computed in the $k$-space domain.
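
Schematically, and in notation assumed here purely for illustration, let $\Omega$ denote the acquired $k$-space locations, split into disjoint sets $\Theta$ (input) and $\Lambda$ (target), let $y_{\Theta}^{i}$ and $y_{\Lambda}^{i}$ be the corresponding measurements of the $i$-th training sample, $E_{\Theta}^{i}$ and $E_{\Lambda}^{i}$ the associated encoding operators, and $f_{\theta}$ the reconstruction network. The SSDU training objective can then be sketched as
\[
\min_{\theta} \; \frac{1}{N} \sum_{i=1}^{N} \mathcal{L}\left( y_{\Lambda}^{i},\; E_{\Lambda}^{i}\, f_{\theta}\left( y_{\Theta}^{i}, E_{\Theta}^{i} \right) \right),
\qquad \Omega = \Theta \cup \Lambda, \quad \Theta \cap \Lambda = \emptyset ,
\]
where $\mathcal{L}$ is a $k$-space loss whose exact form is left unspecified in this sketch.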

An extension of SSDU was proposed in a parallel network framework \cite{Hu2021}, where two networks were trained, one on each partition of the subsampled $k$-space data. A consistency loss minimized the discrepancy between the two networks' outputs, so that either network could be used during inference.

Building further on SSDU, \cite{millard2023} introduced a Noisier2Noise framework, in which a second subsampling mask is applied to the already subsampled $k$-space data. The employed network, E2EVarNet \cite{Sriram2020}, was trained to recover the singly subsampled data from the doubly subsampled version. The authors showed that SSDU is a special case of this broader framework, thereby providing a theoretical justification for SSDU.
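
As a rough sketch of this relationship, in notation assumed here for illustration and omitting noise as well as the correction and weighting terms of the cited work, let $y_0$ be the fully-sampled $k$-space, $M_{\Omega}$ the acquisition mask, and $M_{\Lambda}$ the second mask, so that
\[
y = M_{\Omega}\, y_0, \qquad \tilde{y} = M_{\Lambda}\, y = M_{\Lambda} M_{\Omega}\, y_0 .
\]
A Noisier2Noise-style loss compares the network output $f_{\theta}(\tilde{y})$ with $y$ on all acquired locations, whereas an SSDU-style loss restricts the comparison to the acquired locations removed by the second mask:
\[
\left\| M_{\Omega} \left( f_{\theta}(\tilde{y}) - y \right) \right\|_2^2
\qquad \mbox{versus} \qquad
\left\| (I - M_{\Lambda})\, M_{\Omega} \left( f_{\theta}(\tilde{y}) - y \right) \right\|_2^2 ,
\]
where $I$ is the identity and the squared $\ell_2$ norm is an illustrative choice.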

Among diffusion-based approaches to MRI reconstruction, \cite{cui2022selfscore} proposed a score-based diffusion model that requires no fully-sampled data: the model learned the prior of fully-sampled images from subsampled data in a self-supervised manner. Another diffusion-based approach, SSDiffRecon \cite{korkmaz2023self}, integrated cross-attention transformers with data-consistency blocks in an unrolled architecture. However, diffusion-based methods are outside the scope of our work.

Following the SSDU strategy of splitting the subsampled data, the authors of \cite{yan2023dc} presented DC-SiamNet, which employs two branches with shared weights in a Siamese architecture. Each branch reconstructs from one partition of the $k$-space data, and training is guided by a dual-domain loss comprising image-domain and frequency-domain consistency terms, which ensure that the reconstructed images and $k$-spaces are consistent across partitions, along with a contrastive loss in the latent space.

A more recent work extended SSDU by introducing SPICER, which includes coil sensitivity estimation based on autocalibration signal (ACS) data and utilizes U-Net-based models for both sensitivity estimation and reconstruction \cite{hu2024spicerselfsupervisedlearningmri}. Similar sensitivity estimation was also employed in \cite{millard2023} within the E2EVarNet framework.


Finally, SSDU has also been applied to reconstruct non-Cartesian MRI data, with the subsampled $k$-space split into disjoint parts \cite{ZHOU2022102538}. In this approach, a variational network is trained using a dual-domain loss similar to that of \cite{yan2023dc}: frequency consistency ensures that the reconstructed $k$-spaces from each partition match the input data, while image consistency ensures that the reconstructed images are consistent across partitions. An additional loss term compares the reconstructed $k$-spaces and images from each partition with those generated when the full subsampled data is used as input.

Most self-supervised MRI reconstruction methods can be seen as derivatives or extensions of SSDU, with partitioning of undersampled data into disjoint subsets as the fundamental idea. This partitioning approach underpins the SSL component of our method, and without loss of generality, SSDU can be considered a representative method in this domain. While recent techniques have incorporated different architectures or loss functions, they largely build upon this core strategy.

Our proposed method, Joint Supervised and Self-Supervised Learning (JSSL), draws inspiration from the approaches above. Like most SSL-based methods, it seeks to overcome the challenge of training without fully-sampled $k$-space data for the target organ domain. However, JSSL extends the applicability of such techniques by leveraging fully-sampled data from proxy datasets while incorporating subsampled data from the target domain. This enables joint training through both supervised and self-supervised learning, providing a practical solution for scenarios where ground truth fully-sampled data is inaccessible, while improving reconstruction performance through the combination of proxy and target datasets.

In the broader context of combining supervised and self-supervised learning, Noise2Recon \cite{desai2021noise2recon} extended SSDU by leveraging both fully-sampled and subsampled data within a single organ domain for reconstruction and denoising, using the E2EVarNet model \cite{Sriram2020}. However, this method's dependency on fully-sampled data restricts its applicability in scenarios where such data is unavailable.

Another recent approach utilized paired fully-sampled and subsampled data from different modalities for reconstruction of the target modality \cite{zhou2022dsformer}. While SSL was employed for training, this method still relied on fully-sampled data during both training and inference, in contrast to our approach, which targets cases where fully-sampled data is unavailable for the target domain.

Lastly, test-time training \cite{darestani2022testtimetrainingclosenatural} is a recent method proposed to handle domain shifts in MRI reconstruction. By re-training models at inference time using an SSL data-consistency loss, it aims to adapt to shifts in data distribution between training and testing, such as moving from one scanner to another. However, because the adaptation is performed at inference time, its utility in real-time imaging applications is limited.
%%------------------------------------------------SECTION 2 ------------------------------------------------%%