\documentclass{midl} % Include author names
%\documentclass[anon]{midl} % Anonymized submission

% The following packages will be automatically loaded:
% jmlr, amsmath, amssymb, natbib, graphicx, url, algorithm2e
% ifoddpage, relsize and probably more
% make sure they are installed with your latex distribution
\usepackage{graphicx}
% \usepackage{caption}
\usepackage{amsmath,amssymb} % define this before the line numbering.
\usepackage{color}
\usepackage{multirow}
% \usepackage{tabularx,ragged2e}
\usepackage{booktabs}
% \usepackage{subfig}
% \usepackage{subcaption}
% \captionsetup{compatibility=false}
\DeclareMathOperator*{\argmin}{arg\,min}
\usepackage{mwe} % to get dummy images
\jmlrvolume{-- Under Review}
\jmlryear{2020}
\jmlrworkshop{Full Paper -- MIDL 2020}
\editors{Under Review for MIDL 2020}

\title{Towards multi-sequence MR image recovery from undersampled k-space data}

 % Use \Name{Author Name} to specify the name.
 % If the surname contains spaces, enclose the surname
 % in braces, e.g. \Name{John {Smith Jones}} similarly
 % if the name has a "von" part, e.g \Name{Jane {de Winter}}.
 % If the first letter in the forenames is a diacritic
 % enclose the diacritic in braces, e.g. \Name{{\'E}louise Smith}

 % Two authors with the same address
 % \midlauthor{\Name{Author Name1} \Email{abc@sample.edu}\and
 %  \Name{Author Name2} \Email{xyz@sample.edu}\\
 %  \addr Address}

 % Three or more authors with the same address:
 % \midlauthor{\Name{Author Name1} \Email{an1@sample.edu}\\
 %  \Name{Author Name2} \Email{an2@sample.edu}\\
 %  \Name{Author Name3} \Email{an3@sample.edu}\\
 %  \addr Address}


% Authors with different addresses:
% \midlauthor{\Name{Author Name1} \Email{abc@sample.edu}\\
% \addr Address 1
% \AND
% \Name{Author Name2} \Email{xyz@sample.edu}\\
% \addr Address 2
% }

%\footnotetext[1]{Contributed equally}

% More complicate cases, e.g. with dual affiliations and joint authorship

\midlauthor{\Name{Cheng Peng} \Email{cp4653@umd.edu}\\
\Name{Wei-An Lin} \Email{walin@umd.edu}\\
\Name{Rama Chellappa} \Email{rama@umiacs.umd.edu}\\
\addr University of Maryland, College Park, USA \\
\AND
\Name{S. Kevin Zhou} \Email{s.kevin.zhou@gmail.com}\\
\addr Chinese Academy of Sciences\\
Peng Cheng Laboratory, Shenzhen, China
}


\begin{document}

\maketitle

\begin{abstract}
Undersampled MR image recovery has been widely studied with Deep Learning methods as a post-processing step for accelerating MR acquisition. In this paper, we aim to optimize multi-sequence MR image recovery from undersampled k-space data under an overall time constraint. We first formulate it as a {\it constrained optimization} problem and show that finding the optimal sampling strategy for all sequences and the optimal recovery model for such a sampling strategy is {\it combinatorial} and hence computationally prohibitive. To solve this problem, we propose a {\it blind recovery model} that simultaneously recovers multiple sequences, and an efficient approach to find a proper combination of sampling strategy and recovery model. Our experiments demonstrate that the proposed method outperforms sequence-wise recovery, and shed light on how to decide the undersampling strategy for sequences within an overall time budget.
\end{abstract}

\begin{keywords}
Magnetic Resonance, Image Recovery, Multi-Modal
\end{keywords}

\section{Introduction}

Magnetic Resonance Imaging (MRI) is a widely used medical imaging technique. It holds several distinct advantages over other imaging modalities such as computed tomography (CT) and ultrasound. Not only does MRI resolve tissues at a high quality, it can also be customized with different pulse sequences to produce a variety of desired contrasts that reveal different kinds of tissues, such as blood vessels and tumor regions. Furthermore, compared to CT, MRI does not expose patients to ionizing radiation. On the other hand, MRI is limited by its long acquisition time, as the data is acquired by traversing k-space, where the speed of traversal is limited by the underlying MR physics and machine quality. In practice, patients often undergo multiple MR sequences, each of which uses different parameters to target specific tissues and lesions, resulting in an even longer overall acquisition time. This leads to various practical problems, ranging from image blurriness due to patient movement to limited accessibility of the machines.

There is a long history of research on how to undersample MR k-space data while maintaining image quality. Lustig et al.~\cite{lustig2007sparse} first proposed to use Compressed Sensing in MRI (CSMRI), assuming that the undersampled MR images have a sparse representation in some transform domain, where noise can be discarded by minimizing the $\ell_{1}$ norm of such a representation. This method was shown to yield much better results than zero-filling the missing k-space samples (ZF). Extending CSMRI, Ravishankar et al.~\cite{DBLP:journals/tmi/RavishankarB11} applied more adaptive sparse modelling through Dictionary Learning, where the transformation is optimized on specific sets of data, resulting in better sparsity encoding. To further explore redundancy within the MR data, new methods have been proposed in recent years \cite{DBLP:conf/miccai/HuangCA12,hirabayashi2015compressed,senel2019statistically,DBLP:journals/tmi/GozcuMLICSC18,gong2015promise}, focusing on extrapolating information in adjacent slices, in multi-acquisition scenarios, and in scenarios where an additional sequence is available.
In the domain of Deep Learning, Schlemper et al. \cite{DBLP:journals/tmi/SchlemperCHPR18} proposed a cascade of CNNs that incorporates data consistency layers to de-noise MRI in image domain while maintaining consistency in the k-space, and showed that the results significantly outperformed DLMRI~\cite{DBLP:journals/tmi/RavishankarB11}. Yang et al.~\cite{DBLP:journals/tmi/YangYDSDYLAKGF18} proposed DAGAN, which recovers undersampled MR images through a U-Net structure with perceptual and adversarial loss in addition to $L_{1}$ loss in image space and frequency space. Quan et al. \cite{DBLP:journals/tmi/QuanNJ18} proposed RefineGAN, which performs reconstruction and refinement through two different networks, and enforces a cyclic loss in the image and frequency spaces. 

Although the mentioned CNN-based methods have obtained impressive results, they focus on single-sequence reconstruction. Few studies have explored the effectiveness of CNN-based methods under multi-sequence scenarios, which are common in practice and have been shown to help in non-learning-based methods \cite{gong2015promise,bilgic2018improving}. Xiang et al.~\cite{DBLP:conf/miccai/XiangCCZLWS18} showed that a highly undersampled $T_2$ sequence, given a fully sampled $T_1$ sequence, can be well recovered through a Dense U-Net. Wang et al.~\cite{wang2017feasibility} also explored the feasibility of multi-contrast MR imaging through CNN models. In this paper, we attempt to find the best strategy for undersampling k-space acquisition over multiple sequences, such that we can best recover the sequences post-acquisition.

% a quantitative study done with regard to the best strategy at undersampling k-spaces over multiple sequences for image recovery


The contributions of our paper can be summarized as follows:
(i) we formulate a {\it combinatorial constrained optimization} problem, where given a limited acquisition time, we seek to find the best strategy to undersample the k-spaces of multiple sequences to achieve the best overall recovery;
(ii) we propose a novel CNN-based {\it blind recovery model} that exploits the shared information across different sequences and simultaneously recovers them, as well as an efficient approach to finding a proper combination of sampling strategy and recovery model;
(iii) we perform extensive evaluation on real and simulated k-space data, which shows that the proposed model outperforms the method of independently recovering each sequence, and that our method finds {\it the undersampling strategy adaptive to the given sequences}.
% Acknowledgments---Will not appear in anonymized version
% \midlacknowledgments{We thank a bunch of people.}
\section{Problem Formulation}
We first note that the most popular MR k-space sampling method follows a Cartesian trajectory, in which a series of acquisitions is performed along equally spaced parallel lines, conventionally called {\em phase encoding lines}. This leads to a natural implementation of MR undersampling, in which technicians drop certain phase encoding lines from the sampling grid~\cite{lustig2007sparse}. In this paper, we focus on undersampling with 1D masks along the phase encoding direction\footnotemark. \footnotetext{We have found that undersampling with 2D masks generally leads to better recovery quality; however, such a setting is less time efficient in practice.}%There are other possible sampling trajectories and undersampling patterns; however, they are not the focus here. For those scenarios, the overall formulation still stands true and only needs slight modification.

Consider multiple MR sequences with full k-space spectra $\{F_s\}_{s=1}^S$, where $S$ denotes the total number of sequences, with each spectrum sampled by $N$ phase encoding lines. For each $F_s$, the unit time for sampling a phase encoding line is denoted by $t_s$. We define 1D sampling masks $\mathcal{M}_s \in \{0,1\}^{N}$, each of which selects a subset of encoding lines $\mathcal{M}_s \odot F_s$ for faster acquisition. By applying the inverse Fourier transform $\mathcal{F}^{-1}$, an undersampled MR image for sequence $s$ is reconstructed as
\begin{equation}
I_{\mathcal{M}_s} = \mathcal{F}^{-1}(\mathcal{M}_s \odot F_s).
\end{equation}
When fully sampled, the MR image is reconstructed by $I_s = \mathcal{F}^{-1}(F_s)$. If we denote the number of selected encoding lines by $|\mathcal{M}_s|$, the total time needed to acquire all the sequences is
%\begin{equation}
$T=\sum_{s=1}^{S} t_{s}\times |\mathcal{M}_s|$.
%\end{equation}
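
As a purely illustrative calculation (the numbers below are assumed for exposition and are not taken from the experiments), let $S=3$, $N=256$, relative per-line times $(t_1,t_2,t_3)=(1,4,6)$, and masks that keep $|\mathcal{M}_1|=256$, $|\mathcal{M}_2|=64$, and $|\mathcal{M}_3|=32$ lines. Writing $T_{full}$ (a symbol introduced here only for this example) for the time needed when every sequence is fully sampled, we obtain
\begin{equation*}
T = 1\times 256 + 4\times 64 + 6\times 32 = 704, \qquad T_{full} = (1+4+6)\times 256 = 2816,
\end{equation*}
so this hypothetical allocation uses $704/2816 = 25\%$ of the fully sampled acquisition time, with time measured in units of the per-line time of the first sequence.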

Undersampling leads to faster acquisition but degraded image quality compared to fully sampled MR. To allow fast acquisition while retaining image quality, we apply a deep neural network as a post-processing step to restore the degraded images. Therefore, we consider the problem of searching for an optimal sampling strategy $\{\mathcal{M}_s\}_{s=1}^S$ and a CNN $f_\theta$ that best recovers the fully sampled $\{I_s\}_{s=1}^S$ from $\{I_{\mathcal{M}_s}\}_{s=1}^S$ under a time constraint $T \leq T_{max}$. This constrained optimization problem can be formulated as follows:
\begin{equation} \label{eq:opt}
    \min_{\mathcal{\theta}, \{\mathcal{M}_{s}\}} \sum_{s=1}^S E_{I_{s} \sim p(I_s)} \left[\big\lVert f_{\theta}(I_{\mathcal{M}_{s}}) - I_{s} \big\rVert_1 \right] ~~\text{s.t.}~~ \sum_{s=1}^{S} t_{s}|\mathcal{M}_s| \leq T_{max}.
\end{equation}
We use the $L_1$ loss in (\ref{eq:opt}); however, other loss functions can be used too.

The problem defined in \eqref{eq:opt} is {\em combinatorial} in nature, as has been realized by Reeves et al. \cite{370637}. First, the set $\{\mathcal{M}_s\}_{s=1}^S$ has a total of $2^{NS}$ possible combinations. Second, the best recovery model depends on the choice of sampling strategy. As a result, the optimal solution to \eqref{eq:opt} is in general difficult to find. As a preliminary attempt, we assume a fixed candidate set $\mathcal{C} = \{m_1, \ldots, m_C\}$ from which each $\mathcal{M}_s$ is chosen. The number of possible sampling strategies becomes $C^S$ instead. However, even with this simplification, a straightforward approach to \eqref{eq:opt}, which is
\begin{equation} \label{eq:direct}
    \min_{\mathcal{M}_{1:S} \in \mathcal{C}^S} \left( \min_{\mathcal{\theta}} \sum_{s=1}^S E_{I_{s} \sim p(I_s)} \left[\big\lVert f_{\theta}(I_{\mathcal{M}_{s}}) - I_{s} \big\rVert_1 \right] \right) ~~\text{s.t.}~~ \sum_{s=1}^{S} t_{s}|\mathcal{M}_s| \leq T_{max},
\end{equation}
still requires training $C^S$ models and then choosing the one with minimum loss. This is necessary since each model is trained to best eliminate noise introduced by the specific $\mathcal{M}_s$, and becomes sub-optimal when the noise level/pattern is changed.
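
To give a sense of scale, consider the assumed values $N=256$, $S=3$, and $C=8$ (chosen for illustration only, not taken from the experiments). The unrestricted search space and the candidate-set search space then contain
\begin{equation*}
2^{NS} = 2^{768} > 10^{231} \qquad \text{and} \qquad C^S = 8^3 = 512
\end{equation*}
mask combinations, respectively. Even the simplified problem would therefore call for training up to $512$ separate networks, fewer only to the extent that the time constraint rules out some combinations.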

In this work, we propose an efficient approach that finds a pair $(\theta, \{\mathcal{M}_s\}_{s=1}^S)$ while circumventing the computational cost of training an excessive number of models. Conceptually, we propose to first train a blind recovery model (BRM), which takes randomly undersampled MR sequences as inputs and recovers them to fully sampled MR sequences. The trained BRM can then be used as an MR sequence quality estimator to search for the optimal sampling strategy $\{\mathcal{M}^*_s\}_{s=1}^S$. Finally, with $\{\mathcal{M}^*_s\}_{s=1}^S$, we can proceed to solve \eqref{eq:direct} by fine-tuning the existing BRM. In total, the proposed method only requires training {\it one} CNN, which significantly reduces the computational cost.

\subsection{Blind recovery model}
A blind recovery model (BRM) is a CNN $f_{\theta}$ which recovers $I_s$ by fusing information from different undersampled MR sequences $\{I_{\mathcal{M}_s}\}_{s=1}^S$, $\mathcal{M}_s \in \mathcal{C}$.
We adopt a data augmentation approach, which randomly selects sampling masks from $\mathcal{C}$, and consider the following \emph{unconstrained optimization problem}:
\begin{equation} \label{eq:step1}
    \theta^* = \argmin_{\theta} \sum_{s=1}^S E_{I_{s} \sim p(I_s), \mathcal{M}_s \sim p(\mathcal{C})}  \left[\big\lVert f_{\theta}(I_{\mathcal{M}_{s}}) - I_{s} \big\rVert_1 \right].
\end{equation}
 As we will show, the model trained under this scheme sacrifices its ability to fit a specific sampling profile, and in exchange performs generally well across all sampling profiles. Therefore, it can serve as a good estimator for discovering the best sampling strategy.
%Our intuition is that for MR sequences, the more structural information discarded through undersampling, the more difficult it is for a CNN to recover $I$ from $I_m$, which leads to larger reconstruction loss.
\subsection{Sampling strategy search}
Given a trained BRM $f_{\theta^*}$, we propose to search for the optimal sampling strategy by finding the one with a minimum loss:
\begin{equation} \label{eq:step2}
    \mathcal{M}_{1:S}^* = \argmin_{\mathcal{M}_{1:S}} \sum_{s=1}^S E_{I_{s} \sim p(I_s)} \left[\big\lVert f_{\theta^*}(I_{\mathcal{M}_{s}}) - I_{s} \big\rVert_1 \right] ~\text{s.t.}~ \sum_{s=1}^{S} t_{s}|\mathcal{M}_s| \leq T_{max}.
\end{equation}
The above exhaustive search requires only forward passes of the trained BRM over $C^S$ mask combinations, which is significantly cheaper than training $C^S$ CNNs.
The solution $\theta^*$ can be further improved by learning a refined model specific to $\mathcal{M}^*_{s}$:
\begin{align} \label{eq:step3}
    \hat{\theta} = \argmin_{\theta} \sum_{s=1}^S E_{I_{s} \sim p(I_s)} \left[\big\lVert f_{\theta}(I_{\mathcal{M}^*_{s}}) - I_{s} \big\rVert_1 \right].
\end{align}

\subsection{Single sequence training vs multi-sequence training}

One has the option of training (a) multiple SISO (single input single output) BRMs, one per sequence, or (b) one monolithic MIMO (multiple input multiple output) BRM for all sequences. The latter option holds several advantages over the former. First, option (a) does not consider the complementary information across different sequences. As shown in \cite{DBLP:conf/miccai/XiangCCZLWS18,DBLP:conf/miccai/HuangCA12}, there exists a strong correlation between sequences of the same patient, as they share the underlying anatomical structures. If a particular sequence is severely undersampled, leading to the loss of some anatomical detail, such information may be present in other less severely undersampled sequences. Secondly, option (b) only requires training one model, while option (a) requires $S$ models. As all the models attempt to eliminate distortions due to undersampling, they should learn similar features. However, multi-sequence training requires the sequences to be aligned amongst themselves, which may require coordination with the patient or proper registration algorithms.

%So, the models in option (a) either share similar features, leading to inefficiency, or learn features tuned to particular sequences, leading to less generality.

\subsection{Network architecture}
Our multi-sequence simultaneous recovery approach is shown in Fig.~\ref{fig:MSR}. The approach is based on the Residual Dense Block (RDB) \cite{DBLP:journals/corr/abs-1802-08797}, which incorporates the ideas of residual learning and dense blocks \cite{DBLP:journals/corr/HuangLW16a}, allowing all layers of features to be seen directly by other layers. %It has been shown that RDB helps achieve state-of-the-art performance in the domain of Super-Resolution. Since MR recovery aims at eliminating noise caused by undersampling, we believe that an RDB-based framework can be effective. 
During training, each raw k-space spectrum $F_s$ is first undersampled with a randomly generated mask $\mathcal{M}_{s}$. The results are then transformed from k-space to image space and concatenated before being sent to the recovery network, which outputs $I^{R}_{1:S}$. The loss function is defined as follows:
%\begin{align}
$\mathcal{L} = \lVert I^{R}_{1:S} - I_{1:S} \rVert_1$.
%\end{align}

%(possibly expand more on network, including residual, global feature fusion etc.)

\begin{figure}[t]
\centering
\includegraphics[width=1.0\textwidth]{multi_seq.png}
\caption{Multi-sequence recovery (MIMO) pipeline with the masks $\mathcal{M}_s$ randomly selected. The SISO pipeline is implemented similarly with a single sequence and a single output.}\label{fig:MSR}
\end{figure}


\section{Experiments}
\subsection{Datasets}
We employ two datasets. The first is a privately collected set of raw k-space data for three sequences ($T_1$, $T_2$, FLAIR) from 20 patients, with each sequence containing 18 slices. The sequences are co-registered and were acquired on an MRI machine with 8 channels; to augment training, we treat each channel as an individual image, resulting in a total of 2,880 three-sequence images, which are split in a ratio of 17:1:2 for training, validation, and testing. We refer to this dataset as ``real data''.
To further validate our approach, we also employ the Brain Tumor Image Segmentation (BraTS) dataset \cite{DBLP:journals/tmi/MenzeJBKFKBPSWL15,bakas2017advancing}, which contains $T_1$, $T_2$, and FLAIR sequences. The sequences are co-registered to the same anatomical template, skull-stripped, and interpolated to the same resolution. We split the selected 167 cases in a ratio of 140:10:17 for training, validation, and testing. From every case, we select the middle 60 slices, which contain most of the anatomical details. Because BraTS does not provide raw k-space data, we follow common practices \cite{DBLP:conf/miccai/XiangCCZLWS18,DBLP:journals/tmi/YangYDSDYLAKGF18} to simulate k-space data. We refer to this dataset as ``simulated data''. We implement the proposed approach using PyTorch and train all the models with Adam optimization, a momentum of 0.5 and a learning rate of 0.0001, until they reach convergence. {\it Below, our insights are first demonstrated with experiments on real data and are further validated on simulated data.}

%In addition to simulated k-space data, we also use a set of private data from 20 patients, each of which has $T_1$, $T_2$, and FLAIR sequences. The sequences are co-registered and not skull-stripped. There are 18 slices per sequence for every patient. 

\subsection{Acquisition time and undersampling settings} \label{undersample assumptions}
%Optimal MR sampling strategy is, as previously mentioned, hard to find in general. The variables include finding reasonable sampling patterns, as well as knowing the different acquisition time $t_s$ of each sequence. 
In general, $T_2$ and FLAIR have a longer repetition time (TR) than $T_1$; however, the acquisition time of each sequence also depends on the number of excitations. A larger number of excitations helps better resolve sequences but takes a longer time. Therefore, the acquisition time of each sequence is rather machine-dependent. Here we consider three experimental settings: $t_{T_1}$:$t_{T_2}$:$t_{\mathrm{FLAIR}}$ = (1) 1:1:1, (2) 1:4:6, and (3) 2:3:6. %Note that these two settings are of experimental purpose only; however we believe that the findings derived from them are applicable to other settings too.

We experiment with both low-pass sampling \cite{DBLP:conf/miccai/XiangCCZLWS18} and random sampling \cite{DBLP:journals/tmi/YangYDSDYLAKGF18}. We found that random sampling works better on real data but worse on simulated data. As our approach is agnostic to the sampling pattern, we choose the better-performing sampling pattern for each dataset.
% We also fix our sampling pattern as low-pass sampling, where we always sample the middle, lower frequency k-space data first, determined by an undersampling factor $\lambda_s = \frac{N}{|\mathcal{M}_s|}$. While this aligns with the case in, we realize that there exists other sampling patterns. However, this is not the main focus of the paper and will be investigated in further research.
During BRM training, the masks $\mathcal{M}_{1:S}$ are generated from undersampling factors $\lambda_s = \frac{N}{|\mathcal{M}_s|}$ drawn at random from $[1, k]$, where $k$ is the maximum undersampling factor (we set $k=8$). This means that the BRM, after training, can handle a continuous range of undersampling factors on every sequence.
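
For concreteness, one possible way to build a low-pass mask from an undersampling factor $\lambda_s = N/|\mathcal{M}_s|$ is sketched below. This construction is an assumption made for illustration (zero-based line index $n$, k-space center at $n=N/2$, and $N/(2\lambda_s)$ an integer); the mask generation actually used in the experiments may differ:
\begin{equation*}
\mathcal{M}_s[n] =
\begin{cases}
1, & \frac{N}{2} - \frac{N}{2\lambda_s} \le n < \frac{N}{2} + \frac{N}{2\lambda_s}, \\
0, & \text{otherwise},
\end{cases}
\qquad n = 0, \ldots, N-1.
\end{equation*}
For example, with $N=256$, choosing $\lambda_s=4$ keeps the $64$ central lines $n=96,\ldots,159$, while $\lambda_s=8$ keeps the $32$ central lines $n=112,\ldots,143$.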

\subsection{Evaluation metrics}
We utilize two metrics to gauge image quality: PSNR (peak signal-to-noise ratio) and SSIM (structural similarity). Since we mainly focus on three sequences, these metrics are calculated on three-sequence outputs in the same way as on RGB images; the calculation extends directly to a larger number of sequences. Since MR images do not have a fixed dynamic range, PSNR values should be interpreted in terms of relative improvement.% For example, a $T_2$ image tends to have a lower PSNR as it has the highest peak out of all three sequences.
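
For reference, the standard PSNR definition applied to an $S$-channel output can be written as follows. The image height $H$, width $W$, pixel index $p$, and peak value $I_{max}$ are notation introduced here for illustration; how $I_{max}$ is chosen (for example, as the maximum of the fully sampled image) is an assumption and is not specified above:
\begin{align*}
\mathrm{MSE} &= \frac{1}{SHW}\sum_{s=1}^{S}\sum_{p=1}^{HW}\big(I^{R}_{s}[p]-I_{s}[p]\big)^2, \\
\mathrm{PSNR} &= 10\log_{10}\frac{I_{max}^2}{\mathrm{MSE}}.
\end{align*}
Because $I_{max}$ can vary from scan to scan, the same absolute error may yield different PSNR values, which is one reason to read the reported numbers comparatively.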
 

\subsection{Main results}
\begin{figure}[t!]
\centering
\includegraphics[width=1\textwidth]{graph/ablation.png}
\caption{Quantitative recovery performance comparison. In the selected range, the Pearson correlation coefficient between Dedicated and MIMO is 0.85, whereas that between Dedicated and ZF is $-0.33$.}\label{fig:lambda_space}
\label{fig:ablation_1st}
\end{figure}
We evaluate the effectiveness of the BRM to show empirically that a properly trained network $f_{\theta}$ performs well regardless of the choice of $\mathcal{M}_{1:S}$ and serves as a good estimator of the best sampling strategy. Furthermore, we show that the MIMO BRM performs better than the SISO BRMs.
%Specifically, we want to show that 1. there is a strong correlation between the performances of dedicated model and BRM, and 2. MIMO BRM performs better than SISO BRM.

\input{lambda_space_graph/tune_table_new.tex}
\input{graph/graph_new.tex}
The study is done by training (i) a MIMO BRM, (ii) three SISO BRMs, one per sequence, and (iii) many models dedicated to specific sampling ratios. All the models follow the same structure shown in Fig.~\ref{fig:MSR}.
The proposed training scheme for continuous $\lambda_s \in [1,k]$ allows us to efficiently investigate the performance of different undersampling strategies. For each acquisition time setting $\{t_s\}_{s=1}^S$, we search through possible $\{\lambda_s\}^S_{s=1}$ on the following simplex:
%\begin{align}
$\sum_{s=1}^S \frac{t_s}{\lambda_s}= T_{max}$,
%\end{align}
which maximally utilizes the budgeted time $T_{max}$. We select hundreds of $\{\lambda_s\}^S_{s=1}$ under the 1:1:1 time setting, and set $T_{max} = \frac{T}{4}$, i.e., a $75\%$ reduction relative to the fully sampled acquisition time $T$. We run the trained models on the test set, and plot the reconstruction performance in Fig.~\ref{fig:ablation_1st}. The top three sampling strategies for the different acquisition time settings are shown in Table~\ref{tab:Comparison}.
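
As a worked illustration of this constraint (the points below are chosen for arithmetic convenience and are not the strategies reported in Table~\ref{tab:Comparison}), measure time in units of $N$ line acquisitions, so that the fully sampled time is $T=\sum_{s} t_s$, and order the factors as $(\lambda_{T_1},\lambda_{T_2},\lambda_{\mathrm{FLAIR}})$. Under the 1:1:1 setting, $T=3$ and $T_{max}=\frac{3}{4}$, and both $(4,4,4)$ and $(8,2,8)$ satisfy the constraint:
\begin{equation*}
\tfrac{1}{4}+\tfrac{1}{4}+\tfrac{1}{4}=\tfrac{3}{4}, \qquad \tfrac{1}{8}+\tfrac{1}{2}+\tfrac{1}{8}=\tfrac{3}{4}.
\end{equation*}
Under the 1:4:6 setting, assuming the same $75\%$ reduction, $T=11$ and $T_{max}=\frac{11}{4}$. Uniform undersampling $(4,4,4)$ is again feasible, and so is $(1,4,8)$, which fully samples the fastest sequence:
\begin{equation*}
\tfrac{1}{4}+\tfrac{4}{4}+\tfrac{6}{4}=\tfrac{11}{4}, \qquad \tfrac{1}{1}+\tfrac{4}{4}+\tfrac{6}{8}=\tfrac{11}{4}.
\end{equation*}
All of these points respect $\lambda_s\in[1,k]$ with $k=8$, which shows how unequal per-line times enlarge the set of allocations that spend the same total budget.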

Fig.~\ref{fig:ablation_1st} shows a clear performance gap between MIMO and SISO. Overall, the reconstruction performance of ZF images is positively correlated with the performance of the BRMs; however, the correlation is not consistent, and two sets of ZF images that are similar in PSNR can differ by more than 1\,dB after the images are processed by the BRM. To limit the number of dedicated models we need to train, we select a range of sampling factors for which ZF performance does not correlate well with MIMO/SISO performance, and train 30 dedicated models to see how well the BRM predicts the performance of dedicated models.
As we observe from the right panel of Fig.~\ref{fig:ablation_1st}, our BRM, in both the MIMO and SISO settings, predicts the performance of dedicated models with a high correlation. We further choose the best three $\{\lambda_s\}^S_{s=1}$ and perform the last stage of fine-tuning according to \eqref{eq:step3}. A visual evaluation on real data is shown in Fig.~\ref{fig:vis}. For more visual results, please refer to the Supplemental Material section.

Based on the best-performing $\{\lambda_s\}^S_{s=1}$, we observe that among $T_1$, $T_2$, and FLAIR, the results are best when $T_2$ is sampled the most. We suggest that this makes intuitive sense, as $T_2$ images provide the best contrast of the three sequences, which can compensate for the details lost in the other images. The same observation can be made on the simulated data, where both $T_2$ and FLAIR show good contrast. When the time setting is non-uniform, our search for the best sampling strategy reflects the change: $T_1$ is sampled more as a result of its faster acquisition time, while $T_2$ is still sufficiently sampled.

\section{Conclusion}
In this work, we formulated multi-sequence MR recovery as a constrained optimization problem, and explored possible methods to solve such a problem. We proposed a CNN-based approach %that has been experimentally proven to be degradation-agnostic, 
and an optimization scheme that helps us find proper combinations of sampling strategy and recovery model without combinatorial complexity. We evaluated our approach on both private raw data and public simulated data, demonstrating that our method can quickly find a sampling strategy that yields superior reconstruction performance. We showed that our model outperforms single-sequence recovery methods in terms of recovery quality as well as time and space complexity. We believe that our method, in combination with guidance from radiologists, can help reduce the acquisition time for multi-sequence scenarios.

% \newpage

\bibliography{peng20}


% \appendix

% \newpage
% \input{graph/graph_part.tex}
% \input{graph/graph_extra_1.tex}
% \input{graph/graph_extra_2.tex}
\end{document}
