\section{Introduction}
\label{sec:introduction}
% The most common treatment for infertility is in vitro fertilization (IVF). Due to the inherent risks of multiple pregnancies~\cite{AndreaFanelli2012ExtractionOF}, it is critical to select a high quality embryo for single-embryo transfer~\cite{ParvanehSaeedi2017AutomaticIO}, in order to produce a healthy baby. Typically, embryo transfer (ET) includes \eat{day-3 (D3)}cleavage ET and \eat{day-5 (D5)}blastocyst ET, corresponding to the cleavage stage ET and blastocyst stage ET, respectively. Studies have shown that blastocyst stage ET is of great help to improve the implantation rate~\cite{EvangelosGPapanikolaou2005LiveBR}. 
% % 
% Thus, in clinical practice, embryologists often manually analyze multiple embryo images at the blastocyst stage to select blastocysts with high implantation rates.
% However, this process is cumbersome and leads to high variance~\cite{LindaSundvall2013InterAI,AshleighStorr2017InterobserverAI}. To help embryologists effectively evaluate blastocyst quality and accurately predict implantation outcomes, it is highly desirable to develop automatic computer-aided methods for analyzing embryo images.
% However, the implantation rate of a blastocyst is highly correlated with other factors such as the mother's physical condition and the standardization of the implantation procedure. Therefore, predicting the success of implantation based only on blastocyst images is an ill-posed problem and poses significant challenges.

In vitro fertilization (IVF) is the most prevalent treatment for infertility. 
Due to the inherent risks of multiple pregnancies~\cite{AndreaFanelli2012ExtractionOF}, it is critical to select a high-quality embryo for single-embryo transfer in order to produce a healthy baby.
% Selecting a high-quality embryo for single-embryo transfer is crucial to ensure a healthy offspring, given the risks associated with multiple pregnancies~\cite{AndreaFanelli2012ExtractionOF}. 
Typically, embryo transfer (ET) is performed at either the cleavage stage or the blastocyst stage.
% Current study~\cite{EvangelosGPapanikolaou2005LiveBR} indicates that blastocyst stage ET significantly enhances implantation rates. 
Prior work~\cite{EvangelosGPapanikolaou2005LiveBR} has shown that blastocyst-stage ET significantly improves implantation rates.
Thus, in clinical practice, embryologists often manually analyze multiple blastocyst-stage embryo images to identify the embryos with the highest likelihood of successful implantation. However, this manual analysis is laborious and subject to considerable inter-observer variability~\cite{LindaSundvall2013InterAI,AshleighStorr2017InterobserverAI}. To help embryologists evaluate blastocyst quality effectively and predict implantation outcomes accurately, it is highly desirable to develop automatic computer-aided methods for analyzing embryo images.
% Thus, there is a pressing need for automated computer-aided methods to assess blastocyst quality and predict implantation outcomes more accurately. Nonetheless, the implantation success of a blastocyst is also influenced by factors beyond the embryo's characteristics, such as the mother's health and the standardization of the implantation process. Therefore, relying solely on blastocyst images for predicting implantation success presents a complex challenge.

\begin{figure}[t!]
    \centering
    % \hspace{-3.75ex}
    \subfigure[positive example]{
         \centering
         \includegraphics[width=0.4\textwidth]{IMG/Positive.pdf}
         \label{fig:P}
    }
    \subfigure[negative example]{
         \centering
         \includegraphics[width=0.4\textwidth]{IMG/Negative.pdf}
         \label{fig:N}
    }
    \vspace{-2ex}
    \caption{Examples of microscopic images of blastocysts at different focal planes.}
    \label{fig:example}
    \vspace{-2ex}
\end{figure}
% \begin{figure}[t!]
%     \centering
%     \includegraphics[width=0.95\linewidth]{IMG/sample}
%     \caption{Examples of microscopic images at different focal planes for a blastocyst. The images in the first row are for a blastocyst that has been successfully implanted, while the images in the second row are for a blastocyst that has failed.}
%     \label{fig:example}
% \end{figure}

% \begin{figure}
%     \centering
%     \subfigure[Positive: stage]{
%     \includegraphics[width=0.13\linewidth]{IMG/a1.png}
%     \label{fig:example:a1}
%     }
%     \subfigure[Positive: ICM]{
%     \includegraphics[width=0.13\linewidth]{IMG/a2.png}
%     \label{fig:example:a2}
%     }
%     \subfigure[Positive: TE]{
%     \includegraphics[width=0.13\linewidth]{IMG/a3.png}
%     \label{fig:example:a3}
%     }
%     \subfigure[Negative: stage]{
%     \includegraphics[width=0.13\linewidth]{IMG/b1.png}
%     \label{fig:example:b1}
%     }
%     \subfigure[Negeative: ICM]{
%     \includegraphics[width=0.13\linewidth]{IMG/b2.png}
%     \label{fig:example:b2}
%     }
%     \subfigure[Negeative: TE]{
%     \includegraphics[width=0.13\linewidth]{IMG/b3.png}
%     \label{fig:example:b3}
%     }
%     \caption{Examples of microscopic images at different focal planes for a blastocyst. The images in the first row are for a blastocyst that has been successfully implanted, while the images in the second row are for a blastocyst that has failed.}
%     \label{fig:example}
% \end{figure}
 % The positive blastocyst is graded as 5AB and the Negative one is graded as 3BC

Recent research in computer-aided diagnosis (CAD) for embryo analysis mainly focuses on three key tasks: stage classification~\cite{Aisha2016,Stanislav2021,Lisette2021}, blastocyst segmentation~\cite{YousufHarun2019ImageSO,Reza2020}, and blastocyst grading~\cite{PegahKhosravi2019DeepLE}. While stage classification and blastocyst segmentation are crucial preliminary steps in embryo analysis, they do not directly predict implantation outcomes. Current blastocyst grading methods~\cite{PegahKhosravi2019DeepLE} evaluate implantation rates by categorizing a single microscopic image into various grades. However, a single image cannot accurately represent the three-dimensional nature of an embryo, particularly its inner cell mass (ICM) and trophectoderm (TE).
Clinically, embryologists evaluate the stage, ICM, and TE of a blastocyst independently to derive a comprehensive score indicative of its transfer potential. The stage reflects how far the blastocyst has developed, which is commonly judged by the size of its cavity and whether it has broken through the zona pellucida (ZP). The ICM is a mass of cells inside the blastocyst, and the TE is the outer layer of cells that wraps the ICM. As depicted in Fig.~\ref{fig:example}, `stage' images show whether the blastocyst has broken through the ZP, while `ICM' and `TE' images focus on the corresponding regions of the blastocyst. However, these features cannot all be captured clearly in a single image. Therefore, developing an image-fusion technique for accurate prediction of blastocyst implantation outcomes is imperative.

% In the clinical diagnosis process, embryologists grade the situations of stage, ICM, and TE individually to obtain an overall score to determine whether the blastocyst has transfer potential (i.e., the success of implantation after ET). Stage refers to the current developmental stage of the blastocyst, which is commonly determined by the size of the cavity in the middle of the blastocyst and whether it breaks through the zona pellucida (ZP) (as shown in Fig.~\ref{fig:example:a1}). ICM is a mass of cells inside the blastocyst (e.g., see the focused parts in Fig.~\ref{fig:example:a2} and Fig.~\ref{fig:example:b2}). TE is the outer layer of cells that wraps the ICM, providing nutrients to the blastocyst. Overall, as shown in Fig.~\ref{fig:example}, `stage' images present whether the blastocyst breaks through the zona pellucida, while `ICM' and `TE' images focus on the inner cell mass and trophectoderm at the blastocyst edge, but any two of them cannot be captured clearly in a single image. Thus, it is important to attain an image-fusion based method for blastocyst implantation outcome prediction.


Currently, joint analysis of multiple focal-plane (FP) images of embryos is still in its infancy. Zeman et al.~\cite{AstridZeman2021DeepLF} chose three FP-images and concatenated them directly to predict embryo quality, treating the three FP-images as equally important. However, different FP-images contain different embryonic information, and weighting them equally makes it difficult to fully exploit the features captured by each focal plane.
% Known multi-modal fusion methods can be classified into early-~\cite{AstridZeman2021DeepLF}, mid-~\cite{Arsha2021}, late-~\cite{SuPang2020CLOCsCO}, and hybrid-~\cite{TaoZhou2020HiNetHN} fusion types, but none of these methods considered the specific information or key information of each modality. 
Worse, known multi-modal fusion methods, whether of the early-, mid-, late-, or hybrid-fusion type~\cite{AstridZeman2021DeepLF,Arsha2021,SuPang2020CLOCsCO,TaoZhou2020HiNetHN}, neglect to extract the specific or key information of each modality (e.g., the ICM area in Fig.~\ref{fig:N}), which may correlate strongly with the final result.
Moreover, most known fusion methods handle only two modalities, which are relatively easy to fuse.
% Compared with the above methods, our triple-FP-image task is more challenging, and datasets of more modalities are quite common in the medical imaging field. Therefore, it is important to develop fusion methods with more modalities.
% Our implantation outcome prediction of blastocyst is based on triple FP-image analysis, and thus is more challenging. Hence, we need to develop fusion methods with more modalities.
In contrast, predicting blastocyst implantation outcomes requires analyzing three FP-images, each carrying different key information, which calls for more effective multi-modal fusion techniques.

% The field of analyzing multiple focal-plane (FP) images of embryos is still emerging. Zeman et al.~\cite{AstridZeman2021DeepLF} concatenates three FP images to assess embryo quality, while treating each image with equal importance, overlooks the distinct information each FP image offers. This equality in treatment potentially hinders the full utilization of the diverse features captured by various focal planes. Furthermore, current multi-modal fusion techniques~\cite{AstridZeman2021DeepLF,Arsha2021,SuPang2020CLOCsCO,TaoZhou2020HiNetHN} typically neglect extraction of crucial information from each modality and are limited to two modalities. However, the challenge in predicting blastocyst implantation outcomes involves the analysis of three FP images with different key information, necessitating the development of more effective multi-modal fusion techniques.

% To address the above challenges, 
To this end, we propose a novel Multiple Focal-plane Image Fusion Network (\model), which takes three FP-images of a blastocyst as input and predicts the implantation outcome. Specifically, \model consists of two sub-networks: the Core Image Generator (CI-Gen) and the Key Feature Fusion Network (KFFNet). In CI-Gen, since the three FP-images focus on different positions, we first fuse them by pixel-wise weighting to generate a `clear' {\it core image}. However, because the three FP-images overlap, some information is lost during core image generation. Therefore, in KFFNet, to further utilize the key information in each FP-image, we propose a Fusion Layer that captures the key features of each focal plane with a Fusion Module and fuses them with the core image features. Within the Fusion Module, we apply spatial-channel separated Squeeze Multi-Headed Attention (SMHA) blocks for efficient information exchange and feature enhancement. In summary, we fuse the features of the three focal-plane images at each stage through the core image and the Fusion Module, effectively reducing redundancy and better integrating essential information.
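
As a rough illustration of pixel-wise weighting (a sketch under assumed notation, not the exact formulation used in CI-Gen), let $I_k$ denote the $k$-th FP-image and $W_k$ a per-pixel weight map. The symbols $I_k$, $W_k$, and $C$, as well as the non-negativity and sum-to-one constraints, are assumptions introduced here for exposition only. A core image $C$ could then take the form
\[
C(x,y) = \sum_{k=1}^{3} W_k(x,y)\, I_k(x,y), \qquad W_k(x,y) \ge 0, \qquad \sum_{k=1}^{3} W_k(x,y) = 1,
\]
so that each pixel of $C$ is a convex combination of the three focal planes, dominated by the plane that receives the largest weight at that location.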
% Finally, we constructed a dataset from our partner hospital including the annotated blastocyst data (each blastocyst contains 3 FP-images) to validate our method. 
%We conduct extensive experiments on our dataset to empirically validate the superior performance of our \model compared to state-of-the-art methods across various metrics. 

% The main contributions are summarised as follows.
% \begin{itemize}
%     %We investigate implantation outcome prediction of blastocyst from the multiple FP-image fusion perspectives, which is under explored in previous work. 
%    \item We propose a novel Multiple Focal-plane Image Fusion Network for implantation outcome prediction of blastocyst. This network uniquely integrates key information from the multiple FP-image fusion perspective, which is under-explored in prior work.
%    % at various stages, utilizing the Core Image Generator for early-stage fusion, and the Key Feature Fusion Network for mid and late stages.
%    %We propose a novel Multiple Focal-plane Image Fusion Network, performing fusion on multiple focal-plane images at early, middle, and late stages, to achieve better prediction performance.
%    \item We design a new plug-and-play feature interaction block tailored for facilitating information exchange and mitigating computational intensity in attention mechanisms, to address the limitation of current methods in failing to extract key information from various locations in FP images.
%    \item We conduct extensive experiments to demonstrate the superior performance of our \model over state-of-the-art methods in various metrics, and validate the rationality of each component in \model through sufficient ablation studies.
% \end{itemize}

\textbf{Contributions.} 1) We propose a novel Multiple Focal-plane Image Fusion Network for blastocyst implantation outcome prediction. This network integrates key information across multiple FP-images, a fusion perspective that is under-explored in prior work. 2) We design a new plug-and-play feature interaction block that facilitates information exchange while reducing the computational cost of attention mechanisms, addressing the failure of current methods to extract key information from various locations in FP-images. 3) We conduct extensive experiments to demonstrate the superior performance of our \model over state-of-the-art methods on various metrics, and validate the contribution of each component in \model through ablation studies.


% in the blastocyst implantation prediction task. 

% In the rest of the paper, we will first introduce our method and internal details, then describe our experimental details and results, and finally we will summarize our method.



% \begin{figure*}[t]
%     \centering
%     \includegraphics[width=12cm]{IMG/overview_0816}
%     \caption{An overview of our proposed \model. \textcircled{\small{s}} denotes a channel-reduced convolutional layer or an average pooling layer. Entities in the legend of this figure also denote the same meanings in the subsequent figures.}
%     \label{model}
% \end{figure*}