\begin{figure}[!t]
    \centering
    \includegraphics[width=0.8\linewidth]{midl25_223_figures/block_diagram_morphler.jpeg}
    \vspace{-3mm}
    \caption{\small Proposed Architecture for \model}
    \label{architecture}
    \vspace{-6mm}
\end{figure}
\vspace{-7mm}
\section{Methods}
\input{midl25_223_sections/midl25_223_background}
\raggedbottom
\noindent \textbf{\model:}~The proposed model, \model, illustrated in Figure~\ref{architecture}, combines a primary network that performs the registration task with a secondary network that acts as a regularizer, constraining the solution space of the primary task to ensure anatomically meaningful transformations.

\vspace{0.05in}
\noindent \textit{Primary Registration Network}~can be any registration module designed to produce deformation fields. Given a pair of images \(A\) and \(B\), the primary network learns two displacement fields, \(\bphi_{AB}\) and \(\bphi_{BA}\), where \(\bphi_{AB}\) is the warp that should ideally match image \(A\) to image \(B\), and \(\bphi_{BA}\) is the inverse transformation from \(B\) to \(A\). The network produces both displacement fields simultaneously to enable inverse consistency regularization. Each displacement field and its source image are passed through a spatial transform unit to produce the registered images. The primary network incorporates inverse consistency regularization to ensure reliable bi-directional mappings: \(\bphi_{AB} \circ \bphi_{BA} = \mathbf{id}\) and \(\bphi_{BA} \circ \bphi_{AB} = \mathbf{id}\). This constraint encourages the network to learn transformations that are as close as possible to being inverses of each other, improving the overall accuracy of the registration process.
The loss function for the primary network includes (a) a similarity loss, \(\mathcal{L}_{sim}=SIM(A,\bphi_{AB}\circ B)+SIM(B,\bphi_{BA}\circ A)\), and (b) an inverse consistency loss, \(\mathcal{L}_{reg}=\|\bphi_{AB}\circ \bphi_{BA}-\mathbf{id}\|^2+\|\bphi_{BA}\circ \bphi_{AB}-\mathbf{id}\|^2\). The total loss is a weighted sum of these components, where \(\lambda_{sim}\) and \(\lambda_{reg}\) are the weights of the respective terms:
\begin{equation} \label{reg_loss}
\mathcal{L}_{P}=\lambda_{sim} \mathcal{L}_{sim}+\lambda_{reg} \mathcal{L}_{reg}
\end{equation}
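To see what the inverse consistency term penalizes, assume (for illustration only) that each transformation is parameterized by a displacement field, \(\bphi(x) = x + \mathbf{u}(x)\), and that \(\circ\) denotes function composition. Then
\[
(\bphi_{AB} \circ \bphi_{BA})(x) - x = \mathbf{u}_{BA}(x) + \mathbf{u}_{AB}\bigl(x + \mathbf{u}_{BA}(x)\bigr),
\]
so the term vanishes exactly when the forward displacement, evaluated at the backward-warped location, cancels the backward displacement.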

\vspace{0.05in}
\noindent \textit{Secondary Population-based Regularization Network}~uses the Log-Euclidean Diffeomorphic Autoencoder (LEDA) \cite{iyer2024leda}, which is rooted in the Log-Euclidean statistics framework, as the population-based regularizer. LEDA predicts \(N\) successive square roots of the deformation field, enabling accurate and computationally efficient approximation of the logarithm using equation~\ref{nl-iss}. The encoder-decoder architecture is defined as
\(
f_{\gamma}(\bphi) = \z, \quad g_{\theta}\left(\frac{\z}{m}\right) = \bphi^{1/m}, \) where \(m = 2^n, \, n \in \{0, 1, \dots, N\}\).
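For instance, with \(N = 2\) the decoder is evaluated at \(m \in \{1, 2, 4\}\), so a single latent code yields the field and two of its roots:
\[
g_{\theta}(\z) = \bphi, \qquad g_{\theta}\left(\frac{\z}{2}\right) = \bphi^{1/2}, \qquad g_{\theta}\left(\frac{\z}{4}\right) = \bphi^{1/4}.
\]
Scaling the latent code by \(1/m\) mirrors the Log-Euclidean identity \(\log(\bphi^{1/m}) = \frac{1}{m}\log(\bphi)\). Reading \(\z\) as a learned surrogate for \(\log(\bphi)\) is an interpretation offered here for intuition, not a statement of the LEDA formulation.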
LEDA uses the following loss terms:

\vspace{0.05in}
\noindent \textit{Reconstruction Loss:} The deformation field must be accurately reconstructed from its predicted roots. Let \(\widehat{\bphi}^{-m}\) denote the predicted \(m\)-th root at stage \(n\), where \(m = 2^n\). Composing this root with itself \(m\) times must recover the original deformation field, which gives the reconstruction loss
\begin{equation}
    \mathcal{L}_{rec} = \sum_k \sum_n \left( \left\|C_m(\widehat{\bphi}_{AB}^{-m}) - \bphi_{AB}\right\|^2 + \left\|C_m(\widehat{\bphi}_{BA}^{-m}) - \bphi_{BA}\right\|^2 \right), \text{ where } C_m(\widehat{\bphi}^{-m}) = \underbrace{\widehat{\bphi}^{-m} \circ \dots \circ \widehat{\bphi}^{-m}}_{m \text{ times}} \approx \bphi.
\end{equation}
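The \(m\)-fold composition need not be evaluated naively. As a sketch (the evaluation scheme for \(C_m\) is not specified above, so this is an assumption), since \(m = 2^n\) and composition is associative, \(C_m\) can be obtained with \(n\) successive self-compositions. For \(m = 4\), writing \(\psi = \widehat{\bphi}^{-4}\),
\[
\psi_2 = \psi \circ \psi, \qquad C_4(\psi) = \psi_2 \circ \psi_2 = \psi \circ \psi \circ \psi \circ \psi,
\]
which uses \(n = 2\) compositions instead of \(m - 1 = 3\).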

\vspace{0.05in}
\noindent \textit{Inverse Consistency Losses:} Although the primary network already enforces inverse consistency, we also require the successive estimated roots of the forward and inverse deformation fields to compose to the identity. This is enforced through an inverse consistency loss at each root approximation stage. In addition, the latent representations of the forward and inverse transformations should be consistent, with equal magnitude but opposite direction: \(\z_{AB} = -\z_{BA}\). This is enforced through a latent inverse consistency loss that combines the cosine similarity \(\cos(\Theta_k)\) between \(\z_{AB}\) and \(\z_{BA}\) with a magnitude constraint:
\begin{equation}
\mathcal{L}_{inv} = \sum_k \sum_n \left\| \widehat{\bphi}_{AB}^{-m} \circ \widehat{\bphi}_{BA}^{-m} - \mathbf{id} \right\|^2 \quad \text{and} \quad \mathcal{L}_{linv} = \sum_k \left(\frac{1 + \cos(\Theta_k)}{2} + \left\|\z_{AB} + \z_{BA}\right\|^2\right).
\end{equation}
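A quick check shows why both terms of \(\mathcal{L}_{linv}\) are useful, taking \(\Theta_k\) to be the angle between \(\z_{AB}\) and \(\z_{BA}\) for a single pair. Suppose \(\z_{BA} = -c\,\z_{AB}\) with \(c > 0\) and \(\z_{AB} \neq 0\). The vectors are antiparallel, so \(\cos(\Theta_k) = -1\) and
\[
\frac{1 + \cos(\Theta_k)}{2} = 0, \qquad \left\|\z_{AB} + \z_{BA}\right\|^2 = (1 - c)^2 \left\|\z_{AB}\right\|^2 .
\]
The cosine term alone cannot detect the magnitude mismatch, and the summand vanishes only at \(c = 1\), that is, when \(\z_{AB} = -\z_{BA}\). Conversely, if \(\z_{BA} = \z_{AB}\), then \(\cos(\Theta_k) = 1\) and the summand equals \(1 + 4\left\|\z_{AB}\right\|^2\).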
The total loss function for LEDA is
\begin{equation} \label{leda_loss}
\mathcal{L}_{S} = \alpha_{rec}\mathcal{L}_{rec} + \alpha_{inv}\mathcal{L}_{inv} + \alpha_{linv}\mathcal{L}_{linv}.
\end{equation}
The total loss function for training the proposed model is \(\mathcal{L}_{total} = \mathcal{L}_{P} + \lambda_1\mathcal{L}_{S}\). Training of \model~follows a two-phase strategy. In the first phase, we set \(\lambda_1 = 0\) (no LEDA regularizer), so that the primary network learns the forward and backward diffeomorphic mappings. In the second phase, we activate the LEDA regularizer (\(\lambda_1>1\)) and train the primary and secondary networks simultaneously. This ordering ensures that the primary network learns diffeomorphic transformations before population-based regularization is introduced, leading to more accurate and anatomically consistent results.
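
The two-phase schedule can be written compactly as follows, where the switch-over iteration \(T_0\) and the fixed phase-two weight \(\lambda_1^{\ast}\) are hypothetical symbols introduced only for illustration and are not specified above:
\[
\mathcal{L}_{total}^{(t)} =
\begin{cases}
\mathcal{L}_{P}, & t \le T_0, \\
\mathcal{L}_{P} + \lambda_1^{\ast}\,\mathcal{L}_{S}, & t > T_0,
\end{cases}
\]
with \(t\) the training iteration. During the first phase, no gradient from \(\mathcal{L}_{S}\) reaches the primary network.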






