
\section{Implementation details}
\label{sec:appendixC}
\subsection{Initial segmentation}
The initial segmentation of \textit{g-TAS} is constructed step by step as follows.
Raw images are first normalised by histogram equalisation: pixel values are redistributed evenly over 256 bins using the cumulative distribution of the image histogram \cite{kim1997contrast, pizer1987adaptive}. To enhance the contrast between the brightest spots, cell bodies, and background, the logarithm is applied before the histogram equalisation. This ties areas near cell bodies more closely to them and helps determine more stable boundaries, since the effect of isolated maximum values is minimised. Next, the image values are rescaled to the range $[0, 1]$ and then standardised to zero mean and unit variance. This pre-processing is applied identically to all images of every dataset, and the result is fed into the U-Net to produce cell body predictions.
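
The pre-processing chain can be summarised as a sequence of pixel-wise maps. The following is only a sketch: the symbols $I$, $L$, $E$, $R$, $Z$, $\Omega$ and $b(\cdot)$ are notation introduced here for illustration, and the offset of $1$ inside the logarithm is an assumption, since the handling of zero-valued pixels is not specified above. Let $I$ be the raw image on the pixel domain $\Omega$ and let $b(v) \in \{0, \dots, 255\}$ denote the histogram bin of a value $v$. Then
\[
L(x) = \log\bigl(1 + I(x)\bigr), \qquad
E(x) = \frac{1}{|\Omega|}\,\bigl|\{\, y \in \Omega : b(L(y)) \le b(L(x)) \,\}\bigr|,
\]
\[
R(x) = \frac{E(x) - \min_{y} E(y)}{\max_{y} E(y) - \min_{y} E(y)}, \qquad
Z(x) = \frac{R(x) - \mu_R}{\sigma_R},
\]
where $E$ is the equalised image obtained from the cumulative distribution of the histogram, $R$ is its rescaling to $[0, 1]$, and $\mu_R$ and $\sigma_R$ are the mean and standard deviation of $R$ over $\Omega$. Under these assumptions, $Z$ is the input passed to the U-Net.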

Thereafter, the Euclidean distance from the background is used as a successive threshold metric. The distance is normalised such that $0$ indicates background and $1$ the area farthest from it, and regions where the distance exceeds $0.5$ are acquired first. These regions are then compared with the larger regions obtained with a cutoff of $0.1$. If there exists a region that was not identified by the stricter cutoff of $0.5$, the new region is considered to be a cell centroid. This is repeated for a cutoff of $0$, which coincides with the exact U-Net prediction. The last step is necessary to detect small cells appearing in the frame, which the cutoff values $0.1$ and $0.5$ would miss. An additional condition for a cell centroid is that it is larger than one third of the mean cell size detected at each step; this condition is necessary to reject small groups of outlier pixels. The final cell centroid regions are used as seeds for the random walker approach, which defines the boundary of each individual cell contained in the initial cell body prediction of the U-Net.
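
The successive thresholding can also be stated in set notation. This is a sketch under stated assumptions rather than an exact specification: the symbols $D$, $\mathcal{C}_{\tau}$, $\mathcal{S}_k$ and $\bar{a}_k$ are introduced here for illustration, and the interpretation of ``not identified by the stricter cutoff'' as ``disjoint from all previously accepted regions'' is an assumption. Let $D(x) \in [0, 1]$ be the normalised Euclidean distance of pixel $x$ to the background, and let $\mathcal{C}_{\tau}$ be the set of connected regions of $\{x : D(x) > \tau\}$. With the cutoffs $\tau_1 = 0.5$, $\tau_2 = 0.1$ and $\tau_3 = 0$,
\[
\mathcal{S}_1 = \mathcal{C}_{\tau_1}, \qquad
\mathcal{S}_k = \mathcal{S}_{k-1} \cup \bigl\{\, r \in \mathcal{C}_{\tau_k} : r \cap s = \emptyset \ \ \forall\, s \in \mathcal{S}_{k-1} \,\bigr\}, \quad k = 2, 3,
\]
where a region $r$ is only admitted at step $k$ if its size satisfies
\[
|r| > \tfrac{1}{3}\, \bar{a}_k ,
\]
with $\bar{a}_k$ the mean region size detected at step $k$. The regions in $\mathcal{S}_3$ then serve as the seeds of the random walker. Note that $\{x : D(x) > 0\}$ is exactly the foreground of the U-Net prediction, which is why the last cutoff recovers even the smallest cells.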

% Modified for R4 U-Net output
\subsection{U-Net prediction (post-)processing for TAS variants}
For the \textit{g-TAS} and \textit{i-TAS} variants, the output of the U-Net is processed identically for all datasets. First, centroids are constructed using the Euclidean distance of every pixel to the background. Centroids are initially defined as the regions where the Euclidean distance is above a threshold value of 0.5, where 0 indicates background and 1 the farthest foreground. This produces small centroid regions and, where cells are too small, no centroids at all. To remedy this, a threshold value of 0.1 is used to define bigger centroid regions, which are combined with the smaller ones. If centroids from the two sets overlap, the smaller centroids are preferred in order to split up cells that have collided. This process is repeated once more for a threshold value of 0, to detect even the smallest cells exiting or entering the frame. To remove unwanted artefacts from the centroid set, regions smaller than one third of the average centroid region size are removed. This results in distinct centroids for each cell in the frame. To define the boundary between lightly collided cells, the random walker algorithm is used, with the calculated centroid regions as seeds. The random walker produces the individual cells in a frame, and the frame is then fed into the Siamese tracker for re-segmentation of heavily convoluted cells and cell behaviour modelling.
For the \textit{s-TAS} variant, the same (post-)processing is applied to the U-Net output as in the work of \citet{lux2019dic}. Namely, cell centroids are constructed using manually defined ellipsoid kernels, pixel value thresholds, and gap filling in the cell bodies. Every dataset is processed with a different set of parameter values tailored to its cell type. The watershed algorithm is then run on the original U-Net prediction and the constructed centroids to produce the individual cells in the frame. This frame is then fed to the Siamese tracker for re-segmentation.

% Modified for R4 data augmentation
\subsection{Data augmentation}
Following \citet{ulman2017objective}, we add (1) random additive noise, (2) pixel value range shifts, and (3) maximum value cutoffs. Specifically, for each image we create 15 augmented images, 5 per augmentation technique: (1) We add random Gaussian noise $N(0, \mathit{std})$ to the input image, with five possible standard deviations $\mathit{std} \in \{0.1, 0.325, 0.55, 0.775, 1\}$. (2) We add an offset to the input image pixel values, which is a random number drawn (per image) uniformly from the range $[-1, 1]$. (3) We set the maximum or minimum value of an image to a cutoff value calculated as $\mathit{cutoff} = \mathit{max} \cdot \mathit{num}$, where $\mathit{max}$ is either the minimum or the maximum value of the image, chosen randomly, and $\mathit{num}$ is a random number drawn from a uniform distribution over $[0, 1]$. For example, an image with range $[-3, 3]$, $\mathit{max} = -3$ and $\mathit{num} = 0.5$ results in an image with range $[-1.5, 3]$, with every value smaller than $-1.5$ set to $-1.5$.
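
The three augmentations can be written pixel-wise as follows. This is a sketch of the description above and adds no further detail: the symbols $I$, $I'$, $\varepsilon$, $\sigma$, $o$, $c$, $m$ and $u$ are shorthand introduced here (with $c$, $m$ and $u$ standing for the cutoff, the randomly chosen extremum and the random factor, respectively), and drawing the noise independently per pixel is an assumption.
\[
\text{(1)}\quad I'(x) = I(x) + \varepsilon(x), \qquad \varepsilon(x) \sim N(0, \sigma), \quad \sigma \in \{0.1,\ 0.325,\ 0.55,\ 0.775,\ 1\},
\]
\[
\text{(2)}\quad I'(x) = I(x) + o, \qquad o \sim U(-1, 1),
\]
\[
\text{(3)}\quad c = m \cdot u, \quad u \sim U(0, 1), \qquad
I'(x) = \max\bigl(I(x),\, c\bigr) \ \ \text{if } m = \min_{y} I(y), \qquad
I'(x) = \min\bigl(I(x),\, c\bigr) \ \ \text{if } m = \max_{y} I(y).
\]
For an image with range $[-3, 3]$, choosing $m = -3$ and $u = 0.5$ gives $c = -1.5$ and $I'(x) = \max(I(x), -1.5) \in [-1.5, 3]$. By the same rule, choosing $m = 3$ and $u = 0.5$ would give $c = 1.5$ and $I'(x) = \min(I(x), 1.5) \in [-3, 1.5]$.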

% Modified for R1 Computational complexity
\subsection{Computational complexity}
Our end-to-end model, combining U-Net and SiamFC, comprises 25 convolutional layers. In comparison, \citet{lux2019dic} use a U-Net with 20 convolutional layers, and \citet{zhou2019joint} use two U-Nets with a total of 40 convolutional layers. In terms of the number of convolutional layers, our model thus sits between the two competing methods. Processing a sequence of 100 to 400 frames generally took around 5--10 minutes on a GeForce Titan X GPU.

\section{Additional Illustrations}
\label{sec:appendixB}

\begin{figure}[H]
\centering
\begin{tikzpicture}
% \node[inner sep=0pt] (russell) at (0,0){
%     \includegraphics[width=0.15\linewidth]{images/re_seg_1_1.png}\hfill
%     \includegraphics[width=0.15\linewidth]{images/re_seg_1_2_unet.png}\hfill
%     \includegraphics[width=0.15\linewidth]{images/re_seg_1_3_unet_watershed.png}\hfill
%     \includegraphics[width=0.15\linewidth]{images/re_seg_1_4_ours.png}\hfill
%     \includegraphics[width=0.15\linewidth]{images/re_seg_1_5_true.png}\hfill};
\node[inner sep=0pt] (russell) at (0,0){
    \includegraphics[width=0.15\linewidth]{images/re_seg_2_1.png}\hfill
    \includegraphics[width=0.15\linewidth]{images/re_seg_2_2_unet.png}\hfill
    \includegraphics[width=0.15\linewidth]{images/re_seg_2_3_unet_watershed.png}\hfill
    \includegraphics[width=0.15\linewidth]{images/re_seg_2_4_ours.png}\hfill
    \includegraphics[width=0.15\linewidth]{images/re_seg_2_5_true.png}\hfill};
\node[inner sep=0pt] (russell) at (0,-2){
    \includegraphics[width=0.15\linewidth]{images/re_seg_3_1.png}\hfill
    \includegraphics[width=0.15\linewidth]{images/re_seg_3_2_unet.png}\hfill
    \includegraphics[width=0.15\linewidth]{images/re_seg_3_3_unet_watershed.png}\hfill
    \includegraphics[width=0.15\linewidth]{images/re_seg_3_4_ours.png}\hfill
    \includegraphics[width=0.15\linewidth]{images/re_seg_3_5_true.png}};
    
\node at (-6.3,-0){\normalsize \textbf{t}};
\node at (-6.3,-2){\normalsize \textbf{t+1}};
% \node at (-6.3,-0){\normalsize \textbf{t-1}};
\node at (-4.6,1.5){\footnotesize \scalebox{0.9}[1.0]{\textbf{Actual Images}}};
\node at (-2.4,1.5){\footnotesize \scalebox{1}[1.0]{\textbf{U-Net}}};
\node at (0,1.45){\footnotesize \scalebox{0.85}[1.0]{\textbf{Watershed seg.}}};
\node at (2.3,1.5){\footnotesize \scalebox{1}[1.0]{\textbf{Ours}}};
\node at (4.6,1.5){\footnotesize \textbf{True Labels}};
\end{tikzpicture}
  \caption{Illustration of cell segmentation by (1) plain U-Net, (2) U-Net with watershed, (3) our \textit{s-TAS} method, and (4) the true labels.}
  \label{fig:re-segmentation}
\end{figure}


\begin{figure}[H]
    \begin{minipage}[b]{0.48\linewidth}
    \centering
    \begin{tikzpicture}
\node[inner sep=0pt] (img2) at (0,0){\includegraphics[width=0.4\linewidth]{images/mitosis_DIC-C2DH-HeLa_1_2.png}
    \includegraphics[width=0.4\linewidth]{images/mitosis_DIC-C2DH-HeLa_2_2.png}};
    \draw [color=red, very thick] (-2.28,-1.25) rectangle (-1.5, -0.5);
    \draw [color=red, very thick] (0.78,-1.25) rectangle (1.7, -0.5);
    \end{tikzpicture}
    % \vspace{-0.5em}
    \caption*{Mitotic event in DIC-C2DH-HeLa dataset}
  \end{minipage}\hfill
  \begin{minipage}[b]{0.48\linewidth}
    \centering
    \begin{tikzpicture}
\node[inner sep=0pt] (img3) at (0,0){    \includegraphics[width=0.4\linewidth]{images/collission_Fluo-N2DH-SIM_0_edited.png}
    \includegraphics[width=0.4\linewidth]{images/collission_Fluo-N2DH-SIM_1_edited.png}};
    \draw [color=red, very thick] (-2.95,1.25) rectangle (-0.5, 0.05);
    \draw [color=red, very thick] (0.3,1.25) rectangle (2.4, 0.05);
    \end{tikzpicture}
    % \vspace{-0.5em}
    \caption*{Collision event in Fluo-N2DH-SIM\plussmall~dataset}
  \end{minipage}
%   \vspace{-0.5em}
  \caption{Biological cell behaviours between consecutive frames.}
  \label{fig:mitosis_collission_example}
\end{figure}

% Modified for R4 Figure 1 grayscale
\begin{figure}[H]
    \centering
    \includegraphics[height=2.55cm]{images/gray_real_img_Fluo-N2DL-HeLa.png}\hfill
    \includegraphics[height=2.55cm]{images/gray_real_img_Fluo-N2DH-SIM+.png}\hfill
    \includegraphics[height=2.55cm]{images/gray_real_img_Fluo-N2DH-GOWT1.png}\hfill
    \includegraphics[height=2.55cm]{images/gray_real_img_PhC-C2DL-PSC.png}\hfill
    \includegraphics[height=2.55cm]{images/gray_real_img_DIC-C2DH-HeLa.png}
    \caption{Example images of the (a) Fluo-N2DL-HeLa, (b) Fluo-N2DH-SIM\plussmall, (c) Fluo-N2DH-GOWT1, (d) PhC-C2DL-PSC, and (e) DIC-C2DH-HeLa datasets, as in Figure~\ref{fig:real_imgs} but shown with a grayscale color map.}
    \label{fig:gray_real_imgs}
\end{figure}

\section{Additional Results}
\label{sec:appendixA}
\begin{table}[H] 
    \centering
    \begin{tabular}{p{4cm} | c c c c c |}
      Competition rank & \multicolumn{5}{|c|}{{DIC-C2DH-HeLa}}\\
       & DET & SEG & TRA & OP$_{CSB}$& OP$_{CTB}$\\\hline
     3rd entry &
     \textcolor{Apricot}{0.948} & \textcolor{Apricot}{0.820} & \textcolor{Apricot}{0.909} & \textcolor{Apricot}{0.884} & \textcolor{Apricot}{0.848} \\
     2nd entry & 
     \textcolor{CarnationPink}{0.979} & \textcolor{BrickRed}{0.807} & \textcolor{Goldenrod}{0.969} & \textcolor{Apricot}{0.887} & \textcolor{Apricot}{0.882} \\
     1st entry & 
     \textcolor{YellowOrange}{0.960} & \textcolor{Violet}{0.665} & \textcolor{YellowOrange}{0.950} & \textcolor{Violet}{0.808} & \textcolor{BrickRed}{0.804} \\
     \textit{s-TAS} & 0.958 & 0.852 & \textbf{0.955} & 0.905 & 0.904 \\\hline
     
     & \multicolumn{5}{|c|}{{Fluo-N2DH-SIM}\plussmall}\\
      & DET & SEG & TRA & OP$_{CSB}$& OP$_{CTB}$\\\hline
     3rd entry &
     \textcolor{Cyan}{0.956} & \textcolor{Cyan}{0.834} & \textcolor{Cyan}{0.954} & \textcolor{Cyan}{0.895} & \textcolor{Cyan}{0.894} \\
     2nd entry &
     \textcolor{black}{0.981} & \textcolor{GreenYellow}{0.813} & \textcolor{Cyan}{0.973} & \textcolor{GreenYellow}{0.890} & \textcolor{GreenYellow}{0.889} \\
     1st entry &
     \textcolor{black}{0.984} & \textcolor{BrickRed}{0.682} & \textcolor{Goldenrod}{0.957} & \textcolor{BrickRed}{0.809} & \textcolor{Violet}{0.804}\\
     \textit{s-TAS} & 0.972 & \textbf{0.822} & 0.971 & \textbf{0.897} & \textbf{0.896} \\\hline
     
      & \multicolumn{5}{|c|}{{PhC-C2DL-PSC}}\\
      & DET & SEG & TRA & OP$_{CSB}$& OP$_{CTB}$\\\hline
     3rd entry &
     \textcolor{Goldenrod}{0.961} & \textcolor{Goldenrod}{0.863} & \textcolor{Goldenrod}{0.954} & \textcolor{Goldenrod}{\textbf{0.912}} & \textcolor{Goldenrod}{\textbf{0.909}} \\
     2nd entry &
     \textcolor{Apricot}{0.983} & \textcolor{Goldenrod}{0.821} & \textcolor{black}{0.975} & \textcolor{Goldenrod}{0.896} & \textcolor{Goldenrod}{0.895} \\
     1st entry &
     \textcolor{Goldenrod}{0.967} & \textcolor{Goldenrod}{0.715} & \textcolor{teal}{0.959} & \textcolor{Goldenrod}{0.841} & \textcolor{Goldenrod}{0.836} \\
     \textit{s-TAS} & \textbf{0.972} & \textbf{0.720} & \textbf{0.966} & \textbf{0.846} & \textbf{0.843} \\
     
    \end{tabular}
    \caption{Detailed performance comparison of our approach with the top 3 performers on the leaderboard of the Cell Tracking Challenge, as of 30$^{\text{th}}$ of January. The different colors correspond to different teams, namely \textcolor{Goldenrod}{MU-Lux-CZ} (as in \citet{lux2019dic}), \textcolor{Cyan}{ND-US}, \textcolor{Apricot}{BGU-IL} (as in \citet{zhou2019joint}), \textcolor{BrickRed}{CVUT-CZ}, \textcolor{Violet}{HD-Hau-GE}, \textcolor{GreenYellow}{UVA-NL}, \textcolor{YellowOrange}{HIT-CN}, \textcolor{black}{FR-Ro-GE}, \textcolor{teal}{KTH-SE}, and \textcolor{CarnationPink}{TUG-AT}.
    }
    \label{tab:results21}
\end{table}


