
\section{Background} \label{sec:background}

\subsection{Related Work}

\paragraph{Machine learning methods in skin cancer detection.}
Machine learning, in particular deep learning, has transformed automated skin lesion analysis. Esteva et al.\ showed that convolutional neural networks (CNNs) trained on large collections of clinical and dermoscopic images can reach dermatologist-level performance on lesion classification~\cite{esteva2017dermatologist}. Follow-up work fine-tuned ImageNet-pretrained CNNs such as ResNet on curated dermoscopy datasets~\cite{menegola2017knowledge}, examined robustness across cohorts~\cite{brinker2019convolutional}, and introduced benchmarks like HAM10000 that enabled large-scale evaluation and ensemble methods~\cite{tschandl2018ham10000,codella2018skin}. More recent studies have explored multimodal models that fuse dermoscopy with patient metadata for improved risk stratification~\cite{li2020integrating} and transformer-based architectures that capture long-range spatial dependencies~\cite{yuan2021tokens}. Despite this progress, challenges remain around data heterogeneity, interpretability, and deployment in real clinics~\cite{patel2021artificial}. In particular, most models focus on pixel and texture cues and underutilize global properties such as lesion shape, connectivity, and boundary irregularity that are central to dermoscopic diagnosis~\cite{gutman2016skin}.

\paragraph{Topological machine learning in medical image analysis.}
Topological data analysis (TDA) provides stable, multiscale descriptors of geometric and connectivity structure~\cite{carlsson2009topology}. Persistent homology (PH) has been applied in many biomedical settings, including modeling cell development~\cite{mcguirl2020topological}, delineating tumor margins~\cite{qaiser2019fast}, analyzing brain connectivity~\cite{bremer2018topological}, and extracting genomic signatures~\cite{lum2013extracting}; see Skaf et al.~\cite{skaf2022topological} for a survey. Building on these ideas, topological deep learning integrates PH summaries into trainable models~\cite{hofer2017deep,adams2017persistence}, with reported gains in segmentation~\cite{kahle2021topological,santhirasekaram2023topology} and classification~\cite{chacholski2019topological,johnson2022application}. Applications to melanoma and skin lesion analysis have begun to appear~\cite{maurya2024hybrid,chung2018topological}, but they typically use single-parameter filtrations and do not model interactions between multiple imaging cues.

Multiparameter persistent homology generalizes PH to filtrations indexed by more than one parameter and has been developed theoretically and algorithmically in recent years~\cite{botnan2022introduction,loiseaux2023stable,korkmaz2025cumperlay}. To our knowledge, it has not yet been explored for dermoscopic image analysis. Our work introduces cubical multiparameter persistence for skin cancer detection and studies both standalone topological models and hybrid models that fuse multipersistence summaries with Vision Transformers. This addresses a gap between existing TDA-based approaches, which rely on single-parameter pipelines, and mainstream deep learning methods, which largely ignore explicit topological structure.


\vspace{-.1in}


\subsection{Cubical Persistence} \label{sec:PH}

Persistent homology (PH) is a core tool in TDA for extracting multiscale structure from data such as point clouds, networks, and images~\cite{dey2022computational}. We focus on its image variant, \emph{cubical persistence}. A brief overview follows; see~\cite{coskunuzer2024topological} for details. PH proceeds in three steps:

\vspace{-.1in}


\begin{itemize}[noitemsep]
\item \textbf{Filtration}: build a nested sequence of topological spaces.
\item \textbf{Persistence diagrams}: record feature births and deaths across the sequence.
\item \textbf{Vectorization}: map diagrams to fixed-length representations for machine learning.
\end{itemize}

\begin{wrapfigure}{r}{3in}
\vspace{-.2in}
\centering
\includegraphics[width=\linewidth]{figures/filtration4.pdf}
\vspace{-.15in}
\caption{\scriptsize For the \(5\times 5\) image \(\X\) with the given pixel values, the sublevel filtration is the sequence of binary images \(\X_1\subset \dots\subset \X_5\).}
\label{fig:filtration}
\vspace{-.2in}
\end{wrapfigure}
\noindent \textit{Step 1. Constructing filtrations.} \quad
For images, filtrations are typically cubical. Starting from a grayscale (or single color-channel) image \(\X\in\mathbb{R}^{r\times s}\) with pixel values \(\gamma_{ij}\) and a sequence of thresholds
\(t_1<\dots<t_N\), we form the sublevel sequence \(\X_1\subset\cdots\subset\X_N\) with \(\X_n=\{\Delta_{ij}\subset \X \mid \gamma_{ij}\le t_n\}\),
where \(\Delta_{ij}\) is the pixel at position \((i,j)\).
Intuitively, pixels become ``active'' as the threshold increases, yielding a nested family of binary images (Fig.~\ref{fig:filtration}).
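
\noindent As a minimal sketch of this construction, consider a hypothetical toy image \(Y\) (introduced here only for illustration; it is not the image in Fig.~\ref{fig:filtration}) with assumed thresholds \(t_n=n\) for \(n=1,\dots,4\):
\[
Y=\left(\begin{array}{ccccc}
1 & 1 & 1 & 3 & 2\\
1 & 4 & 1 & 3 & 2\\
1 & 1 & 1 & 3 & 2
\end{array}\right).
\]
Under the sublevel rule, \(Y_1\) consists of the eight pixels of value \(1\), which form a ring around the pixel at position \((2,2)\); \(Y_2\) adds the last column; \(Y_3\) adds the fourth column, which joins the two pieces; and \(Y_4=Y\) adds the remaining pixel of value \(4\). Each \(Y_n\) contains \(Y_{n-1}\), so \(Y_1\subset Y_2\subset Y_3\subset Y_4\) is a valid filtration.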


\noindent \textit{Step 2. Persistence diagrams.} \quad
PH tracks when topological features appear and disappear along \(\{\X_n\}\). If a feature \(\sigma\) is born at \(t_m\) and disappears at \(t_n\) with \(m<n\), the pair \((b_\sigma,d_\sigma)=(t_m,t_n)\) is added to the \(k\)-dimensional diagram \(\PD_k(\X)\), where \(k\) indexes connected components (\(k=0\)), holes (\(k=1\)), and higher-dimensional cavities. The lifespan \(d_\sigma-b_\sigma\) quantifies the prominence of the feature. In Fig.~\ref{fig:filtration}, \(\PD_0\) captures connected components and \(\PD_1\) captures holes.
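
\noindent A small worked example may help fix ideas. It uses hypothetical data and two common conventions that we state as assumptions: a component that never disappears is assigned death \(\infty\), and when two components merge, the one born later dies. Take the \(3\times 5\) image \(Y\) with rows \((1,1,1,3,2)\), \((1,4,1,3,2)\), \((1,1,1,3,2)\) and thresholds \(t_n=n\), \(n=1,\dots,4\). At \(t_1\) the pixels of value \(1\) form a single component enclosing one hole at position \((2,2)\). At \(t_2\) the last column appears as a second component. At \(t_3\) the fourth column becomes active and merges the two components. At \(t_4\) the hole is filled. Under these assumptions,
\[
\PD_0(Y)=\{(1,\infty),\,(2,3)\},\qquad \PD_1(Y)=\{(1,4)\}.
\]
The lifespans \(3-2=1\) and \(4-1=3\) indicate that, in this toy case, the hole is a more prominent feature than the short-lived second component.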

\smallskip
\noindent \textit{Step 3. Vectorization.} \quad
Since persistence diagrams are multisets of birth--death pairs, they are not directly suitable for standard learning architectures. Vectorization~\cite{ali2023survey} converts diagrams into fixed-length representations. We primarily use \emph{Betti vectors}, where \(\beta_k(t_n)\) is the number of \(k\)-dimensional features alive at threshold \(t_n\), yielding
\quad $\overrightarrow{\beta_k} = [\beta_k(t_1),\dots,\beta_k(t_N)].$\quad 
For example, for Fig.~\ref{fig:filtration}, \(\overrightarrow{\beta_0}=[4,2,1,1,1]\) and \(\overrightarrow{\beta_1}=[0,1,2,2,0]\).
Alternative encodings include persistence images~\cite{adams2017persistence}, landscapes~\cite{bubenik2017persistence}, silhouettes~\cite{chazal2014stochastic}, and kernel methods~\cite{ali2023survey}. We favor Betti vectors for their efficiency and interpretability, and because their sequence form extends to multiparameter Betti tensors and integrates well with transformer models.
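
\noindent To make the Betti vector construction concrete, one common way to read \(\beta_k(t_n)\) off a diagram is sketched below. It assumes the half-open convention in which a feature with pair \((b,d)\) is alive for \(b\le t<d\); this convention is our assumption for illustration:
\[
\beta_k(t_n)=\#\bigl\{(b,d)\in \PD_k \;:\; b\le t_n<d\bigr\}.
\]
For a hypothetical pair of diagrams \(\PD_0=\{(1,\infty),(2,3)\}\) and \(\PD_1=\{(1,4)\}\) with thresholds \(t_n=n\), \(n=1,\dots,4\), this gives
\[
\overrightarrow{\beta_0}=[1,2,1,1],\qquad \overrightarrow{\beta_1}=[1,1,1,0].
\]
For instance, \(\beta_0(t_3)=1\) because the pair \((2,3)\) no longer satisfies \(t_3<d\), and \(\beta_1(t_4)=0\) because the only hole dies at \(t_4\).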


\vspace{-.2in}
