\appendix
\section{Additional Dataset Details}\label{sec:sup_data}
We evaluate all methods on three public spatial transcriptomics (ST) datasets: HER2~\cite{andersson2021spatial}, Breast Cancer~\cite{he2020integrating}, and Kidney~\cite{lake2023atlas}. The HER2 dataset comprises 8 patient samples with 36 WSIs and 13,620 spatial spots in total. The Breast Cancer dataset consists of 23 samples with 68 WSIs and 30,066 spots. The Kidney dataset includes 22 samples with 23 WSIs and 25,944 spots. The spot diameter for HER2 and Breast Cancer is 100~$\mu$m, whereas the Kidney dataset adopts a smaller 55~$\mu$m spot diameter. For the external single-cell datasets, we use two million cells from~\cite{lake2025cellular} as the reference for the Kidney dataset, and around three million cells from~\cite{chen2025highly, reed2024single, klughammer2024multi} as the reference for the Breast Cancer and HER2 datasets.

For each spot, we extract a $224 \times 224$ pixel histology patch centered on its spatial coordinate as the model input. To construct the prediction targets, we select the 300 genes with the highest expression variance within each dataset. Following BLEEP~\cite{xie2024spatially}, we apply a $\log(1+x)$ transformation to the raw count matrices to alleviate the heavy-tailed distribution characteristic of ST expression data~\cite{he2020integrating}. The genes selected for each dataset are visualized in Figure~\ref{fig:data_profile}.
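
As an illustrative formalization (the symbols $x_{ig}$, $z_{ig}$, $s_g^2$, $n$, and $\mathcal{G}$ are introduced here and are not part of the original pipeline description), let $x_{ig}$ be the raw count of gene $g$ at spot $i$ in a dataset with $n$ spots. The transformation and the variance used to rank genes can be written as
\[
\tilde{x}_{ig} = \log\left(1 + x_{ig}\right), \qquad
s_g^2 = \frac{1}{n}\sum_{i=1}^{n}\left(z_{ig} - \bar{z}_g\right)^2, \qquad
\bar{z}_g = \frac{1}{n}\sum_{i=1}^{n} z_{ig},
\]
where $z_{ig}$ stands for whichever representation the variance is computed on (raw $x_{ig}$ or transformed $\tilde{x}_{ig}$; the description above does not fix this choice). The target set $\mathcal{G}$ contains the 300 genes with the largest $s_g^2$. Assuming the transformed values serve as regression targets, the target for spot $i$ is the 300-dimensional vector $(\tilde{x}_{ig})_{g \in \mathcal{G}}$.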

\begin{figure*}[!ht]
    \centering
    \includegraphics[width=\textwidth]{Figure_pdf/Figure_sup_profile.png}
    \caption{\textbf{Selected high-variance genes for the gene expression prediction task across datasets.}
    }
    \label{fig:data_profile}
\end{figure*}

\begin{figure*}[!htb]
    \centering
    \includegraphics[width=\textwidth]{Figure_pdf/Figure_sup_ratio.pdf}
    \caption{\textbf{Comparison of sampling strategies across multiple methods on different datasets.} We evaluate performance under training data ratios of 10\%--75\% to simulate low-resource scenarios, with each curve denoting a sampling strategy. Our proposed SCRL sampling consistently achieves better performance across datasets and models, particularly in low-budget (10\%--25\%) regimes.
    }
    \label{fig:sampling_comparison_sup}
\end{figure*}

\section{Additional Implementation Details}
\label{sec:sup_implementation}

To evaluate the effectiveness of our sampling strategy, we compare it against three representative baselines: Monte Carlo random sampling, uncertainty-based sampling~\cite{safaei2024entropic}, and diversity-driven sampling~\cite{zhdanov2019diverse}.


\noindent \textbf{Uncertainty-based sampling.} This method estimates prediction uncertainty via Monte Carlo Dropout. Specifically, we insert a Dropout layer (drop rate 0.1) after the feature extraction block of the vision encoder, and the model outputs the expression values of 300 genes. During uncertainty estimation, the Dropout layer remains active, and we perform $T = 20$ stochastic forward passes for each patch. We compute the variance of the predictions across these passes and use its mean as the score, which we refer to as the entropy score although it is a variance-based measure rather than an entropy in the information-theoretic sense. A higher score indicates greater model uncertainty, and patches with high scores are prioritized during sampling.
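
As a sketch of this score in symbols (the notation is introduced here for illustration only), let $\hat{y}^{(t)}_{g}(x)$ denote the prediction for gene $g$ on patch $x$ at stochastic pass $t$, with $G = 300$ genes and $T = 20$ passes. Assuming that the mean is taken over genes and that the variance uses the $1/T$ normalization, neither of which is specified above, the score reads
\[
\bar{y}_{g}(x) = \frac{1}{T}\sum_{t=1}^{T}\hat{y}^{(t)}_{g}(x), \qquad
u(x) = \frac{1}{G}\sum_{g=1}^{G}\frac{1}{T}\sum_{t=1}^{T}\left(\hat{y}^{(t)}_{g}(x) - \bar{y}_{g}(x)\right)^{2}.
\]
Patches are then ranked by $u(x)$ in descending order. Using the $1/(T-1)$ normalization instead would multiply every score by the same factor $T/(T-1)$ and therefore leave the ranking unchanged.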



\begin{table*}[thbp]
\centering
\scriptsize
\renewcommand{\arraystretch}{1}
\caption{\textbf{Summary of computational cost.} We summarize the
parameter count, running time per epoch, GPU memory consumption, and number of training samples of our model on the Breast Cancer dataset.}
\begin{tabular}{
    p{0.18\linewidth}  % Component
    P{0.14\linewidth}  % Parameter
    P{0.18\linewidth}  % Running time
    P{0.14\linewidth}  % GPU Memory
    P{0.14\linewidth}  % Samples
}
\toprule
\textbf{Module} & \textbf{Parameter} & \textbf{Time/epoch} & \textbf{GPU Mem} & \textbf{Samples} \\
\midrule
SCR$^2$Net & 33.82 M & 85.70 s & 35.13 GB & 27{,}171 \\
SCRL      & 0.6 M   & 50.54 s & --       & 3{,}099{,}206 \\
\bottomrule
\end{tabular}
\label{tab:complexity_comparison}
\end{table*}


\begin{figure*}[!htb]
    \centering
    \includegraphics[width=\textwidth]{Figure_pdf/Figure_abl_sampling.pdf}
    \caption{\textbf{Ablation study of the reward function in our proposed SCRL sampling.}
    }
    \label{fig:abl_sampling}
\end{figure*}



\noindent \textbf{Diversity-driven sampling.} This method encourages sample diversity based on feature similarity. We first extract 1024-dimensional visual features using the vision encoder during training. These features are standardized and reduced to 128 dimensions via PCA, and then clustered with DBSCAN. To avoid insufficient cluster granularity, we incorporate a dynamic adjustment mechanism: the minimum cluster count is adaptively set to $\sqrt{N}/5$, where $N$ denotes the total number of samples. If DBSCAN yields fewer clusters than this minimum, we automatically switch to KMeans to enforce the desired number of clusters. During sampling, patches are drawn uniformly from each cluster to ensure diverse coverage of tissue regions.
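
For a sense of scale, consider $N = 30{,}066$, the total spot count of the Breast Cancer dataset. This value is used purely as an illustration, since the actual $N$ at sampling time depends on the training split. The minimum cluster count is then
\[
\frac{\sqrt{N}}{5} = \frac{\sqrt{30{,}066}}{5} \approx \frac{173.4}{5} \approx 34.7,
\]
so the KMeans fallback would be triggered when DBSCAN returns fewer than about 35 clusters. How the threshold is rounded to an integer is not specified here, so the exact cutoff may differ by one.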


\begin{figure*}[htbp]
    \centering
    \includegraphics[width=0.8\textwidth]{Figure_pdf/Figure_sup_sampling_visual.pdf}
    \caption{\textbf{Spatial distribution of sampled spots under 10\% sampling.}
Red circles indicate selected spots on whole-slide images. Entropy-based sampling concentrates the selected spots in a few regions, while diversity-based sampling spends part of the budget on low-density background areas.
Our method (SCRL) achieves more informative and balanced sampling by incorporating spatial and biological cues.}
    \label{fig:visual_sample}
\end{figure*}

\noindent \textbf{Our SCRL sampling strategy.} For the multi-objective reward function, we set $w_{\mathrm{sc}} = 20$, $w_{\mathrm{type}} = 5$, and $w_{\mathrm{spa}} = 0.05$ to balance manifold coverage, cell-type diversity, and spatial density constraints. The loss weights $\lambda_{r}$, $\lambda_{p}$, and $\lambda_{KD}$ are set to 1.0, 0.25, and 0.25, respectively. We adopt a similarity confidence mask with threshold $m = 0.15$. Active sampling proceeds for 20 rounds, with the sampled training set updated every 5 epochs. The initial round employs random sampling for warm-up. A fixed seed of 42 is used for reproducibility. During retrieval, we select the top-50 most similar expression profiles and retain only the top-10 dominant cell types for robust reference aggregation.
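
As a reading aid, and assuming that both the reward and the training objective are plain weighted sums (the precise definitions are those of the main text), the settings above correspond to
\[
R = w_{\mathrm{sc}} R_{\mathrm{sc}} + w_{\mathrm{type}} R_{\mathrm{type}} + w_{\mathrm{spa}} R_{\mathrm{spa}}, \qquad
\mathcal{L} = \lambda_{r}\mathcal{L}_{r} + \lambda_{p}\mathcal{L}_{p} + \lambda_{KD}\mathcal{L}_{KD},
\]
where $R_{\mathrm{sc}}$, $R_{\mathrm{type}}$, and $R_{\mathrm{spa}}$ are placeholder names for the manifold-coverage, cell-type-diversity, and spatial-density terms, and $\mathcal{L}_{r}$, $\mathcal{L}_{p}$, and $\mathcal{L}_{KD}$ are placeholder names for the loss terms weighted by $\lambda_{r}$, $\lambda_{p}$, and $\lambda_{KD}$. Any sign convention, for example treating the spatial term as a penalty, is taken to be absorbed into the term itself. If each of the 20 rounds spans exactly 5 epochs, the schedule covers $20 \times 5 = 100$ training epochs in total.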


\begin{figure*}[htbp]
    \centering
\includegraphics[width=0.935\linewidth]{Figure_pdf/Figure_gene_rps3.pdf}
    \caption{Visualization of the predicted spatial expression distribution of the cancer-related gene RPS3 on WSIs.}
    \label{fig:sup_rps}
\end{figure*}


\section{Additional Experimental Results} \label{sec:sup_results}
Due to space limitations in the main manuscript, we provide additional experimental results in this appendix, including the qualitative analysis of low-budget sampling (Figure~\ref{fig:visual_sample}), the comparison of sampling strategies evaluated on mlSTExp and TRIPLEX benchmarks (Figure~\ref{fig:sampling_comparison_sup}), and an ablation study on the reward function design (Figure~\ref{fig:abl_sampling}).

\begin{figure*}[htbp]
    \centering
\includegraphics[width=0.935\linewidth]{Figure_pdf/Figure_gene_rpl19.pdf}
    \caption{Visualization of the predicted spatial expression distribution of the cancer-related gene RPL19 on WSIs.}
    \label{fig:sup_rpl}
\end{figure*}

\noindent \textbf{Qualitative analysis of low-budget sampling.}
Figure~\ref{fig:visual_sample} provides a qualitative comparison of the sampling strategies by visualizing the spatial distribution of sampled spots on whole-slide images under a low sampling budget (10\%). Entropy-based sampling tends to select spots from a few concentrated regions, resulting in limited spatial coverage of the tissue. In contrast, diversity-based sampling distributes samples more broadly, but often allocates a substantial portion of the limited budget to low tissue-density or background regions, which makes sampling inefficient. By jointly incorporating spatial relationships and biological information, our method achieves a more balanced sampling pattern that focuses on informative tissue regions while maintaining adequate spatial coverage. This behavior is consistent with the superior performance of our approach under low-budget settings, where effective use of limited sampling resources is critical.

\noindent \textbf{Cancer-related gene expression prediction.} We further examine two genes that are highly relevant to breast cancer, RPL19 and RPS3. RPL19 is frequently overexpressed in breast cancer tissues, making it a commonly used reference and prognosis-related gene in breast cancer studies~\cite{hong2014ribosomal}. Dysregulation of RPS3 is associated with enhanced tumor aggressiveness and poor prognosis in breast cancer~\cite{ono2017ribosomal}. We visualize the predicted spatial expression patterns of these two genes on whole-slide images in Figure~\ref{fig:sup_rpl} and Figure~\ref{fig:sup_rps}. The visualizations indicate that the proposed model captures spatially coherent and biologically meaningful expression distributions, which supports its interpretability for clinically relevant genes.

\section{Practical Considerations and Future Directions}
\label{sec:future_work}
In realistic scenarios, the proposed SCRL sampler can start from any amount of existing ST data and does not require sampling tissues at full coverage. Sampling is formulated as a sequential decision procedure: the sampler is optimized online based on the information observed so far and adaptively selects tissue locations to query until the predefined sampling budget is reached.

A pretrained sampler can be transferred directly to unseen data from a new laboratory, without first collecting any ST data from the new dataset as a starting point. The sampler can therefore be applied without additional experimental overhead at the initial stage. As more data become available in future studies, the sampler can be further optimized, which is expected to improve its robustness and transferability across experimental settings.