
\begin{figure*}[thbp]
    \centering
    \includegraphics[width=\textwidth]{Figure_pdf/Figure3_ratio.pdf}
    \caption{\textbf{Comparison of sampling strategies across multiple methods, reported as (A) MSE and (B) PCC.} We evaluate performance under data ratios of 10\%--75\% to simulate low-resource scenarios, with each curve denoting a sampling strategy. Our SCRL sampling consistently achieves better performance across datasets and models, particularly in low-budget (10\%--25\%) regimes.
    }
    \label{fig:sampling_comparison}
\end{figure*}
\section{Results}

\subsection{Active Sampling under Budget Constraints}
To validate the effectiveness of our active sampling strategy based on single-cell guided reinforcement learning (SCRL), we conducted a systematic evaluation across methods. As shown in Figure~\ref{fig:sampling_comparison} and Figure~\ref{fig:sampling_comparison_sup} in Appendix~\ref{sec:sup_results}, we compared four sampling strategies under training data ratios ranging from 10\% to 75\%. SCRL sampling achieves the best performance across all dataset and model combinations, with particularly pronounced advantages in low-budget scenarios (10\%--25\%). On the Breast Cancer dataset at a 10\% sampling ratio, SCRL sampling reduces the MSE of ST-Net from approximately 0.85 to 0.75 and improves its PCC from 0.04 to 0.14 compared with random sampling. This trend holds consistently across the other datasets and methods.

Notably, we observed differential sensitivity to sampling strategies across model types. Retrieval-based models are more sensitive to data quality than end-to-end regression models, resulting in larger performance gaps between sampling strategies. We attribute this to the nature of contrastive learning, which depends heavily on the quality and diversity of training samples. Our SCRL sampling strategy balances biological quality and diversity. Specifically, single-cell references ensure that sampled spots cover critical cell subpopulations, while spatial density information guides the sampling process to preserve morphological diversity. This dual constraint enables SCRL to achieve stable and consistent performance improvements across both training paradigms.


\begin{figure*}[!htb]
\centering\includegraphics[width=0.7\textwidth]{Figure_pdf/Figure4_compare.pdf}
    \caption{\textbf{Comparison of different sampling ratios (10\%--75\%).} SCR$^2$Net consistently achieves better metrics, especially under low sampling budgets.}
    \label{fig:sampling_models}
\end{figure*}


\begin{table*}[thbp] 
\centering
\scriptsize
\renewcommand{\arraystretch}{1} 
\caption{\textbf{Performance comparison on the gene expression prediction task.}
The best result is highlighted in \textcolor{orange}{\textbf{orange}} and the second best in \textcolor{lightblue}{\underline{blue}}. SCR$^2$Net outperforms state-of-the-art methods on most metrics across most datasets.}
\begin{tabular}{
    p{0.15\linewidth}  % Model
     P{0.062\linewidth} P{0.062\linewidth} P{0.062\linewidth}  % Breast
     P{0.062\linewidth} P{0.062\linewidth} P{0.062\linewidth}  % HER2
     P{0.062\linewidth} P{0.062\linewidth} P{0.062\linewidth}  % Kidney
}

\toprule
\multirow{2}{*}{\textbf{Model}} & \multicolumn{3}{c}{\textbf{Breast Cancer}} & \multicolumn{3}{c}{\textbf{HER2}} & \multicolumn{3}{c}{\textbf{Kidney}} \\
\cline{2-10}
& MSE $\downarrow$ & MAE $\downarrow$ & PCC $\uparrow$ & MSE $\downarrow$ & MAE $\downarrow$ & PCC $\uparrow$ & MSE $\downarrow$ & MAE $\downarrow$ & PCC $\uparrow$ \\
\midrule

ST-Net     & 0.6318 & 0.6377 & 0.1592 & 0.9237 & 0.7559 & 0.2709 & 0.7460 & 0.6811 & 0.1851 \\
His2ST     & 0.6999 & 0.6682 & 0.0612 & 0.9928 & 0.8034 & 0.1045 & 0.7912 & 0.7080 & 0.0571 \\
HisToGene  & 0.6521 & 0.6486 & 0.1149 & 0.9702 & 0.8050 & 0.1392 & 0.8540 & 0.7373 & 0.1134 \\
EGN        & 0.6662 & 0.6558 & 0.1462 & \textcolor{lightblue}{\underline{0.8916}} & 0.7640 & 0.2524 & 0.7574 & 0.6864 & 0.1632 \\
TRIPLEX    & 0.6672 & 0.6590 & 0.1093 & 0.9356 & 0.7752 & 0.2167 & \textcolor{lightblue}{\underline{0.7168}} & \textcolor{lightblue}{\underline{0.6692}} & 0.0930 \\
BLEEP      & \textcolor{lightblue}{\underline{0.6266}} & \textcolor{lightblue}{\underline{0.6044}} & \textcolor{orange}{\textbf{0.2041}} & 0.9507 & 0.7613 & \textcolor{lightblue}{\underline{0.2834}} & 0.8246 & 0.7167 & \textcolor{lightblue}{\underline{0.2020}} \\
mc1STExp   & 0.6472 & 0.6202 & 0.1645 & \textcolor{orange}{\textbf{0.8882}} & \textcolor{lightblue}{\underline{0.7367}} & 0.2651 & 0.7438 & 0.6759 & 0.1580 \\

SCR$^2$Net (Ours) & \textcolor{orange}{\textbf{0.5848}} & \textcolor{orange}{\textbf{0.5725}} & \textcolor{lightblue}{\underline{0.1940}} & 0.9139 & \textcolor{orange}{\textbf{0.7042}} & \textcolor{orange}{\textbf{0.3028}} & \textcolor{orange}{\textbf{0.7038}} & \textcolor{orange}{\textbf{0.6611}} & \textcolor{orange}{\textbf{0.2391}} \\
\bottomrule

\end{tabular}

\label{tab:gene_prediction_simplified}
\end{table*}



\subsection{Empirical Validation on Gene Expression Prediction}
We conducted four-fold cross-validation at the sample level to compare SCR$^2$Net against state-of-the-art methods. Table~\ref{tab:gene_prediction_simplified} summarizes quantitative comparisons across different cohorts. SCR$^2$Net achieves the best result on seven of the nine metric--dataset pairs, with the lowest MSE and MAE as well as the highest PCC on most datasets. For example, on the Kidney dataset, SCR$^2$Net raises PCC to 0.2391, compared with 0.2020 for the strongest baseline (BLEEP). Furthermore, as illustrated in Figure~\ref{fig:sampling_models} in Appendix~\ref{sec:sup_results}, SCR$^2$Net maintains strong predictive robustness under varying sampling budgets with our SCRL sampling strategy. The performance gap is pronounced under low sampling ratios (10\%--25\%), where other approaches degrade due to sparse tissue coverage. In contrast, SCR$^2$Net mitigates this by acquiring informative tissue regions and leveraging retrieval-based auxiliary priors, resulting in more reliable predictive performance at reduced sequencing cost.
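
For reference, the three metrics can be written as follows, assuming their standard definitions. The notation ($y_{ig}$ and $\hat{y}_{ig}$ for the measured and predicted expression of gene $g$ at spot $i$, with $N$ spots and $G$ genes) and the gene-wise averaging of PCC are illustrative assumptions rather than a statement of the exact evaluation protocol:
\[
\mathrm{MSE} = \frac{1}{NG}\sum_{i=1}^{N}\sum_{g=1}^{G}\left(y_{ig}-\hat{y}_{ig}\right)^{2},
\qquad
\mathrm{MAE} = \frac{1}{NG}\sum_{i=1}^{N}\sum_{g=1}^{G}\left|y_{ig}-\hat{y}_{ig}\right|,
\]
\[
\mathrm{PCC} = \frac{1}{G}\sum_{g=1}^{G}
\frac{\sum_{i=1}^{N}\left(y_{ig}-\bar{y}_{g}\right)\left(\hat{y}_{ig}-\bar{\hat{y}}_{g}\right)}
{\sqrt{\sum_{i=1}^{N}\left(y_{ig}-\bar{y}_{g}\right)^{2}}\,\sqrt{\sum_{i=1}^{N}\left(\hat{y}_{ig}-\bar{\hat{y}}_{g}\right)^{2}}},
\]
where $\bar{y}_{g}$ and $\bar{\hat{y}}_{g}$ denote the means of the measured and predicted values of gene $g$ over the $N$ spots. Under these definitions, MSE and MAE capture absolute error magnitude while PCC captures agreement in spatial variation, which is why a method can rank differently on the two kinds of metric.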


\subsection{Ablation Study}
\noindent  \textbf{Reward Function in SCRL sampling.} We conducted an ablation analysis on the Biological Prior Reward and the Spatial Density Reward. As shown in Figure~\ref{fig:abl_sampling} in Appendix~\ref{sec:sup_results}, when only the Biological Prior Reward is used, the model performs well at low sampling ratios (10\% and 25\%); however, as the ratio increases, redundant samples limit further performance gains. Conversely, using only the Spatial Density Reward yields results similar to random sampling, as it cannot directly assess the informativeness of samples. Combining both rewards ensures both biological quality and diversity and yields the best performance, with a curve that rises steadily as the sampling ratio increases.

\begin{table}[thbp]
\centering
\scriptsize
\caption{\textbf{Stability across random initializations.}
We report the performance of our SCRL sampling and random sampling under different random seeds. SCRL achieves stable performance across initializations.}
\begin{tabular}{
    p{0.07\linewidth}
    P{0.07\linewidth}
    P{0.105\linewidth} P{0.105\linewidth} P{0.105\linewidth}
    P{0.105\linewidth} P{0.105\linewidth} P{0.105\linewidth}
}
\toprule
\multirow{2}{*}{\textbf{Model}} & \multirow{2}{*}{\textbf{Ratio}} 
& \multicolumn{3}{c}{\textbf{SCRL (Ours)}} 
& \multicolumn{3}{c}{\textbf{Random}} \\
\cline{3-8}
& & MSE $\downarrow$ & MAE $\downarrow$ & PCC $\uparrow$
  & MSE $\downarrow$ & MAE $\downarrow$ & PCC $\uparrow$ \\
\midrule
\multirow{3}{*}{ST-Net}
& 0.10 & 0.748$\pm$0.049 & 0.688$\pm$0.014 & 0.117$\pm$0.010
         & 0.905$\pm$0.111 & 0.754$\pm$0.042 & 0.097$\pm$0.018 \\
& 0.25 & 0.681$\pm$0.032 & 0.667$\pm$0.014 & 0.130$\pm$0.009
         & 0.720$\pm$0.046 & 0.674$\pm$0.018 & 0.126$\pm$0.010 \\
& 0.50 & 0.661$\pm$0.012 & 0.643$\pm$0.006 & 0.140$\pm$0.008
         & 0.694$\pm$0.035 & 0.673$\pm$0.018 & 0.134$\pm$0.007 \\
\midrule
\multirow{3}{*}{SCR$^2$Net}
& 0.10 & 0.679$\pm$0.015 & 0.659$\pm$0.016 & 0.132$\pm$0.003
         & 0.709$\pm$0.037 & 0.672$\pm$0.021 & 0.119$\pm$0.007 \\
& 0.25 & 0.642$\pm$0.011 & 0.642$\pm$0.017 & 0.139$\pm$0.007
         & 0.658$\pm$0.027 & 0.660$\pm$0.018 & 0.131$\pm$0.009 \\
& 0.50 & 0.617$\pm$0.009 & 0.629$\pm$0.006 & 0.165$\pm$0.010
         & 0.631$\pm$0.015 & 0.639$\pm$0.008 & 0.154$\pm$0.009 \\
\bottomrule
\end{tabular}
\label{tab:random_seed}
\end{table}
\noindent  \textbf{Functional Blocks in SCR$^2$Net.} As shown in Table~\ref{tab:ablation_scr2}, removing the Retrieval Reference Module increases MSE from 0.7038 to 0.7460 and decreases PCC from 0.2391 to 0.1851 on the Kidney dataset, which indicates its effectiveness in providing reference priors by incorporating similar spots. Meanwhile, the majority cell type filtering mechanism suppresses interference from noisy references by excluding low-quality retrieved spots: removing it lowers PCC from 0.2391 to 0.2235 on Kidney and from 0.1940 to 0.1786 on Breast Cancer.

\begin{figure*}[!htb]
    \centering
    \includegraphics[width=\textwidth]{Figure_pdf/Figure_reward_weight.pdf}
    \caption{\textbf{Sensitivity analysis of reward weight configurations under different sampling ratios.}
Each curve represents a distinct reward weight setting, with random sampling shown as a baseline.
When reward terms are placed at comparable scales, both models exhibit consistent performance trends as the sampling ratio increases.
In contrast, configurations dominated by a single reward term result in clear performance degradation, highlighting the necessity of balancing biological diversity and spatial coverage.}
    \label{fig:reward_weights}
\end{figure*}


\noindent \textbf{Sensitivity Analysis of Hyperparameters.} We tested different combinations of candidate pool size $K$, retained cell types $T$, and confidence mask threshold $m$ for the retrieval module. Results in Table~\ref{tab:ablation_scr2} indicate that overly small values of $K$ and $T$ (e.g., $K=10$, $T=3$) or a higher threshold $m$ (e.g., $m=0.35$) limit the richness of reference information and reduce the number of effective reference spots. Conversely, overly large settings of $K$ and $T$ (e.g., $K=100$, $T=20$) or a lower threshold fail to filter noisy matches effectively, leading to performance degradation. Moderate hyperparameter settings therefore achieve the best trade-off between suppressing noisy matches and preserving informative retrieval references.

\noindent  \textbf{Sensitivity Analysis of Reward Weights.} We analyze the sensitivity to reward weight configurations by varying the weights of the reward terms across scales and measuring the effect on sampling behavior and downstream prediction performance. Specifically, we construct multiple weight configurations: some place all reward terms within a comparable scale (e.g., the original setting and nearby configurations), while others assign certain reward terms significantly different orders of magnitude. As shown in Figure~\ref{fig:reward_weights} and Figure~\ref{fig:abl_sampling}, the model maintains good and consistent performance only when the reward terms are at comparable scales; when a single reward term dominates or random sampling is adopted, performance degrades noticeably. These results suggest that the effectiveness of the proposed reward design relies not on precise weight tuning but on a balanced contribution between biological diversity and spatial coverage.
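
Schematically, if the two reward terms are combined as a weighted sum for a candidate spot $s$ (an illustrative assumption; the symbols $\lambda_{\mathrm{bio}}$, $\lambda_{\mathrm{spa}}$, $R_{\mathrm{bio}}$, and $R_{\mathrm{spa}}$ are introduced here only for exposition),
\[
R(s) = \lambda_{\mathrm{bio}}\, R_{\mathrm{bio}}(s) + \lambda_{\mathrm{spa}}\, R_{\mathrm{spa}}(s),
\]
then ``comparable scales'' means that $\lambda_{\mathrm{bio}}\,|R_{\mathrm{bio}}(s)|$ and $\lambda_{\mathrm{spa}}\,|R_{\mathrm{spa}}(s)|$ are of the same order of magnitude over the candidate spots, so that neither term alone determines which spot is selected. When one product exceeds the other by orders of magnitude, the ranking of candidates is effectively governed by the dominant term and the behavior approaches the corresponding single-reward ablation.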

\noindent \textbf{Stability under Random Initializations.}
We evaluate the stability of the proposed adaptive sampling strategy by repeating experiments with different random seeds under multiple sampling ratios. As shown in Table~\ref{tab:random_seed}, SCRL outperforms random sampling in mean MSE, MAE, and PCC for both models at every tested ratio, and its reported $\pm$ spread in MSE and MAE is smaller in every setting, indicating that the learned sampling behavior is robust to random initialization.
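
As a worked example using the mean values in Table~\ref{tab:random_seed}, the relative reduction in MSE obtained by SCRL over random sampling at a ratio of 0.10 is
\[
\frac{0.905-0.748}{0.905}\approx 17.3\% \ \mbox{(ST-Net)},
\qquad
\frac{0.709-0.679}{0.709}\approx 4.2\% \ \mbox{(SCR$^2$Net)},
\]
whereas at a ratio of 0.50 it is
\[
\frac{0.694-0.661}{0.694}\approx 4.8\% \ \mbox{(ST-Net)},
\qquad
\frac{0.631-0.617}{0.631}\approx 2.2\% \ \mbox{(SCR$^2$Net)}.
\]
For both models, the relative gain is therefore largest at the smallest budget, in line with the low-budget trend noted above.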


\noindent \textbf{Limitations and Challenges.} Despite these gains, overall accuracy remains bounded by intrinsic challenges of spatial transcriptomics rather than the sampling strategy alone. The most challenging regions for both sampling and prediction are typically highly heterogeneous or transitional tissue areas, as well as regions containing rare cell populations, which are often clinically important. In addition, morphologically similar regions may exhibit distinct molecular profiles, fundamentally limiting the predictive power of image-based models. Technical noise and spot-level signal averaging inherent to ST measurements further introduce uncertainty.

\begin{table*}[t]
\centering
\scriptsize
\caption{\textbf{Ablation study and hyperparameter sensitivity analysis in SCR$^2$Net.} SCR$^2$Net achieves the best results on most metrics with all functional blocks, and a moderate hyperparameter setting provides the best balance between noise and information.}
\begin{tabular}{%
    p{0.15\linewidth}
    p{0.15\linewidth}
    P{0.062\linewidth} P{0.062\linewidth} P{0.062\linewidth}
    P{0.062\linewidth} P{0.062\linewidth} P{0.062\linewidth}
}
\toprule
\multicolumn{2}{c}{\multirow{2}{*}{\textbf{Functional Block} \& \textbf{Setting}}} &
\multicolumn{3}{c}{\textbf{Breast Cancer}} &
\multicolumn{3}{c}{\textbf{Kidney}} \\
\cline{3-8}
& & MSE $\downarrow$ & MAE $\downarrow$ & PCC $\uparrow$ & MSE $\downarrow$ & MAE $\downarrow$ & PCC $\uparrow$ \\
\midrule

\multicolumn{2}{l}{w.o. Retrieval Reference Module}  &
0.6318 & 0.6377 & 0.1592 &
0.7460 & 0.6811 & 0.1851 \\

\multicolumn{2}{l}{w.o. Cell Type Filtering} &
0.6032 & 0.5992 & 0.1786 &
0.6952 & \textcolor{orange}{\textbf{0.6531}} & 0.2235 \\

\multicolumn{2}{l}{w. All functional blocks}&
\textcolor{orange}{\textbf{0.5848}} & \textcolor{orange}{\textbf{0.5725}} & \textcolor{orange}{\textbf{0.1940}} &
0.7038 & 0.6611 & \textcolor{orange}{\textbf{0.2391}} \\
\midrule

\multirow{3}{*}{Retrieval Module} 
& $K=10$, $T= 3$  &
0.6079 & 0.6198 & 0.1680 &
0.7236 & 0.6814 & 0.2187 \\
& $K=20$, $T= 5$  &
0.5912 & 0.6059 & 0.1711 &
\textcolor{orange}{\textbf{0.6886}} & 0.6643 & 0.2225 \\
& $K=100$, $T= 20$ &
\textcolor{orange}{\textbf{0.5848}} & 0.5925 & 0.1839 &
0.7239 & 0.6712 & 0.1977 \\
\midrule

\multirow{2}{*}{Confidence Mask}
& $m = 0.05$ &
0.6052 & 0.6035 & 0.1743 &
0.7135 & 0.6659 & 0.2102 \\
& $m = 0.35$ & 0.6233 & 0.6281 & 0.1590 & 0.7465 & 0.7005 & 0.1920 \\
\bottomrule
\end{tabular}
\label{tab:ablation_scr2}
\end{table*}
