SCOPES: Measuring Accuracy–Portability Trade-offs Across Microarray and RNA-Seq

Abdullah Nayem Wasi Emran; Tanveer Rahman

Published: 02 Mar 2026, Last Modified: 08 May 2026; MLGenX 2026 Tiny paper track; CC BY 4.0

Abstract: Deep models often fail under distribution shift, yet the role of feature selection in amplifying or mitigating that shift is underexplored. We study this in a stringent setting: transferring a tumour-vs-normal classifier across measurement platforms (Agilent microarray $\rightarrow$ RNA-Seq) using the same patients and genes. We introduce SCOPES, a leak-free, multi-objective selection framework that optimizes three competing goals: (i) predictive performance (AUC) under patient-safe cross-validation, (ii) selection stability (Kuncheva index), and (iii) cross-platform alignment (Maximum Mean Discrepancy, MMD). Viewed through a representation lens, SCOPES selects a compact gene subspace that is simultaneously discriminative and domain-aligned, explicitly exposing the accuracy–portability frontier under measurement shift. On matched TCGA-BRCA Agilent/RNA-Seq data, a label-informed $F$-score slab produced an implausibly perfect source model ($\mathrm{AUC} \approx 1.0$) but lost $\sim 0.30$ AUC after transfer, revealing selection leakage on top of platform shift. Replacing the slab with an unsupervised MAD prefilter makes the trade-off explicit on the Pareto front: a one-gene, alignment-first solution achieves modest AUC with a small transfer loss ($0.69 \rightarrow 0.61$, $\Delta \mathrm{AUC} \approx -0.08$), while a 30-gene, accuracy-first solution reaches near-perfect source AUC but transfers poorly ($\Delta \mathrm{AUC} \approx -0.38$). SCOPES provides a simple protocol to measure and control this trade-off (report source/target AUC, $\Delta \mathrm{AUC}$, and MMD), encouraging selections near a Pareto "knee" for portability. Finally, in the reverse direction (RNA-Seq $\rightarrow$ microarray), a 37-gene SCOPES signature attains $\mathrm{AUC}_{\mathrm{RNA}} = 0.654$ (CV) and $\mathrm{AUC}_{\mathrm{Agilent}} = 0.890$ ($\Delta \mathrm{AUC} = +0.236$), indicating that the shift is direction-dependent. We argue that treating selection as a multi-objective design problem is a useful lens for the science of deep learning under shift.
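The stability and alignment objectives named in the abstract, together with the Pareto filtering step, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the RBF kernel, the default bandwidth `gamma = 1 / n_features`, and the biased MMD estimator are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def kuncheva_index(subset_a, subset_b, n_features):
    """Kuncheva consistency index for two feature subsets of equal size k.

    Returns 1.0 for identical subsets and about 0 for chance-level overlap.
    """
    a, b = set(subset_a), set(subset_b)
    k = len(a)
    if len(b) != k or not 0 < k < n_features:
        raise ValueError("subsets must share a size k with 0 < k < n_features")
    r = len(a & b)  # number of features selected in both runs
    return (r * n_features - k * k) / (k * (n_features - k))


def rbf_mmd2(x, y, gamma=None):
    """Biased estimate of squared MMD between samples x and y (rows = patients).

    Assumption: RBF kernel exp(-gamma * ||u - v||^2), gamma = 1 / n_features.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if gamma is None:
        gamma = 1.0 / x.shape[1]

    def kernel(u, v):
        # Pairwise squared distances, clipped at 0 to absorb rounding error.
        d2 = (u ** 2).sum(1)[:, None] + (v ** 2).sum(1)[None, :] - 2.0 * u @ v.T
        return np.exp(-gamma * np.maximum(d2, 0.0))

    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()


def pareto_front(scores):
    """Indices of non-dominated rows, with every column treated as 'higher is better'."""
    s = np.asarray(scores, dtype=float)
    keep = []
    for i in range(len(s)):
        # Row j dominates row i if it is >= everywhere and > somewhere.
        dominated = np.any(np.all(s >= s[i], axis=1) & np.any(s > s[i], axis=1))
        if not dominated:
            keep.append(i)
    return keep
```

To apply `pareto_front` to the three objectives, each candidate gene set would be scored as (AUC, Kuncheva index, negative MMD), so that lower discrepancy counts as better under the "higher is better" convention used here.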

Submission Number: 74
