
\section{Discussion and Conclusion}
Our study demonstrates that SRENet achieves superior inter- and intra-subject performance for unsupervised segmentation compared with a conventional CNN and the state-of-the-art rotation-equivariant baseline model. Although completely unsupervised, SRENet shows great potential to align closely with pathologist segmentations (Fig.~\ref{fig:path}), highlighting the importance of equivariant biomarkers in the analysis of histopathology images, which intrinsically lack a meaningful orientation.
Our method holds promise for identifying unsupervised equivariant biomarkers and has the potential to generalize effectively to other histopathology datasets. 

The intra-subject data suggest that our method could be a valuable tool for longitudinal tracking, which is particularly relevant for prostate cancer patients undergoing active surveillance, for whom prostate biopsies are collected regularly. Consistent and equivariant biomarkers could be extracted from each of a patient's biopsies to quantitatively evaluate disease evolution or progression.
Furthermore, SRENet could enhance pathological analysis by offering consistent unsupervised segmentation, especially in light of the current variability among raters. This is evident in the TMA dataset used, where the agreement among expert pathologists varies substantially (Cohen's kappa 0.38 to 0.70)~\cite{Karimi2020_TMAdata_6path}. 
Additionally, SRENet's capability extends to other imaging modalities, indicating its versatility and broad applicability in pathology and beyond. 
One limitation of this study is the reliance on pre-training using only NCT-CRC images. Domain shift between the NCT-CRC colon and TMA prostate datasets may impact model performance. 

Other model architectures with potential for rotation equivariance are capsule-based~\cite{Sabour_capsule} and vision transformer (ViT)~\cite{Dosovitskiy2020-lr} networks. Capsule-based networks achieve rotation equivariance through pose encoding and routing-by-agreement but suffer from high computational demands and optimization challenges due to dynamic routing~\cite{Peer_capsule,Mitterreiter_capsule}. 
As shown in Appendix Table~\ref{tab:nct_comparison}, the standard ViT's classification performance on the NCT-CRC dataset drops when the test images are rotated, and the ViT performs worse than ResNet without rotation, likely due to its lack of inductive bias and its need for large training datasets, which are often scarce in medical imaging.
Replacing the linear projections in ViT with convolutional layers for feature extraction (as in Swin Transformer and Hybrid ViT) does not preserve rotation equivariance either. Furthermore, ViT's positional encoding mechanism disrupts rotation equivariance by encoding positions relative to a fixed frame; even relative positional encoding maintains translational, not rotational, relationships~\cite{Chu_ViT}. 
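
The positional-encoding argument can be sketched as follows; the symbols $x_i$, $E$, $p_i$, $\rho_g$, and $\pi_g$ are illustrative notation introduced only for this sketch and are not quantities defined elsewhere in this work. A map $f$ is rotation equivariant if $f(\rho_g x) = \rho'_g f(x)$ for every rotation $g$, where $\rho_g$ and $\rho'_g$ denote the action of $g$ on inputs and on features, respectively. In a standard ViT with absolute positional encoding, the patch $x_i$ at grid position $i$ is embedded as
\[
  z_i = E x_i + p_i ,
\]
with linear projection $E$ and positional embedding $p_i$. Assuming a square patch grid and a rotation by a multiple of $90^\circ$, the patch at position $i$ moves to position $\pi_g(i)$ and its content is rotated, so the token at that position becomes
\[
  z'_{\pi_g(i)} = E \rho_g x_i + p_{\pi_g(i)} .
\]
The rotated tokens reduce to a permutation of the original tokens if $E \rho_g x_i = E x_i$ and $p_{\pi_g(i)} = p_i$ for all $i$. Neither condition is enforced for a generic learned $E$ and $p_i$, so the permutation equivariance of self-attention alone does not yield rotation equivariance.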

Future work will involve training on prostate-specific datasets or pre-training on a diverse sample of diseases, similar to histopathologic foundation models, and comparing model performance with that of a CNN trained with data augmentation. Moreover, evaluating SRENet with different feature resolutions at various layer depths of the encoder could further enhance its performance and applicability.
