\begin{abstract}
Transformer models degrade significantly when exposed to noisy inputs, yet the mechanisms underlying this vulnerability remain poorly understood. We present a systematic layer-wise analysis of noise robustness across five encoder-only transformer architectures (BERT, RoBERTa, ALBERT, DistilBERT, ELECTRA) using 300,000 perturbed samples. The analysis identifies critical transitions at layers 3 and 8, which mark the boundaries between three distinct processing phases: surface feature extraction (layers 0--3), syntactic processing (layers 3--8), and semantic encoding (layers 8--12). RoBERTa retains 98.8\% of its performance under noise conditions where ELECTRA retains only 52.7\%. Character-level perturbations show an 85\% recovery rate, compared with 22\% for syntactic disruptions. Cross-model analysis reveals a 61.1\% correlation in vulnerability patterns, suggesting architectural properties shared across the models studied. Based on these findings, we propose strategic layer dropout at the identified transition points, which yields a theoretical speedup of $3.1\times$ while retaining 95\% of performance. Our analysis is limited to encoder architectures and English text; runtime measurements and decoder architectures remain important areas for future investigation. Within these limits, the results offer insights for developing robust NLP systems and suggest directions for phase-aware architectural designs.
\end{abstract}