Abstract: Recent work has shown that transformer-based language models learn rich geometric structure in their embedding spaces, yet higher-level cognitive organization within these representations remains underexplored. We investigate whether sentence embeddings encode a graded, hierarchical structure aligned with human-interpretable cognitive and psychological attributes. We construct a dataset of 480 natural-language sentences annotated with both continuous energy scores (ranging from -5 to 5) and discrete tier labels spanning seven ordered consciousness-related cognitive categories. Using fixed sentence embeddings from multiple transformer models, we evaluate how recoverable these annotations are via linear and shallow nonlinear probes. Across models, both the continuous energy scores and the tier labels are reliably decodable, with nonlinear probes outperforming their linear counterparts. To assess statistical significance, we conduct nonparametric permutation tests that randomize labels while preserving embedding geometry, and find that observed probe performance significantly exceeds chance under both regression and classification null hypotheses (p < 0.005). Qualitative analyses using UMAP visualizations and tier-level confusion matrices are consistent with these findings, showing a coherent low-to-high gradient and predominantly local (adjacent-tier) confusions in embedding space. Taken together, these results provide evidence that transformer embedding spaces exhibit a hierarchical geometric organization statistically aligned with our human-defined cognitive structure. While this work does not claim internal awareness or phenomenology, it demonstrates a systematic alignment between learned representation geometry and interpretable cognitive and psychological attributes, with potential implications for representation analysis, safety modeling, and geometry-based generation steering.
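The abstract's permutation-test design (shuffling labels while keeping the embedding geometry fixed) can be sketched as follows. This is a minimal illustration, not the paper's code: the embeddings and energy scores below are synthetic stand-ins, the ridge probe and 5-fold cross-validated R^2 are assumed choices, and the permutation count is arbitrary.

```python
# Hedged sketch of a label-permutation test for a linear probe.
# All data are synthetic; only the test logic mirrors the described setup.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n, d = 480, 64  # 480 sentences, hypothetical embedding dimension
X = rng.normal(size=(n, d))
# Synthetic energy scores correlated with one embedding direction,
# so the observed probe should beat the permutation null in this toy case.
y = 2.0 * X[:, 0] + rng.normal(scale=0.5, size=n)

def probe_score(X, y):
    """Mean cross-validated R^2 of a linear (ridge) probe."""
    return cross_val_score(Ridge(alpha=1.0), X, y, cv=5, scoring="r2").mean()

observed = probe_score(X, y)

# Null distribution: permute labels, leaving embedding geometry untouched.
n_perm = 200
null_scores = np.array(
    [probe_score(X, rng.permutation(y)) for _ in range(n_perm)]
)

# One-sided permutation p-value with the standard add-one correction.
p_value = (1 + (null_scores >= observed).sum()) / (n_perm + 1)
print(f"observed R^2 = {observed:.3f}, permutation p = {p_value:.4f}")
```

The same scheme carries over to the classification null by swapping the ridge probe for a classifier and R^2 for accuracy; the key property is that only the labels are randomized, so any geometric structure in the embeddings is preserved under the null.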
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Andrew_Kyle_Lampinen1
Submission Number: 6883