Keywords: Large Language Models, semantic anchoring, in-context learning, thresholds, geometry
TL;DR: UCCT says LLMs are pattern repositories that gain meaning only via external anchors.
Abstract: We propose *semantic anchoring*, a unified account of how LLMs turn pretrained capacity into goal-directed behavior: external structure binds latent patterns to a target. Unified Contextual Control Theory (UCCT) formalizes this with an anchoring strength $S=\rho_d-d_r-\log k$, where $\rho_d$ is within-target cohesion, $d_r$ is prior–target mismatch, and $k$ is the anchor budget. UCCT anticipates threshold-like flips and *strictly generalizes* in-context learning, while reading retrieval-augmented generation and light fine-tuning as anchoring variants. Evidence comes from three studies: E1 shows cross-domain anchoring in text and vision, where coherent anchors rebind strong priors; E2 varies representational familiarity via numeral bases at fixed computational complexity and observes ordered few-shot thresholds, transition widths, and transfer trade-offs that track $\rho_d/d_r$ and $S$; E3 analyzes layer-wise geometry and finds a geometry-to-behavior *correlate*, with peak anchoring and normalized area summarizing trajectories and correlating with internal shot midpoints. UCCT offers a testable lens and practical proxies for prompts, retrieval, and light tuning, while motivating targeted causal probes.
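The anchoring-strength formula from the abstract can be sketched numerically. This is a minimal illustration, not code from the paper; the function name and the interpretation of $k$ as a count of few-shot anchors are assumptions for the sketch.

```python
import math

def anchoring_strength(rho_d: float, d_r: float, k: int) -> float:
    """Anchoring strength S = rho_d - d_r - log(k), per UCCT.

    rho_d: within-target cohesion
    d_r:   prior-target mismatch
    k:     anchor budget (assumed here to be a count of anchors,
           e.g. few-shot examples; must be >= 1)
    """
    return rho_d - d_r - math.log(k)

# With a single anchor (k=1), log k = 0, so S reduces to
# cohesion minus mismatch; growing k adds a logarithmic penalty.
s = anchoring_strength(rho_d=2.0, d_r=0.5, k=1)  # S = 1.5
```

Under this reading, S crosses a threshold when cohesion sufficiently exceeds mismatch relative to the log-scale cost of the anchor budget, consistent with the threshold-like flips the abstract describes.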
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Submission Number: 2730