Investigating Linguistic Steering: An Analysis of Adjectival Effects Across Large Language Model Architectures
Abstract: Achieving reliable control of Large Language Models (LLMs) requires a precise, scalable understanding of how they interpret linguistic cues. We introduce a rigorous framework using Shapley values to quantify the steering effect of individual adjectives on model performance, moving beyond anecdotal heuristics to principled attribution. Applying this method to 100 adjectives across a diverse suite of models (including o3, gpt-4o-mini, phi-3, llama-3-70b, and deepseek-r1) on the MMLU benchmark, we uncover several critical findings for AI alignment. First, we find that a small subset of adjectives act as disproportionately powerful "levers," yet their effects are not universal. Cross-model analysis reveals a "family effect": models of a shared lineage exhibit correlated sensitivity profiles, while architecturally distinct models react in a largely uncorrelated manner, challenging the notion of a one-size-fits-all prompting strategy. Second, focused follow-up studies demonstrate that the steering direction of these powerful adjectives is not intrinsic but is highly contingent on their syntactic role and position within the prompt. For larger models like gpt-4o-mini, we provide the first quantitative evidence of strong, non-additive interaction effects where adjectives can synergistically amplify, antagonistically dampen, or even reverse each other's impact. In contrast, smaller models like phi-3 exhibit a more literal and less compositional response. These results suggest that as models scale, their interpretation of prompts becomes more sophisticated but also less predictable, posing a significant challenge for robustly steering model behavior and highlighting the need for compositional and model-specific alignment techniques.
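The per-adjective attribution described in the abstract can be illustrated with a minimal permutation-sampling Shapley estimator. This is a sketch, not the paper's released code: `toy_accuracy` and the `EFFECT` table are hypothetical stand-ins for an actual benchmark evaluation call (e.g. MMLU accuracy under a prompt containing the given adjectives), and the paper's 200-coalition KernelSHAP approximation differs in its sampling scheme.

```python
import random

def shapley_values(players, value_fn, n_permutations=200, seed=0):
    """Monte Carlo Shapley estimate: average each player's marginal
    contribution to value_fn over random orderings (permutation sampling)."""
    rng = random.Random(seed)
    phi = {p: 0.0 for p in players}
    for _ in range(n_permutations):
        order = players[:]
        rng.shuffle(order)
        coalition = []
        prev = value_fn(coalition)
        for p in order:
            coalition.append(p)
            cur = value_fn(coalition)
            phi[p] += cur - prev  # marginal contribution of p in this ordering
            prev = cur
    return {p: v / n_permutations for p, v in phi.items()}

# Hypothetical additive stand-in for "benchmark accuracy given these
# adjectives in the prompt"; real runs would call the model instead.
BASE = 0.5
EFFECT = {"precise": 0.04, "careful": 0.02, "brief": -0.01}

def toy_accuracy(coalition):
    return BASE + sum(EFFECT[a] for a in coalition)

phi = shapley_values(list(EFFECT), toy_accuracy, n_permutations=100)
```

Because the toy value function is additive, the estimates recover each adjective's individual effect exactly; with a real model the interaction effects discussed in the abstract would make the Shapley values diverge from single-adjective deltas.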
Submission Type: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We thank the action editor and reviewers for the constructive feedback. This revision addresses all three requirements
from the decision:
1. New benchmark beyond MMLU (Reviewer 7G8b, Reviewer iPTK). We conducted a pilot study on the ARC-Challenge benchmark
(Clark et al., 2018) using phi-3 and llama-3-70b-instruct, two models from the main experiment, with the same
Shapley framework. Results are reported in a new Section 4.5 ("Cross-Benchmark Validation") and Table 2. The long-tail
distribution of adjective impact generalizes across benchmarks (Gini coefficients of 0.089 vs. 0.084 for phi-3; 0.127
vs. 0.390 for llama-3-70b), while the specific adjective rankings are entirely benchmark-specific (Spearman rho =
0.002 for phi-3, rho = 0.019 for llama-3-70b). An additional run with llama-3-8b-instruct reveals a faint
within-family lineage signal on ARC (rho = 0.182, p = 0.069 between 8b and 70b) that is absent across families,
extending the lineage effect finding to cross-scale comparisons.
2. Quantitative persona effect and precision–cost tradeoff (Reviewer DckK, Reviewer iPTK). Section 4.3.1 now includes
Table 1, reporting mean Shapley values under three prompt templates (original, suffix, persona) for all five models'
top-5 adjectives. The persona template reverses the sign of the effect in 9 out of 25 adjective–model combinations. A
new Section 5.2 ("Computational Cost and Practical Feasibility") provides a concrete accounting of the ~1.4 million
inference calls and discusses the precision–cost tradeoff of the 200-coalition KernelSHAP approximation.
3. Causal language and adjective selection (Reviewer DckK, Reviewer 7G8b). We audited the manuscript and softened
causal language throughout (abstract, introduction, results, discussion, conclusion), replacing terms like "steer" and
"steering effect" with associational framing such as "observed influence" and "measured effect." An explicit
correlational disclaimer has been added to the Limitations section. Section 3.3.1 now explains the adjective selection
methodology.
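The summary statistics cited in item 1 (Gini coefficient of adjective-impact magnitudes, Spearman rank correlation between benchmark rankings) can be computed with a small dependency-free sketch. These are illustrative helpers only; the paper's released code may compute them differently (the Spearman helper below assumes no tied ranks).

```python
import math

def gini(values):
    """Gini coefficient of a list of non-negative impact magnitudes
    (0 = perfectly even impact, higher = more long-tailed)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    cum = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * cum) / (n * total) - (n + 1) / n

def spearman_rho(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks
    (no ties assumed)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = math.sqrt(sum((x - ma) ** 2 for x in ra))
    vb = math.sqrt(sum((x - mb) ** 2 for x in rb))
    return cov / (va * vb)
```

A near-zero rho between two benchmarks' adjective rankings, as reported for phi-3 and llama-3-70b, indicates benchmark-specific orderings even when the Gini coefficients show a similar long-tailed impact distribution.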
All code is publicly available at https://github.com/lmlearning/linguisticsteering.
Supplementary Material: zip
Assigned Action Editor: ~Huazheng_Wang1
Submission Number: 6670