Uncovering Latent Chain of Thought Vectors in Large Language Models
Track: tiny / short paper (up to 4 pages)
Keywords: Steering Vectors, Activation Engineering, Chain of Thought Reasoning, Interpretability
TL;DR: Using layer activations from Llama3 and Mistral, we derive steering vectors that can be injected into a language model to steer it toward Chain of Thought reasoning without natural language prompting.
Submission Number: 31
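
The TL;DR describes deriving a steering vector from layer activations and injecting it at inference time. The listing does not say how the vector is computed, so the following is only a minimal sketch of one common activation-engineering recipe: the mean difference between activations collected with and without Chain of Thought behaviour. The function names (`mean_activation`, `steering_vector`, `inject`), the `scale` parameter, and the toy numbers are all hypothetical and stand in for activations that would really be read from a model layer.

```python
from statistics import fmean


def mean_activation(acts):
    """Column-wise mean of a list of equal-length activation vectors."""
    return [fmean(col) for col in zip(*acts)]


def steering_vector(cot_acts, base_acts):
    """Assumed recipe: mean CoT activation minus mean baseline activation."""
    cot_mean = mean_activation(cot_acts)
    base_mean = mean_activation(base_acts)
    return [c - b for c, b in zip(cot_mean, base_mean)]


def inject(hidden, vector, scale=1.0):
    """Add the scaled steering vector to the hidden state at every token position."""
    return [[h + scale * v for h, v in zip(row, vector)] for row in hidden]


# Toy stand-ins for activations at one layer (2 prompts, hidden size 2).
cot_acts = [[1.0, 2.0], [3.0, 4.0]]
base_acts = [[0.0, 0.0], [0.0, 2.0]]

vec = steering_vector(cot_acts, base_acts)

# Toy hidden states for a 2-token sequence, steered with a hypothetical scale.
steered = inject([[0.0, 0.0], [1.0, 1.0]], vec, scale=2.0)
```

In a real setting the same addition would be applied to the residual stream of a chosen layer during the forward pass, and both the layer and the scale would need to be tuned empirically.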