Style Vectors for Steering Generative Large Language Models

Anonymous

16 Oct 2023 | ACL ARR 2023 October Blind Submission | Readers: Everyone

Abstract: This research explores strategies for steering the output of large language models (LLMs) towards specific styles, such as sentiment, emotion, or writing style, by adding style vectors to the activations of hidden layers during text generation. We show that, in contrast to more complex training-based approaches, style vectors can be computed simply from recorded layer activations for input texts in a specific style. Through a series of experiments, we demonstrate the effectiveness of activation engineering with such style vectors in influencing the style of generated text in a nuanced and parameterisable way, which distinguishes it from prompt engineering. This research constitutes a significant step towards the development of more adaptive and affective AI-empowered interactive systems.
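
Illustrative sketch (not taken from the paper): the abstract says only that style vectors are computed from recorded layer activations and added to hidden-layer activations during generation. The snippet below shows one plausible reading on toy data. The mean-difference definition of the vector, the helper names (mean_vector, style_vector, steer), the strength parameter, and all numbers are assumptions made for illustration.

```python
from statistics import fmean

def mean_vector(rows):
    """Element-wise mean of equal-length activation vectors."""
    return [fmean(col) for col in zip(*rows)]

def style_vector(style_acts, other_acts):
    """Assumed definition: mean activation for the target style
    minus mean activation for the remaining texts."""
    mu_style = mean_vector(style_acts)
    mu_other = mean_vector(other_acts)
    return [s - o for s, o in zip(mu_style, mu_other)]

def steer(hidden, vec, strength):
    """Add the scaled style vector to one hidden-layer activation."""
    return [h + strength * v for h, v in zip(hidden, vec)]

# Toy "recorded" activations (hypothetical values, 3-dimensional hidden state).
positive_acts = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
negative_acts = [[0.0, 1.0, 2.0], [0.0, 3.0, 2.0]]

v_positive = style_vector(positive_acts, negative_acts)
steered = steer([0.5, 0.5, 0.5], v_positive, strength=0.5)
```

The strength scalar stands in for the "parameterisable" aspect mentioned in the abstract: at 0 the activation is unchanged, and larger values push it further along the style direction. In a real model the addition would be applied to the chosen layer's hidden states at each generation step.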

Paper Type: long

Research Area: Dialogue and Interactive Systems

Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models, Theory

Languages Studied: English

Consent To Share Submission Details: On behalf of all authors, we agree to the terms above to share our submission details.
