AutoCustomization: A Unified Framework for Effortless, Selective LLM Bias and Style Finetuning

Jaroslaw Kochanowicz; Mateusz Olko; Gracjan Góral; Konrad Szewczyk; Krzysztof Dziedzic; Piotr Miłoś

27 Sept 2024 (modified: 05 Feb 2025). Submitted to ICLR 2025. License: CC BY 4.0

Keywords: large language models, model customization

TL;DR: We develop AutoCustomization, a method for selectively customizing models to align with specific expectations, such as tone, formality, or underlying biases.

Abstract: Large language models are transforming the landscape of applications, and their influence is poised to expand. One important practical challenge is how to selectively customize models to align with specific expectations, such as tone, formality, or underlying biases. To address this challenge, we develop AutoCustomization. The key to our approach is leveraging the vast knowledge encoded in modern language models to construct fine-tuning datasets focused on a specific customization axis, in contrast to prior methods, which depend primarily on tediously constructed libraries of prompts. AutoCustomization demonstrates several desirable properties. It is universally applicable to any bias axis (e.g., political, stylistic). It is efficient, requiring only small, automatically generated datasets and short fine-tuning. It allows for precise monitoring of the resulting bias change with our BiasShift evaluation metric, which is proven to be aligned with human perception. It is generalizable to held-out aspects and selective in preserving other model capabilities. We verify AutoCustomization through human evaluation and show that it outperforms existing prompting techniques while being simpler.

Primary Area: other topics in machine learning (i.e., none of the above)

Submission Number: 10311
