Speechworthy Instruction-tuned Language Models

Anonymous

16 Feb 2024, ACL ARR 2024 February Blind Submission
Readers: Everyone
Abstract: Text and spoken language are two major modes of human communication, each with distinct constraints and preferences. However, current instruction-tuned language models are trained exclusively with textual preference annotations, resulting in models that are not optimized for generating text bound for speech. We collect preference data for speech by asking users to listen to response pairs, revealing nuanced preferences for simple and colloquial responses that are also informative. To optimize for speech-suitability, we compile system prompts that apply guidelines established by audio-based media, and we curate a spoken-preference dataset of 11K voice-suitable input prompts with 20K annotated response pairs. We use these resources to adapt Falcon-Instruct 7B to the speech domain via both prompting and reinforcement learning with human feedback (RLHF). Human and automatic evaluation show that both approaches improve speech-suitability, with RLHF on speech-based preferences in particular producing a model whose responses are preferred 4.8% more on average than those of the base model and its prompted counterpart.
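To make the prompting approach described above concrete, below is a minimal sketch of steering an instruction-tuned model toward speech-suitable output with a system-style preamble, using the Hugging Face transformers pipeline API. The preamble wording and the speechify helper are hypothetical illustrations, not the paper's actual compiled prompts; the only detail taken from the abstract is the base model, Falcon-Instruct 7B.

# Minimal sketch: prompting an instruction-tuned model toward speech-suitable
# responses. Requires the transformers and accelerate packages.
from transformers import pipeline

# Hypothetical speech-suitability preamble (not the paper's exact prompt),
# loosely reflecting the preferences the abstract reports: simple,
# colloquial, and still informative.
SPEECH_SYSTEM_PROMPT = (
    "You are a voice assistant. Answer in short, simple, colloquial "
    "sentences that are easy to follow when heard aloud. Avoid lists, "
    "markup, and jargon."
)

generator = pipeline(
    "text-generation",
    model="tiiuae/falcon-7b-instruct",  # base model adapted in the paper
    device_map="auto",
)

def speechify(user_query: str, max_new_tokens: int = 128) -> str:
    """Prepend the speech preamble and generate a voice-friendly response."""
    prompt = f"{SPEECH_SYSTEM_PROMPT}\nUser: {user_query}\nAssistant:"
    out = generator(prompt, max_new_tokens=max_new_tokens, do_sample=True)
    # The pipeline returns the prompt plus the continuation; keep only the
    # newly generated text.
    return out[0]["generated_text"][len(prompt):].strip()

print(speechify("How do black holes form?"))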
Paper Type: long
Research Area: Dialogue and Interactive Systems
Contribution Types: Data resources
Languages Studied: English