Rethinking Personalized Natural Language Generation with the PersonaSocialNorms Corpus and Ranking Evaluation

Anonymous

16 Oct 2023 | ACL ARR 2023 October Blind Submission | Readers: Everyone
Abstract: Personalized language generation is playing an increasingly significant role in language technologies. Persona-based generation is a personalization approach that conditions generation on descriptive sentences about an individual and has been shown to successfully emulate language characteristic of individuals with these traits. This task is challenging to design, model, and evaluate, so early work in this area imposed constraints to simplify it. We argue that the way forward requires modifying these restrictions in three key areas: (1) realistic conversational data, (2) representative and diverse persona sentences, and (3) modified ranking evaluation. We present the PersonaSocialNorms corpus, an extension of the Social-Chem-101 corpus, which contains Reddit posts about social situations and written judgements from others stating that the actions taken by the original poster are right or wrong. Our corpus contains 95K judgements written by 6K authors, filtered from the Social-Chem-101 corpus. We extend the data with 20-500 persona sentences for each author. Using this more realistic data, we find previous persona consistency metrics inadequate for evaluation. We provide a novel ranking evaluation and implement several architectures inspired by recent work, showing promising results and room for improvement.
Paper Type: long
Research Area: Generation
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data resources, Position papers
Languages Studied: English
Consent To Share Submission Details: On behalf of all authors, we agree to the terms above to share our submission details.