GPT Czech Poet: Generation of Czech Poetic Strophes with Language Models

Anonymous

16 Dec 2023 | ACL ARR 2023 December Blind Submission | Readers: Everyone

Abstract: High-quality automated poetry generation systems are currently available for only a small subset of languages. We introduce a new model for generating poetry in Czech, a heavily inflected Slavic language with rather regular orthography and prosody. We find that appropriate tokenization is crucial: tokenization methods based on syllables or individual characters outperform subword tokenization in generating poetic strophes. We also demonstrate that guiding the generation process by explicitly specifying strophe parameters within the poem text can improve the effectiveness of the model. We further enhance the results by introducing Forced Generation, which adds explicit specifications of meter and verse parameters at inference time based on the already generated text. We evaluate a range of setups and show that our proposed approach achieves high accuracy on several aspects of the formal quality of the generated poems.

Paper Type: long

Research Area: Generation

Contribution Types: NLP engineering experiment, Approaches to low-resource settings

Languages Studied: Czech
