Introducing Rhetorical Parallelism Detection: A New Task with Datasets, Metrics, and Baselines

Published: 07 Oct 2023, Last Modified: 01 Dec 2023, EMNLP 2023 Main
Submission Type: Regular Long Paper
Submission Track: Sentiment Analysis, Stylistic Analysis, and Argument Mining
Keywords: rhetorical parallelism, sequence labeling, NLP, Latin, Chinese, resource
TL;DR: This paper gives a complete introduction to the task of rhetorical parallelism detection, including two datasets, four metrics, and many modeling baselines.
Abstract: Rhetoric, both spoken and written, involves not only content but also style. One common stylistic tool is $\textit{parallelism}$: the juxtaposition of phrases which have the same sequence of linguistic ($\textit{e.g.}$, phonological, syntactic, semantic) features. Despite the ubiquity of parallelism, the field of natural language processing has seldom investigated it, missing a chance to better understand the nature of the structure, meaning, and intent that humans convey. To address this, we introduce the task of $\textit{rhetorical parallelism detection}$. We construct a formal definition of it; we provide one new Latin dataset and one adapted Chinese dataset for it; we establish a family of metrics to evaluate performance on it; and, lastly, we create baseline systems and novel sequence labeling schemes to capture it. On our strictest metric, we attain F$_1$ scores of $0.40$ and $0.43$ on our Latin and Chinese datasets, respectively.
Submission Number: 4901