Abstract: In this paper, we report on experiments exploring the relation between
document-level and sentence-level readability
assessment for French. These were run on
an open-source, purpose-built corpus, which was
automatically created by aggregating various
sources from children’s literature. In addition to
providing the research community with a freely
available corpus, we report sentence readability scores obtained with both classical approaches (i.e., readability formulas) and
state-of-the-art deep learning techniques (e.g.,
fine-tuning of large language models). Results
show a relatively strong correlation between
document-level and sentence-level readability,
suggesting ways to reduce the cost of building
annotated sentence-level readability datasets.