ALiiCE: Evaluating Positional Fine-grained Citation Generation

ACL ARR 2024 August Submission214 Authors

15 Aug 2024 (modified: 26 Sept 2024), ACL ARR 2024 August Submission, CC BY 4.0

Abstract: Large language models (LLMs) can enhance their credibility and verifiability by generating text with citations. However, existing research on citation generation is predominantly limited to sentence-level statements, neglecting the significance of positional fine-grained citations, which can appear anywhere within a sentence. To facilitate further exploration of positional fine-grained citation generation, we propose ALiiCE, the first automatic evaluation framework for this task. Our method employs a dependency-tree-based approach to parse each sentence-level claim into atomic claims. ALiiCE then evaluates citation quality using three metrics: positional fine-grained citation recall, positional fine-grained citation precision, and the coefficient of variation of citation positions. We evaluate the positional fine-grained citation generation performance of several LLMs on long-form QA datasets. Our experiments and analyses demonstrate the effectiveness and reasonableness of ALiiCE. We also offer insights into current advancements and future directions for the positional fine-grained citation generation task.
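
The abstract names a coefficient of variation of citation positions but does not define it. As a rough illustration only, the Python sketch below uses the textbook definition (population standard deviation divided by the mean) over relative citation positions within a sentence. The bracketed-number citation format, the character-based position measure, and the function names citation_positions and coefficient_of_variation are assumptions made for this example, not the paper's method.

```python
import re
import statistics

# Assumption: citations appear as bracketed numbers such as [1] or [12].
CITATION = re.compile(r"\[\d+\]")


def citation_positions(sentence: str) -> list[float]:
    """Relative position (0..1) of each citation marker, measured in
    characters of the sentence with all citation markers removed."""
    stripped_len = len(CITATION.sub("", sentence))
    if stripped_len == 0:
        return []
    positions = []
    removed = 0  # characters of citation markers seen so far
    for match in CITATION.finditer(sentence):
        positions.append((match.start() - removed) / stripped_len)
        removed += match.end() - match.start()
    return positions


def coefficient_of_variation(values: list[float]) -> float:
    """Population standard deviation divided by the mean.
    Returns 0.0 when fewer than two values exist or the mean is zero."""
    if len(values) < 2:
        return 0.0
    mean = statistics.fmean(values)
    if mean == 0:
        return 0.0
    return statistics.pstdev(values) / mean


if __name__ == "__main__":
    sentence = "Cats purr [1] and dogs bark [2]."
    positions = citation_positions(sentence)
    print(positions, coefficient_of_variation(positions))
```

Under this definition, a sentence whose citations all sit at the end yields a value near zero, while citations spread through the sentence yield a larger value.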

Paper Type: Long

Research Area: Resources and Evaluation

Research Area Keywords: evaluation methodologies, metrics, automatic evaluation

Contribution Types: NLP engineering experiment

Languages Studied: English

Submission Number: 214
