Rate, Explain and Cite (REC): Enhanced Explanation and Attribution in Automatic Evaluation by Large Language Models
Abstract: LLMs have demonstrated impressive proficiency in generating coherent and high-quality text, making them valuable across a range of text-generation tasks. However, rigorous evaluation of this generated content is crucial, as ensuring its quality remains a significant challenge due to persistent issues such as factual inaccuracies and hallucination. This paper introduces three fine-tuned general-purpose LLM auto-evaluators, REC-8B, REC-12B, and REC-70B, specifically designed to evaluate generated text across several dimensions: faithfulness, instruction following, coherence, and completeness. These models not only provide ratings for these metrics but also offer detailed explanations and verifiable citations, thereby enhancing trust in the content. Moreover, the models support various citation modes, accommodating different requirements for latency and granularity. Extensive evaluations on diverse benchmarks demonstrate that our general-purpose LLM auto-evaluator, REC-70B, outperforms state-of-the-art LLMs, excelling in content evaluation by delivering better-quality explanations and citations with minimal bias. Our REC dataset and models are available at https://github.com/adelaidehsu/REC.