Enhancing Chinese Essay Discourse Logic Evaluation Through Optimized Fine-Tuning of Large Language Models

Published: 01 Jan 2024, Last Modified: 19 May 2025, Venue: NLPCC (5) 2024, License: CC BY-SA 4.0
Abstract: Automated essay evaluation systems face significant challenges because writing is highly complex and diverse. Large language models (LLMs), which represent the current state of the art in semantic understanding, hold considerable potential for advancing these systems. In NLPCC 2024 Shared Task 4, Chinese Essay Discourse Logic Evaluation and Integration, we investigated how to improve LLMs’ ability to evaluate essay logic, coherence, and quality. To suit the characteristics of the different tasks, we adopted machine reading comprehension (MRC)-style instructions to optimize output formats and applied undersampling to address data imbalance. To improve efficiency and model performance, we explored LLM fine-tuning methods that decouple tasks, and we applied similarity comparison to refine model outputs. We also used noisy embedding fine-tuning to mitigate overfitting. Our approach ranked first in NLPCC 2024 Shared Task 4.
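The noisy embedding fine-tuning mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the noise distribution or scale, so this example assumes a NEFTune-style scheme: uniform noise in [-1, 1] scaled by alpha / sqrt(seq_len * dim), added to the token embeddings during training only. The function name `add_embedding_noise` and the default `alpha` are hypothetical and are not taken from the paper.

```python
import math
import random


def add_embedding_noise(embeddings, alpha=5.0, rng=None):
    """Return a noisy copy of a (seq_len x dim) embedding matrix.

    Assumption (not stated in the abstract): NEFTune-style scaling,
    i.e. uniform noise in [-1, 1] multiplied by alpha / sqrt(seq_len * dim).
    """
    rng = rng or random.Random()
    seq_len = len(embeddings)
    dim = len(embeddings[0])
    scale = alpha / math.sqrt(seq_len * dim)
    # Build a new matrix so the original embeddings are left untouched.
    return [
        [value + rng.uniform(-1.0, 1.0) * scale for value in row]
        for row in embeddings
    ]


if __name__ == "__main__":
    # Toy input: 4 tokens with 4-dimensional zero embeddings.
    toy = [[0.0] * 4 for _ in range(4)]
    noisy = add_embedding_noise(toy, alpha=5.0, rng=random.Random(0))
    print(noisy[0])
```

In an actual fine-tuning run the noise would be applied to the embedding layer's output on each training step and disabled at inference; the pure-Python lists here only stand in for tensors.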