Keywords: HSK Text Grading; Deep Hybrid Neural Network; Multi-scale Feature Extraction; Bidirectional LSTM; Multi-head Self-Attention Mechanism
Abstract: This paper proposes a deep hybrid neural network model for HSK text grading, aiming to accurately assess the difficulty levels of Chinese texts. The model adopts a hierarchical processing architecture, including a character embedding layer, a multi-scale feature extraction module, a bidirectional LSTM sequence encoding module, a multi-head self-attention mechanism module, and a classification output module. By organically integrating multi-scale convolutional feature extraction, sequence modeling, and attention mechanisms, the model effectively captures complex linguistic characteristics for proficiency assessment. We construct a dataset of 4,474 texts based on the latest HSK standard textbooks and sample exams, aligned with the International Chinese Education Chinese Proficiency Level Standard (HSK 9), using a combined "HSK 9 + sentence length" and "maximum distribution density segmentation" method. Experiments show that the model achieves an accuracy of 94.17% and a weighted F1 score of 94.17%, significantly outperforming baseline models including BERT (92.32%), MLF-BERT (91.05%), and KGMNN (80.36%). Ablation studies confirm the contributions of each component, with the CNN module contributing 4.85 percentage points to accuracy and the attention mechanism contributing 4.59 percentage points.
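The abstract describes a pipeline of character embedding, multi-scale convolutional feature extraction, BiLSTM sequence encoding, multi-head self-attention, and a classification head. The following PyTorch sketch illustrates how such a hybrid architecture could be wired together; it is not the authors' implementation, and all layer names, dimensions, kernel sizes, and the 9-level output are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HSKGraderSketch(nn.Module):
    """Hypothetical sketch of the described hybrid architecture:
    char embedding -> multi-scale CNN -> BiLSTM -> multi-head
    self-attention -> classification. All sizes are illustrative."""
    def __init__(self, vocab_size=8000, emb_dim=128, hidden=128,
                 num_levels=9, kernel_sizes=(2, 3, 4)):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # Multi-scale feature extraction: parallel 1D convolutions
        # with different kernel sizes over the character sequence.
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, hidden, k, padding=k // 2)
            for k in kernel_sizes
        )
        self.bilstm = nn.LSTM(hidden * len(kernel_sizes), hidden,
                              batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * hidden, num_heads=4,
                                          batch_first=True)
        self.classifier = nn.Linear(2 * hidden, num_levels)

    def forward(self, char_ids):                 # (batch, seq_len)
        x = self.embed(char_ids)                 # (B, L, E)
        c = x.transpose(1, 2)                    # (B, E, L) for Conv1d
        feats = [torch.relu(conv(c)) for conv in self.convs]
        # Even kernels with padding=k//2 lengthen the sequence by one;
        # trim all scales to a common length before concatenating.
        min_len = min(f.size(2) for f in feats)
        feats = torch.cat([f[:, :, :min_len] for f in feats], dim=1)
        seq, _ = self.bilstm(feats.transpose(1, 2))  # (B, L', 2H)
        attn_out, _ = self.attn(seq, seq, seq)       # self-attention
        pooled = attn_out.mean(dim=1)                # mean-pool over time
        return self.classifier(pooled)               # (B, num_levels)

model = HSKGraderSketch()
logits = model(torch.randint(1, 8000, (2, 20)))      # two 20-char texts
```

A forward pass on a batch of two 20-character texts yields a `(2, 9)` logit tensor, one score per HSK level.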
Paper Type: Short
Research Area: NLP Applications
Research Area Keywords: HSK Text Grading; Deep Hybrid Neural Network; Multi-scale Feature Extraction; Bidirectional LSTM; Multi-head Self-Attention Mechanism
Contribution Types: Publicly available software and/or pre-trained models
Languages Studied: Chinese
Submission Number: 4868