Keywords: HSK Text Grading; Deep Hybrid Neural Network; Multi-scale Feature Extraction; Bidirectional LSTM; Multi-head Self-Attention Mechanism
Abstract: This paper proposes a deep hybrid neural network model for HSK text grading, aiming to accurately assess the difficulty levels of Chinese texts. The model adopts a hierarchical processing architecture, including a character embedding layer, a multi-scale feature extraction module, a bidirectional LSTM sequence encoding module, a multi-head self-attention mechanism module, and a classification output module. By organically integrating multi-scale convolutional feature extraction, sequence modeling, and attention mechanisms, the model effectively captures complex linguistic characteristics for proficiency assessment. We construct a dataset of 4,474 texts based on the latest HSK standard textbooks and sample exams, aligned with the International Chinese Education Chinese Proficiency Level Standard (HSK 9), using a combined "HSK 9 + sentence length" and "maximum distribution density segmentation" method. Experiments show that the model achieves an accuracy of 94.17% and a weighted F1 score of 94.17%, significantly outperforming baseline models including BERT (92.32%), MLF-BERT (91.05%), and KGMNN (80.36%). Ablation studies confirm the contributions of each component, with the CNN module contributing 4.85 percentage points to accuracy and the attention mechanism contributing 4.59 percentage points.
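The abstract describes a pipeline of character embedding, multi-scale convolutional feature extraction, BiLSTM sequence encoding, multi-head self-attention, and a classification head. The following PyTorch sketch illustrates how such a hybrid architecture could be wired together; it is not the authors' implementation, and all layer names, dimensions, kernel sizes, and the 9-level output are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HSKGraderSketch(nn.Module):
    """Hypothetical sketch of the described hybrid architecture:
    char embedding -> multi-scale CNN -> BiLSTM -> multi-head
    self-attention -> classification. All sizes are illustrative."""
    def __init__(self, vocab_size=8000, emb_dim=128, hidden=128,
                 num_levels=9, kernel_sizes=(2, 3, 4)):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # Multi-scale feature extraction: parallel 1D convolutions
        # with different kernel sizes over the character sequence.
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, hidden, k, padding=k // 2)
            for k in kernel_sizes
        )
        self.bilstm = nn.LSTM(hidden * len(kernel_sizes), hidden,
                              batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * hidden, num_heads=4,
                                          batch_first=True)
        self.classifier = nn.Linear(2 * hidden, num_levels)

    def forward(self, char_ids):                 # (batch, seq_len)
        x = self.embed(char_ids)                 # (B, L, E)
        c = x.transpose(1, 2)                    # (B, E, L) for Conv1d
        feats = [torch.relu(conv(c)) for conv in self.convs]
        # Even kernels with padding=k//2 lengthen the sequence by one;
        # trim all scales to a common length before concatenating.
        min_len = min(f.size(2) for f in feats)
        feats = torch.cat([f[:, :, :min_len] for f in feats], dim=1)
        seq, _ = self.bilstm(feats.transpose(1, 2))  # (B, L', 2H)
        attn_out, _ = self.attn(seq, seq, seq)       # self-attention
        pooled = attn_out.mean(dim=1)                # mean-pool over time
        return self.classifier(pooled)               # (B, num_levels)

model = HSKGraderSketch()
logits = model(torch.randint(1, 8000, (2, 20)))      # two 20-char texts
```

A forward pass on a batch of two 20-character texts yields a `(2, 9)` logit tensor, one score per HSK level.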
Paper Type: Short
Research Area: NLP Applications
Research Area Keywords: HSK Text Grading; Deep Hybrid Neural Network; Multi-scale Feature Extraction; Bidirectional LSTM; Multi-head Self-Attention Mechanism
Contribution Types: Publicly available software and/or pre-trained models
Languages Studied: Chinese
Submission Number: 4868