Abstract: Scoring handwritten essays in school settings is a time-consuming task. Normalized assessment and prompt feedback enable students to improve the articulation, comprehension, and overall presentation of their ideas. In this work, we present a system that takes images of essay sheets as input and outputs a score for the essay. We use a pipelined approach that combines a handwriting recognition model with automated essay scoring. Current handwriting recognition systems achieve excellent transcription performance on existing public-domain datasets, but these datasets are primarily captured under constrained conditions. How well these models perform on unconstrained data is crucial for text understanding. We therefore adapt an existing full-page handwriting recognition model to an unconstrained handwritten essay dataset. The full-page handwriting recognition model is a deep learning model based on CNN and LSTM layers, with explicit modules for start-of-line detection, line normalization, and text line recognition. The unconstrained dataset comes from a national essay competition in which students scan and upload their essays. The dataset is captured in the wild: variation in background, margins, text fonts, and scanning device makes it challenging both visually and algorithmically. We have curated a subset of this dataset for all experiments in this work and intend to make it publicly available. We further analyze performance on the downstream task of essay scoring using a set of classical handcrafted features and transformer-based contextual embeddings. We formulate essay scoring as a regression task, in which the pre-trained embeddings or handcrafted features computed for each essay serve as its representation in the essay scoring model.
Our results show only a slight degradation in essay scoring performance due to transcription errors from the handwriting recognition module. We also analyze rubric-level scores together with the handcrafted features to identify a subset of features that directly affect the rubric-level scores of an essay.
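
The regression formulation of essay scoring can be illustrated with a minimal sketch. It is not the system described in this work: the four features (word count, mean word length, sentence count, type-token ratio), the ridge regressor, and all function names are assumptions chosen for illustration, and the toy essays and scores in the demo are invented. The input text stands in for the transcription produced by a handwriting recognition stage.

```python
import re


def handcrafted_features(text):
    """Map one essay transcription to a small, illustrative feature vector.

    The feature set is an assumption for this sketch, not the paper's set.
    """
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = len(words)
    avg_word_len = sum(len(w) for w in words) / n_words if n_words else 0.0
    type_token_ratio = len(set(words)) / n_words if n_words else 0.0
    # The leading 1.0 is a bias term.
    return [1.0, float(n_words), avg_word_len, float(len(sentences)),
            type_token_ratio]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge regression: solve (X^T X + lam I) w = X^T y.

    For brevity the bias weight is regularized as well.
    """
    d = len(X[0])
    gram = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    rhs = [sum(row[i] * target for row, target in zip(X, y)) for i in range(d)]
    return solve(gram, rhs)


def predict(w, x):
    """Predicted score is the dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))


if __name__ == "__main__":
    # Toy transcriptions and scores, invented purely for illustration.
    essays = [
        "Trees give shade. Trees clean the air.",
        "I like trees.",
        "Forests protect soil, store water, and shelter animals. "
        "We must protect them!",
    ]
    scores = [6.0, 3.0, 9.0]
    X = [handcrafted_features(e) for e in essays]
    w = fit_ridge(X, scores, lam=1.0)
    print(round(predict(w, handcrafted_features("Trees are useful.")), 2))
```

Swapping `handcrafted_features` for a function that returns a pre-trained contextual embedding of the essay would give the embedding-based variant of the same regression setup. Running the scorer on both ground-truth text and recognizer output would then expose the effect of transcription errors on the predicted score.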