Archaeology at MLSP 2024: Machine Translation for Lexical Complexity Prediction and Lexical Simplification

Published: 01 Jan 2024, Last Modified: 18 May 2025. Venue: BEA 2024. License: CC BY-SA 4.0
Abstract: We present the submissions of team Archaeology for the Lexical Simplification (LS) and Lexical Complexity Prediction (LCP) Shared Tasks at BEA 2024. Our approach consists of creating two pipelines for generating lexical substitutions and estimating complexity: one using texts machine-translated into English and one using the original language. For the LCP subtask, our XGBoost regressor is trained with engineered features (based primarily on English language resources) and shallow word structure features. For the LS subtask, we use a locally executed quantized LLM to generate candidates and sort them by the complexity score computed using the pipeline designed for LCP. These pipelines provide distinct perspectives on the lexical simplification process, offering insights into the efficacy and limitations of employing machine translation versus direct processing of the original language data.
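The LS step described in the abstract (generate candidates, then sort them by a predicted complexity score) can be illustrated with a minimal sketch. Everything named here is a hypothetical stand-in and not the authors' implementation: `shallow_features` is an illustrative feature set, `toy_complexity` replaces the trained regressor with a simple length heuristic, and the candidate list stands in for LLM output.

```python
from typing import Callable, Dict, Iterable, List

VOWELS = set("aeiouy")


def shallow_features(word: str) -> Dict[str, int]:
    """Hypothetical shallow word-structure features (not the paper's feature set)."""
    w = word.lower()
    return {
        "length": len(w),
        "n_vowels": sum(ch in VOWELS for ch in w),
        "n_tokens": len(w.split()),
    }


def toy_complexity(word: str) -> float:
    """Stand-in for a trained regressor: longer words score as more complex."""
    features = shallow_features(word)
    return min(1.0, features["length"] / 20.0)


def rank_candidates(
    candidates: Iterable[str],
    score: Callable[[str], float] = toy_complexity,
) -> List[str]:
    """Deduplicate candidates and sort them from least to most complex."""
    cleaned = (c.strip().lower() for c in candidates)
    # dict.fromkeys keeps first-seen order while dropping duplicates
    unique = list(dict.fromkeys(c for c in cleaned if c))
    return sorted(unique, key=score)


if __name__ == "__main__":
    # Candidates that a generator might propose for the target word "utilize"
    print(rank_candidates(["Utilize", "use", "employ", "use"]))
```

In a real pipeline, the `score` argument would wrap the prediction call of a trained complexity model, so the ranking logic stays unchanged when the scorer is swapped.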