KPC-cF: Korean Aspect-Based Sentiment Analysis via Pseudo-Classifier with Corpus Filtering for Low Resource Society

30 Jan 2024 (modified: 07 Feb 2024) | AAAI 2024 Workshop ASEA Submission | Readers: Everyone | License: CC BY 4.0
Keywords: Korean aspect-based sentiment analysis, Pseudo-Classifier with Corpus filtering, Addressing language gap issue for Low Resource Society (Transfer from high-resource languages to low-resource languages)
TL;DR: We addressed the language gap issue in ABSA by building a pseudo-classifier. This involved fine-tuning an NLI model with translated data, performing LaBSE scoring on Korean NLI pairs, and further fine-tuning with optimal pseudo-labels.
Abstract: Aspect-Based Sentiment Analysis (ABSA) for Korean restaurant reviews remains largely unexplored in the existing literature. We propose an intuitive and effective framework for ABSA in low-resource languages such as Korean. It optimizes prediction labels by combining translated benchmark data with unlabeled Korean data. Using a model fine-tuned on the translated data, we pseudo-labeled the actual Korean NLI set. We then applied LaBSE- and MSP-based filtering to this pseudo-labeled NLI set and used the filtered data for additional training, which improved the model's performance. With this dual filtering, the model bridged the gap between datasets and achieved positive results in Korean ABSA with minimal resources. Through additional data filtering and injection pipelines, our approach aims to provide a framework for data and model construction that is cost-effective in terms of human intervention and training resources, for communities, whether corporate or individual, in low-resource language countries. Compared to English ABSA, our framework showed a difference of approximately 3% in F1 score and accuracy. We will make the model and data for Korean ABSA publicly available at https://github.com/namkibeom/KPC-cF.
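The dual filtering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that MSP means maximum softmax probability, that each pseudo-labeled example comes with two precomputed sentence embeddings (for instance from LaBSE) whose cosine similarity serves as the LaBSE score, and that the classifier's raw logits are available. The names `PseudoExample` and `dual_filter` and both threshold values are hypothetical. The abstract does not state which texts are compared or which cut-offs are used.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class PseudoExample:
    # Hypothetical container; the embeddings are assumed to be precomputed
    # sentence embeddings (e.g. from LaBSE) for the two texts being compared.
    emb_a: Sequence[float]
    emb_b: Sequence[float]
    logits: Sequence[float]  # raw classifier scores, one per label


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def msp(logits: Sequence[float]) -> float:
    """Maximum softmax probability, used here as a confidence score."""
    return max(softmax(logits))


def dual_filter(
    examples: Sequence[PseudoExample],
    sim_threshold: float = 0.8,   # assumed value, not from the paper
    msp_threshold: float = 0.9,   # assumed value, not from the paper
) -> List[Tuple[PseudoExample, int]]:
    """Keep examples that pass both filters; return (example, pseudo_label)."""
    kept = []
    for ex in examples:
        # Filter 1: embedding similarity between the two texts.
        if cosine(ex.emb_a, ex.emb_b) < sim_threshold:
            continue
        # Filter 2: classifier confidence on its own prediction.
        probs = softmax(ex.logits)
        confidence = max(probs)
        if confidence < msp_threshold:
            continue
        kept.append((ex, probs.index(confidence)))
    return kept
```

Under these assumptions, the surviving `(example, pseudo_label)` pairs would form the filtered pseudo-labeled set used for the additional round of fine-tuning. In practice the thresholds would need tuning on held-out data.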
Submission Number: 3