Abstract: Manual annotation of qualitative research data is costly and time-consuming. Machine learning approaches have recently been introduced to assist with such tasks, but it remains challenging for a model to account for context, data scarcity, data imbalance, and other aspects of thematic analysis. We propose an annotation assistance model that uses transfer learning to combine the pre-trained models BERT and ResNet, and we evaluate its accuracy and efficiency for semi-automatic annotation. Our experiments use a dataset of focus group discussions between researchers and participants about perceptions of robots in public spaces. We tested various training methods, including few-shot learning, data augmentation, and the use of different data modalities, to evaluate how dataset size, data balance, and data modality affect the model's performance. The best-performing model achieved an average balanced accuracy of 59.89% for predicting thematic labels in researcher sentences and 48.67% for participant sentences.
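The reported metric, balanced accuracy, can be illustrated with a minimal sketch. The abstract does not define the metric, so this assumes the standard definition (the unweighted mean of per-class recall). The function name `balanced_accuracy` and the labels "A" and "B" are hypothetical and are not taken from the paper or its dataset.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes in y_true.

    Assumption: this is the standard definition of balanced accuracy;
    the abstract does not state which definition the authors used.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced example: 8 sentences labelled "A", 2 labelled "B".
y_true = ["A"] * 8 + ["B"] * 2
# A degenerate classifier that always predicts the majority label.
y_pred = ["A"] * 10

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain)                              # 0.8
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

Under this definition, a classifier that always predicts the majority label scores well on plain accuracy but only at chance level on balanced accuracy, which is presumably why the metric suits imbalanced thematic labels.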
Paper Type: long
Research Area: Semantics: Sentence-level Semantics, Textual Inference and Other areas
Languages Studied: English