Abstract: In this study, we test several augmentation and distant supervision techniques for enlarging sentiment datasets in Russian. We apply a transfer learning approach in which the model is pre-trained on the additionally created data to improve performance. We compare our proposed approach, which is based on distant supervision, with existing augmentation methods. The best results were achieved with a three-step approach of sequential training on general, thematic, and original training samples. For most of the benchmarks, using data automatically annotated with the distant supervision technique improved the results by more than 3% over the current state-of-the-art methods.
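The three-step sequential training named in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: it assumes a toy bag-of-words logistic regression in place of the actual model and English placeholder sentences in place of the Russian data. The class `BowLogReg`, the function `train_three_step`, and all samples are hypothetical names introduced only for illustration.

```python
import math


class BowLogReg:
    """Toy bag-of-words logistic regression (hypothetical stand-in for the real model)."""

    def __init__(self, lr=0.5):
        self.lr = lr
        self.w = {}   # token -> weight; kept across calls to fit()
        self.b = 0.0  # bias term

    def prob(self, text):
        """Return the predicted probability of the positive class."""
        z = self.b + sum(self.w.get(t, 0.0) for t in text.lower().split())
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() numerically safe
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, samples, epochs=20):
        """Run SGD starting from the current weights, so each call fine-tunes the previous stage."""
        for _ in range(epochs):
            for text, label in samples:
                err = label - self.prob(text)
                for t in text.lower().split():
                    self.w[t] = self.w.get(t, 0.0) + self.lr * err
                self.b += self.lr * err
        return self


def train_three_step(general, thematic, original):
    """Train one model sequentially on general, then thematic, then original samples."""
    model = BowLogReg()
    for stage in (general, thematic, original):
        model.fit(stage)
    return model


# Placeholder data (assumption): 1 = positive, 0 = negative.
general = [("good great", 1), ("bad awful", 0)]            # e.g. automatically annotated general texts
thematic = [("great service", 1), ("awful service", 0)]    # e.g. automatically annotated in-domain texts
original = [("great bank", 1), ("awful bank", 0)]          # the benchmark's own training sample

model = train_three_step(general, thematic, original)
print(round(model.prob("good"), 3), round(model.prob("awful"), 3))
```

The point of the sketch is only the ordering: the same model object passes through all three stages, so what it learned from the larger automatically annotated sets (here the weight for "good") is still available after the final pass over the original training sample.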