SepLL: Separating Latent Class Labels from Weak Supervision Noise

Anonymous

18 Jul 2022 · OpenReview Anonymous Preprint Blind Submission
Keywords: Machine Learning, Natural Language Processing, Weak Supervision, Latent Prediction
Abstract: Weak Supervision is a common approach to tackling the need for large labeled datasets. In this setting, \emph{labeling functions} automatically assign heuristic, often noisy, labels to data samples. In this work, we provide a method for learning from weak labels by separating two types of mutually exclusive information associated with the labeling functions: information related to the target label and information specific to one labeling function only. Both types of information are reflected, to different degrees, by all labeled instances. In contrast to previous work that corrects or removes wrongly labeled instances, we learn a branched deep model that uses all data as is but splits the labeling function information in the latent space. Specifically, we propose the end-to-end model \emph{SepLL}, which extends a transformer classifier by introducing a latent space for labeling-function-specific and task-specific information. The learning signal is given only by the labeling function matches; no pre-processing or label model is required for our method. Notably, the task prediction is made from the latent layer without any direct task signal. Experiments on the Wrench text classification tasks show that our model is competitive with the state-of-the-art and yields a new best average performance.
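To make the branched architecture concrete, below is a minimal PyTorch-style sketch of how such a model could be wired up. It is an illustration under assumptions, not the authors' implementation: the class `SepLLSketch`, the fixed `lf_to_class` vote matrix, and the way the two branches are combined are hypothetical choices, intended only to mirror the abstract's description of a shared encoder, a latent task-label branch, and a labeling-function-specific branch supervised solely by labeling function matches.

```python
import torch
import torch.nn as nn

class SepLLSketch(nn.Module):
    """Hypothetical sketch of a branched weak-supervision model:
    a shared encoder feeds two latent branches, one carrying
    task-label information and one carrying labeling-function-
    specific information. Only LF matches supervise training."""

    def __init__(self, encoder: nn.Module, hidden_dim: int,
                 num_classes: int, num_lfs: int,
                 lf_to_class: torch.Tensor):
        super().__init__()
        self.encoder = encoder                    # e.g. a transformer body
        self.class_head = nn.Linear(hidden_dim, num_classes)
        self.lf_head = nn.Linear(hidden_dim, num_lfs)
        # Fixed mapping from each labeling function to the class it votes
        # for (shape: num_lfs x num_classes); assumed known from the LF
        # definitions, as is typical in weak supervision.
        self.register_buffer("lf_to_class", lf_to_class)

    def forward(self, x):
        h = self.encoder(x)                       # (batch, hidden_dim)
        class_logits = self.class_head(h)         # latent task prediction
        lf_specific = self.lf_head(h)             # LF-only information
        # Route class evidence to the LFs voting for each class, then add
        # the LF-specific branch; their sum predicts which LFs fire.
        lf_logits = class_logits @ self.lf_to_class.T + lf_specific
        return class_logits, lf_logits
```

Under these assumptions, training would apply `nn.BCEWithLogitsLoss` between `lf_logits` and the observed binary labeling-function match matrix, while inference reads the task label directly from `class_logits.argmax(-1)`, without any direct task-label supervision.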