Keywords: probability, constraint, constraint learning, weak supervision, embedding, deep neural network
TL;DR: We introduce an embedding space approach to constrain neural network output probability distribution.
Abstract: Using higher order knowledge to reduce training data has become a popular research topic. However, the ability for available methods to draw effective decision boundaries is still limited: when training set is small, neural networks will be biased to certain labels. Based on this observation, we consider constraining output probability distribution as higher order domain knowledge. We design a novel algorithm that jointly optimizes output probability distribution on a clustered embedding space to make neural networks draw effective decision boundaries. While directly applying probability constraint is not effective, users need to provide additional very weak supervisions: mark some batches that have output distribution greatly differ from target probability distribution. We use experiments to empirically prove that our model can converge to an accuracy higher than other state-of-art semi-supervised learning models with less high quality labeled training examples.