Mako: Semi-supervised continual learning with minimal labeled data via data programming

Pengyuan Lu; Seungwon Lee; Amanda Watson; David Kent; Insup Lee; Eric Eaton; James Weimer

Published: 28 Jan 2022, Last Modified: 13 Feb 2023, ICLR 2022 Submitted, Readers: Everyone

Keywords: lifelong machine learning, data programming, semi-supervised learning

Abstract: Lifelong machine learning (LML) is a well-known paradigm that mimics the human learning process by utilizing experience from previous tasks. Nevertheless, one rarely addressed issue is the lack of labels at the individual task level. State-of-the-art LML largely addresses supervised learning; the few semi-supervised continual learning exceptions require training additional models, which in turn imposes constraints on the LML methods themselves. Therefore, we propose Mako, a wrapper tool that mounts on top of supervised LML frameworks and leverages data programming. Mako imposes no additional knowledge base overhead and enables continual semi-supervised learning with a limited amount of labeled data. In terms of per-task accuracy and resistance to catastrophic forgetting, it achieves performance similar to training on fully labeled data. We ran extensive experiments on LML task sequences created from standard image classification data sets, including MNIST, CIFAR-10 and CIFAR-100. The results show that, after using Mako to leverage unlabeled data, LML tools achieve $97\%$ of the performance of supervised learning on fully labeled data in terms of accuracy and catastrophic forgetting prevention. Moreover, Mako significantly outperforms baseline semi-supervised LML tools such as CNNL, ORDisCo and DistillMatch, increasing accuracy by $0.25$ on certain benchmarks.

One-sentence Summary: A tool that mounts on top of supervised lifelong machine learning frameworks and enables semi-supervised learning, achieving performance very close to that of fully labeled data when only partially labeled data is given.
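
As a rough illustration of the data-programming idea the abstract refers to, several noisy labeling functions can vote on each unlabeled example, and only examples with a clear consensus are added to the training set. The sketch below is a simplified majority-vote variant, not Mako's actual pipeline, which this page does not describe in detail. The names `majority_vote`, `pseudo_label`, `ABSTAIN`, and the toy threshold functions are all assumptions made for illustration.

```python
from collections import Counter

ABSTAIN = -1  # a labeling function returns this when it has no opinion


def majority_vote(votes, min_agreement=0.5):
    """Combine labeling-function votes for one example.

    Returns the winning label, or ABSTAIN if no label wins a share
    strictly greater than `min_agreement` of the non-abstaining votes.
    """
    cast = [v for v in votes if v != ABSTAIN]
    if not cast:
        return ABSTAIN
    label, count = Counter(cast).most_common(1)[0]
    return label if count / len(cast) > min_agreement else ABSTAIN


def pseudo_label(unlabeled, labeling_functions, min_agreement=0.5):
    """Return (example, label) pairs for examples the vote could label."""
    labeled = []
    for x in unlabeled:
        votes = [lf(x) for lf in labeling_functions]
        y = majority_vote(votes, min_agreement)
        if y != ABSTAIN:
            labeled.append((x, y))
    return labeled


# Toy labeling functions over 1-D inputs (hypothetical, for illustration only).
lfs = [
    lambda x: 1 if x > 0.6 else ABSTAIN,
    lambda x: 0 if x < 0.4 else ABSTAIN,
    lambda x: 1 if x > 0.5 else 0,
]

augmented = pseudo_label([0.1, 0.45, 0.9], lfs)
```

In a wrapper setting like the one the abstract describes, pairs such as `augmented` could be merged with the small labeled set before being handed to an unmodified supervised LML learner.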
