Towards Modular Machine Learning Pipelines

Published: 16 Jun 2023, Last Modified: 17 Jul 2023, Venue: ICML LLW 2023
Keywords: pipeline, ML components, coupling, independently trainable, consistent, causal, modularity, regularizer
TL;DR: Pipelines of ML components are difficult to optimize in a distributed fashion; our modularity regularizers enable consistent distributed component updates.
Abstract: Pipelines of Machine Learning (ML) components are a popular and effective way to divide and conquer many business-critical problems. A pipeline architecture implies a specific division of the overall problem; however, current ML training approaches do not enforce this implied division. Consequently, ML components can become coupled to one another after they are trained, which causes insidious effects. For instance, even when one coupled ML component in a pipeline is improved in isolation, the end-to-end pipeline performance can degrade. In this paper, we develop a conceptual framework to study ML coupling in pipelines and design new modularity regularizers that can eliminate coupling during ML training. We show that the resulting ML pipelines become modular (i.e., their components can be trained independently of one another) and discuss the tradeoffs of our approach versus existing approaches to pipeline optimization.
Submission Number: 23