- Abstract: Transfer learning has proven to be a successful way to train high-performing deep learning models in applications for which little labeled data is available. In transfer learning, one pre-trains the model on a large dataset such as ImageNet or MS-COCO and then fine-tunes its weights on the target domain. We argue that, as the number of massive datasets keeps growing, selecting the relevant pre-training data is itself a critical issue. We introduce a new problem in which the available datasets are stored in one centralized location, a dataserver. We assume that a client, a target application with its own small labeled dataset, is interested in fetching only the subset of the server’s data that is most relevant to its target domain. We propose a novel method that aims to optimally select subsets of data from the dataserver for a particular target client. We perform data selection by employing a mixture-of-experts model in a series of dataserver-client transactions with a small computational cost. We show the effectiveness of our approach in several transfer learning scenarios, demonstrating state-of-the-art performance on several target datasets and on tasks such as image classification, object detection, and instance segmentation. We will make our framework available as a web service that serves data to users trying to improve performance in their AI applications.