Exploring the Efficacy of Meta-Learning: Unveiling Superior Data Diversity Utilization of MAML Over Pre-training
Keywords: Data Quality, Data Diversity, Meta-Learning
TL;DR: We argue that meta-learning (MAML) exploits data diversity better than pre-training/supervised learning: as diversity increases, R^2 values indicate that diversity generally improves performance, with an overall positive correlation.
Abstract: Currently, data and model size dominate the narrative in the training of super-large, powerful models. However, the effect of training-data quality on performance remains underexplored. In this work, we show positive correlations between accuracy and data diversity, providing an argument for research into data "quality" beyond sheer size.
In our analysis of pre-training and model-agnostic meta-learning (MAML) methods on twelve popular visual datasets (e.g., Omniglot, CIFAR-FS, Aircraft) and five model configurations, including MAML variants with different numbers of inner gradient steps and supervised learning, we find moderate to strong positive correlations (R-squared: 0.15-0.42) between accuracy and data diversity, and weaker but still significant correlations (R-squared: ~0.2) between loss and diversity. These findings support our hypothesis and pave the way for deeper exploration of how data quality, as captured by diversity, influences model performance. This initial study highlights the potential of Task2Vec diversity as a valuable measure in the rapidly evolving field of large-scale learning, where understanding data quality is key to building more powerful and generalizable models.
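A minimal sketch of the analysis described above, under stated assumptions: the Task2Vec diversity of a dataset is approximated as the mean pairwise cosine distance between embeddings of sampled tasks, and the R-squared of a linear fit against accuracy is computed with scipy. The embeddings, diversity values, and accuracies below are illustrative placeholders, not the paper's results or code.

import itertools
import numpy as np
from scipy.spatial.distance import cosine
from scipy.stats import linregress

def diversity_coefficient(task_embeddings):
    # Mean pairwise cosine distance between Task2Vec-style task embeddings;
    # higher values mean tasks sampled from the dataset are more dissimilar.
    pairs = itertools.combinations(task_embeddings, 2)
    return float(np.mean([cosine(a, b) for a, b in pairs]))

# Placeholder embeddings for 8 tasks (64-dim each), standing in for
# Fisher-Information-based Task2Vec vectors from a probe network.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(8, 64))
print(f"diversity = {diversity_coefficient(embeddings):.3f}")

# Hypothetical per-dataset values (one diversity score and one accuracy each):
diversities = np.array([0.21, 0.34, 0.48, 0.55])
accuracies = np.array([0.62, 0.68, 0.74, 0.79])

fit = linregress(diversities, accuracies)
print(f"R^2 = {fit.rvalue ** 2:.2f}")  # a positive slope supports the hypothesis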
Primary Subject Area: Other
Paper Type: Extended abstracts: up to 2 pages
Participation Mode: Virtual
Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
Submission Number: 82