Abstract: Plotting a learner’s generalization performance against the training set size results in a
so-called learning curve. This tool, providing insight into the behavior of the learner, is also
practically valuable for model selection, predicting the effect of more training data, and
reducing the computational complexity of training. We set out to make the (ideal) learning
curve concept precise and briefly discuss the aforementioned usages of such curves. The
larger part of this survey, however, focuses on learning curves showing that more data does
not necessarily lead to better generalization performance, a result that seems surprising to
many researchers in the field of artificial intelligence. We point out the significance of these
findings and conclude our survey with an overview and discussion of open problems in this
area that warrant further theoretical and empirical investigation.