{
       "Semester": "Spring 2021",
       "Question Number": "5",
       "Part": "e",
       "Points": 1.0,
       "Topic": "Loss Functions",
       "Type": "Image",
       "Question": "We have looked at many machine-learning algorithms with hyper-parameters. Varying each of them affects the loss on both the training data and unseen testing data. Which plot best describes the typical dependence of the testing error on the number of epochs of gradient descent performed: a monotonically decreasing function, a convex parabola, a monotonically increasing function, a monotonically decreasing step function, or a monotonically increasing step function? If none of them is appropriate, explain.",
       "Solution": "convex parabola"
}