{
       "Question number": "5",
       "Sub-Question number": "3",
       "Question": "In neural networks bagging can be performed without random subsampling of the data. i.e., one trains m neural networks independently and ensembles their results. Can you explain why the subsampling is unnecessary in this case?",
       "Solution": "The random initialization and non-convexity of neural networks ensures that independently trained models will end up in different local minima and obtain different results. The effect is similar to training on slightly different data sets. minimum 1 point if say something and show effort. (+2) if state that NN has random initialization. (+1) if state NN converges to local minimum due to non-convexity. special case (1): if mentioned that Stochastic Gradient Descent randomly sample training data and lead to different weights, get 2 points. special case (2): if mentioned that layers such as dropout is some embedded randomness, get also 2 points."
}