{
       "Question number": "5",
       "Sub-Question number": "4b",
       "Question": "Given a distribution $P$, you can sample a training set $D$ and obtain a classifier $h$. Imagine you train $m$ such classifiers $h_{1}, \\ldots, h_{m}$ on $m$ data sets $D_{1}, \\ldots, D_{m}$, each drawn i.i.d. from the data distribution $P$. As you increase $m$ from $m=1$ to $m \\gg 0$, what happens to the variance of the averaged classifier $\\hat{h}=\\frac{1}{m} \\sum_{i=1}^{m} h_{i}$ in the limit $m \\rightarrow \\infty$?",
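       "Solution": "By the weak law of large numbers, the average $\\hat{h}=\\frac{1}{m} \\sum_{i=1}^{m} h_{i}$ approaches the expected classifier $\\bar{h}$ as $m \\rightarrow \\infty$. Because the $h_{i}$ are trained on i.i.d. data sets, the variance of $\\hat{h}$ is $\\frac{1}{m}$ times the variance of a single classifier (assuming that variance is finite), so it vanishes in the limit: $E_{\\mathbf{x}, D_{1}, \\ldots, D_{m}}\\left[\\left(\\hat{h}(\\mathbf{x})-\\bar{h}(\\mathbf{x})\\right)^{2}\\right] \\rightarrow 0$."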
}