{
       "Semester": "Fall 2021",
       "Question Number": "6",
       "Part": "b",
       "Points": 2.0,
       "Topic": "Neural Networks",
       "Type": "Image",
       "Question": "Years ago, MIT student Itu Nes learned about neural networks and how to train them, from taking 6.036. Now Itu is an engineer at Orange Computer, a hot tech company employing machine learning to revolutionize music. Looking back at her notes, Itu realizes that she once wrote down exactly what she now needs to do in her job, but unfortunately some key details are lost. Can you help her figure things out?\nSpecifically, Itu wants to train this simple single-node neural network:\nThe network accepts two inputs $x_{1}$ and $x_{2}$, and outputs a prediction $\\hat{y}$ based on weights $a$ and $b$. Itu's dataset has points $(x, y)$ where $x=\\left(x_{1}, x_{2}\\right)$, and $y$ are the true labels. Itu employs the squared error loss function\n$$\nL(\\hat{y}, y)=(y-\\hat{y})^{2}\n$$\nIn her notes, Itu wrote about using gradient descent to obtain the optimal weights for the network, by minimizing this loss. Moreover, for each run of the gradient descent, she used a single data point to train the weights. Afterwards, Itu learns that the true labels are $y=x_{1}+x_{2}$. \nSuppose $a_{0}$ and $b_{0}$ are the initisl values of the weights, and $a_{k}$ and $b_{k}$ are the weights at iteration $k$. Give equations for the updated weights $a_{k+1}, b_{k+1}$ in terms of current iteration's weights $a_{k}, b_{k}$, the step size parameter $\\eta$, and the inputs $x_{1}, x_{2}$.",
       "Solution": "Latex"
}