{
       "Semester": "Spring 2019",
       "Question Number": "4",
       "Part": "c",
       "Points": 2.0,
       "Topic": "CNNs",
       "Type": "Text",
       "Question": "Conne von Lucien has many pictures from her trip to Flatland and wants to determine which ones have her in the image. All of the pictures are arrays of size 4x1, with array values of either 0 or 1. Conne looks like the vector [1,0,1] in one dimension, so if a picture contains the pattern [1,0,1] anywhere inside it, it should be classified as a positive example, otherwise as a negative example.\nFortunately, you learned about CNNs and have helped Conne by designing the following network architecture with three layers:\n1. A convolutional layer with one filter W that is size 3x1, and stride 1, and a single bias w_0 (where the output pixel corresponds to the input pixel that the filter is centered on). Input values of 0 should be assumed beyond the boundaries of the input.\n2. A max-pooling layer P with size 2x1 and stride 2.\n3. A fully connected layer $\\sigma(\\cdot)$ with a single output unit having a sigmoidal activation function.\nWe can express the loss function as $L(\\sigma(P), y)$ where $P$ is the output from the max pooling layer of the CNN and $y$ is the true label for the input. Given $\\frac{d L}{d P}$, derive the update rule for $w_{1}$ if the filter is composed of $W=\\left[w_{1}, w_{2}, w_{3}\\right]^{T}$ with bias $w_{0}$, and step size is $\\eta$.",
       "Solution": "Let the input be $X=\\left[x_{1}, x_{2}, x_{3}, x_{4}\\right]^{T}$ and let $Z=\\left[z_{1}, z_{2}, z_{3}, z_{4}\\right]^{T}$ be the outputs of layer 1.\n\n$$\n\\begin{aligned}\nz_{1} &=w_{1} \\cdot 0+w_{2} x_{1}+w_{3} x_{2}+w_{0} \\\\\nz_{2} &=w_{1} x_{1}+w_{2} x_{2}+w_{3} x_{3}+w_{0} \\\\\nz_{3} &=w_{1} x_{2}+w_{2} x_{3}+w_{3} x_{4}+w_{0} \\\\\nz_{4} &=w_{1} x_{3}+w_{2} x_{4}+w_{3} \\cdot 0+w_{0} \\\\\nP &=\\left[p_{1}, p_{2}\\right]^{T} \\\\\np_{1} &=\\max \\left(z_{1}, z_{2}\\right) \\\\\n\\frac{d p_{1}}{d w_{1}} &=0 \\text { if } z_{1}>z_{2} \\text { else } x_{1} \\\\\np_{2} &=\\max \\left(z_{3}, z_{4}\\right) \\\\\n\\frac{d p_{2}}{d w_{1}} &=x_{2} \\text { if } z_{3}>z_{4} \\text { else } x_{3} \\\\\n\\frac{d P}{d w_{1}} &=\\left[\\frac{d p_{1}}{d w_{1}}, \\frac{d p_{2}}{d w_{1}}\\right]^{T} \\\\\nw_{1} &:=w_{1}-\\eta\\left(\\frac{d L}{d P}\\right)^{T} \\frac{d P}{d w_{1}}\n\\end{aligned}\n$$\n"
}