{
       "Semester": "Fall 2019",
       "Question Number": "7",
       "Part": "a",
       "Points": 2.0,
       "Topic": "RNNs",
       "Type": "Text",
       "Question": "We have seen in class recurrent neural networks ( $\\mathrm{RNNs}$ ) that are structured as:\n$$\n\\begin{aligned}\nz_{t}^{1} &=W^{s s} s_{t-1}+W^{s x} x_{t} \\\\\ns_{t} &=f_{1}\\left(z_{t}^{1}\\right) \\\\\nz_{t}^{2} &=W^{o} s_{t} \\\\\np_{t} &=f_{2}\\left(z_{t}^{2}\\right)\n\\end{aligned}\n$$\nwhere we have set biases to zero. Here $x_{t}$ is the input and $y_{t}$ the actual output for $\\left(x_{t}, y_{t}\\right)$ sequences used for training, with $p_{t}$ as the RNN output (during or after training).\nAssume our first RNN, call it RNN-A, has $s_{t}, x_{t}, p_{t}$ all being vectors of shape $2 \\times 1$. In addition, the activation functions are simply $f_{1}(z)=z$ and $f_{2}(z)=z$.\nFor $\\mathrm{RNN}-\\mathrm{A}$, give dimensions of the weights for W^{s s}, W^{s x}, and W^{0}",
       "Solution": "W^{s s} is 2x2, W^{s x} is 2x2, and W^{0} is 2x2"
}