{
       "Semester": "Spring 2019",
       "Question Number": "6",
       "Part": "f",
       "Points": 1.75,
       "Topic": "Classifiers",
       "Type": "Text",
       "Question": "After taking 6.036, Bob decides to train a recommender system to predict what ratings different customers will give to different movies. Currently, he knows of three really popular movies, and he knows of two potential customers who have rated some of these movies. The data matrix currently looks like: $Y=[[2, ?, 3],[4, 2, ?]]$ where, as in class, rows correspond to customers and columns correspond to movies, and ? indicates a missing or unknown rating. He decides to find a low-rank factorization of $Y$ using the alternating least squares algorithm implemented in class. Assume for this question that offsets are set to $0$.\nBob is happy about what he has accomplished, until he realizes that there are a bunch of movies and users that he still needs to add to his database! He sees that his database will slowly grow over time, and that it will be time-consuming to train a completely new model every single time he updates his database. If Bob has an $m \\times n$ data matrix for which he wants to find a rank-$k$ factorization, his analysis indicates that the worst-case run-time (in terms of number of expensive multiplications) of performing alternating least squares for $t$ iterations (where each iteration updates both $U$ and $V$) will be $O(k^{2}*m*n*t)$.\nInstead, Bob comes up with the following idea: whenever he gets information about a new movie, he adds an extra row to $V$ but does not alter the existing entries of $U$ or $V$. He then finds the values of the entries in that extra row that minimize the objective function (with no regularization). He performs a similar procedure when he gets a new user, but instead adds an extra row to $U$. \nBob continues using this update scheme whenever he adds new movies and users. Does the order in which Bob receives new information affect the final values of $U$ and $V$ that he learns? Explain.",
       "Solution": "Yes. Let us say Bob gets information about movie $k$ and person $a$ in that order, and that person $a$ has rated movie $k$. Based on this new update scheme, the row in $V$ corresponding to movie $k$ will be frozen after the information is received, and will not be updated when the information about person $a$ is received. On the other hand, the learned row in $U$ corresponding to person $a$ will depend in part on the previously updated row in $V$ corresponding to movie $k$.\n\nIf the information was received in the opposite order, we would have the opposite result. The row in $U$ corresponding to person $a$ would be frozen after the first piece of information was received, and would not be influenced by the information about movie $k$. Meanwhile, the row in $V$ corresponding to movie $k$ would be learned in part based on the previously learned row in $U$ corresponding to person $a$.\n\nThus, the order of new information matters in this new scheme, because $U$ and $V$ are not jointly re-optimized every time new information is received.\n"
}