{
       "Semester": "Spring 2019",
       "Question Number": "2",
       "Part": "b",
       "Points": 1.333333333,
       "Topic": "Decision Trees",
       "Type": "Text",
       "Question": "Consider the following 2D dataset in (x,y) format: ((1,-1),+1), ((1,1),+1), ((1,2.5),+1), ((2,-2),-1), ((2,1),+1), ((2,3),+1), ((5,-1),-1), ((5,-2),-1). We will construct a tree using a greedy algorithm that recursively minimizes weighted average entropy. Recall that the weighted average entropy of a split into subsets A and B is: $(\\text{fraction of points in } A) \\cdot H\\left(R_{j, s}^{A}\\right)+(\\text{fraction of points in } B) \\cdot H\\left(R_{j, s}^{B}\\right)$, where the entropy $H\\left(R_{m}\\right)$ of data in a region $R_{m}$ is given by $H\\left(R_{m}\\right)=-\\sum_{k} \\hat{P}_{m k} \\log _{2} \\hat{P}_{m k}$. Here $\\hat{P}_{m k}$ is the empirical probability, which in this case is the fraction of items in region $m$ that are of class $k$. Some facts that might be useful to you: H(0) = 0, H(3/5) = 0.97, H(3/8) = 0.95, H(3/4) = 0.81, H(5/6) = 0.65, H(1) = 0.\nDraw the decision tree boundaries represented by the following decision tree on a plot:\nx_2 < 0\nYes branch:\n    x_1 < 1.5\n    Yes branch: +1\n    No branch: -1\nNo branch: +1",
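       "Solution": "The boundaries are the horizontal line x_2 = 0 and the vertical line x_1 = 1.5, drawn only in the region x_2 < 0. Points with x_2 >= 0 are labeled +1; below x_2 = 0, points with x_1 < 1.5 are labeled +1 and the rest are labeled -1. (https://cdn.mathpix.com/cropped/2022_06_01_4b45961d5bf942e8929cg-05.jpg?height=367&width=896&top_left_y=722&top_left_x=236)"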
}