{
       "Question number": "6",
       "Sub-Question number": "d",
       "Question": "Consider an MDP with two states $S=\\{\\text{state 1}, \\text{state 2}\\}$ and two actions $\\{\\text{left}, \\text{right}\\}$, and an RL agent with the following Q-values:\n$\\begin{array}{ccc} & \\text{left} & \\text{right} \\\\ \\text{state 1} & 6 & 4 \\\\ \\text{state 2} & 2 & 3\\end{array}$\nSuppose the agent is in state 1. What is the distribution over the action the agent takes when using an $\\epsilon$-greedy policy that explores with probability $\\epsilon>0$?\n",
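       "Solution": "With probability $1-\\epsilon$ the agent takes the greedy action left (since $Q(\\text{state 1}, \\text{left}) = 6 > 4 = Q(\\text{state 1}, \\text{right})$); otherwise, with probability $\\epsilon$, it picks one of left and right uniformly at random. Hence $P(\\text{left}) = 1-\\epsilon/2$ and $P(\\text{right}) = \\epsilon/2$."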
}