{
       "Question number": "6",
       "Sub-Question number": "c",
       "Question": "State one advantage of policy iteration over value iteration for planning.",
       "Solution": "Policy iteration takes at most as many iterations to reach the optimal policy as value iteration, and in practice usually takes far fewer. Policy iteration has a definite stopping condition: when the policy does not change between two successive iterations, the algorithm is complete. Policy iteration can also be modified to take advantage of approximate solutions to the value function, particularly in problems with a large number of states, in which the linear system cannot practically be solved by matrix inversion."
}