For running simple enviornments such as:
- `CartPole-v0`: Solved when policy obtains a reward of at least 195.0 over 100 consecutive trials.
