Implementations of the tabular solution methods from Part I of Sutton's book on Reinforcement Learning: on-policy first-visit Monte Carlo, off-policy Monte Carlo, Double Q-learning, 2-step Tree Backup, SARSA, and 2-step Expected SARSA.
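To give a feel for what one of these tabular methods looks like, here is a minimal sketch of a single Double Q-learning update. It is an illustration under stated assumptions, not the code of this repository: the function name `double_q_update`, the `(state, action)` dictionary layout, and the default values of `alpha` and `gamma` are all hypothetical choices made for the example.

```python
import random
from collections import defaultdict


def double_q_update(q1, q2, state, action, reward, next_state, actions,
                    alpha=0.1, gamma=0.99, done=False, rng=random):
    """Apply one Double Q-learning update in place (illustrative sketch).

    q1 and q2 are dict-like tables keyed by (state, action); this layout
    is an assumption of the example, not taken from the source.
    """
    # Flip a fair coin to decide which table is updated on this step.
    if rng.random() < 0.5:
        learner, evaluator = q1, q2
    else:
        learner, evaluator = q2, q1

    if done:
        # No bootstrapping from a terminal state.
        target = reward
    else:
        # The learner table picks the greedy action; the other table values it.
        best = max(actions, key=lambda a: learner[(next_state, a)])
        target = reward + gamma * evaluator[(next_state, best)]

    # Move the learner's estimate a fraction alpha toward the target.
    learner[(state, action)] += alpha * (target - learner[(state, action)])


if __name__ == "__main__":
    q1, q2 = defaultdict(float), defaultdict(float)
    double_q_update(q1, q2, state=0, action=1, reward=1.0,
                    next_state=1, actions=[0, 1])
    print(dict(q1), dict(q2))
```

Selecting the action with one table and evaluating it with the other is what separates this from ordinary Q-learning, where a single table does both and tends to overestimate action values.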