This is an implementation of the Easy21 assignment from David Silver's Reinforcement Learning course at UCL. The assignment can be found here.
python3 mc.py
Ten million episodes of the game were evaluated to obtain the following value function:
python3 td.py
Mean squared error of the state-action value function, compared with the result of the Monte-Carlo experiment, for different values of lambda. For each lambda, 10,000 episodes were evaluated.
Evolution of the mean squared error for different values of lambda.
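The error measure used in these plots can be sketched as follows. This is a minimal illustration, not the code in td.py. It assumes both state-action functions are stored as dicts keyed by (dealer card, player sum, action), with missing entries treated as 0. The names `mean_squared_error`, `q_estimate` and `q_reference` are hypothetical. The error is averaged over all state-action pairs here; a plain sum of squared errors differs only by a constant factor.

```python
from itertools import product

# Assumed Easy21 state-action space: dealer's first card 1..10,
# player sum 1..21, and two actions.
DEALER_CARDS = range(1, 11)
PLAYER_SUMS = range(1, 22)
ACTIONS = ("hit", "stick")


def mean_squared_error(q_estimate, q_reference):
    """Average squared difference between two Q tables over all (s, a) pairs.

    Both arguments are dicts keyed by (dealer, player, action);
    entries missing from a dict are treated as 0.0.
    """
    keys = list(product(DEALER_CARDS, PLAYER_SUMS, ACTIONS))
    total = sum(
        (q_estimate.get(key, 0.0) - q_reference.get(key, 0.0)) ** 2
        for key in keys
    )
    return total / len(keys)


# Toy usage with made-up tables (not results from this repository).
all_keys = list(product(DEALER_CARDS, PLAYER_SUMS, ACTIONS))
q_reference = {key: 0.0 for key in all_keys}
q_estimate = {key: 0.5 for key in all_keys}
error = mean_squared_error(q_estimate, q_reference)
```

Calling this once per lambda after training gives one point per lambda; calling it after every episode gives the evolution of the error during learning.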
python3 lfa.py
The lookup table of the previous experiment is replaced with a linear function approximation. The logic for the feature vector can be found in the assignment.
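As a rough sketch of what such a feature vector can look like, the code below builds a binary vector of 3 x 6 x 2 = 36 features from overlapping dealer and player intervals, and computes Q as a dot product with a weight vector. The interval boundaries are the ones recalled from the assignment text and should be checked against it. The function names `feature_vector` and `q_value` are hypothetical and need not match lfa.py.

```python
# Overlapping intervals (inclusive), assumed from the assignment text.
DEALER_INTERVALS = [(1, 4), (4, 7), (7, 10)]
PLAYER_INTERVALS = [(1, 6), (4, 9), (7, 12), (10, 15), (13, 18), (16, 21)]
ACTIONS = ("hit", "stick")


def feature_vector(dealer, player, action):
    """Return a list of 36 binary features for a state-action pair.

    A feature is 1.0 when the dealer card lies in its dealer interval,
    the player sum lies in its player interval, and the action matches.
    """
    features = []
    for d_lo, d_hi in DEALER_INTERVALS:
        for p_lo, p_hi in PLAYER_INTERVALS:
            for candidate in ACTIONS:
                active = (
                    d_lo <= dealer <= d_hi
                    and p_lo <= player <= p_hi
                    and candidate == action
                )
                features.append(1.0 if active else 0.0)
    return features


def q_value(weights, dealer, player, action):
    """Linear approximation: Q(s, a) = phi(s, a) . weights."""
    phi = feature_vector(dealer, player, action)
    return sum(w * x for w, x in zip(weights, phi))


# Toy usage: with all weights set to 1, Q equals the number of active features.
weights = [1.0] * 36
example_q = q_value(weights, dealer=4, player=5, action="hit")
```

Because the intervals overlap, a single state can activate several features at once, so an update for one state also shifts the estimates of neighbouring states.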