Gaming with Monte Carlo Methods

Monte Carlo methods are among the most widely used algorithms in fields ranging from physics and mechanics to computer science. In reinforcement learning (RL), the Monte Carlo algorithm is used when the model of the environment is not known. In the previous chapter, we used dynamic programming (DP) to find an optimal policy when the model dynamics, that is, the transition and reward probabilities, are known. But how can we determine the optimal policy when we don't know the model dynamics? In that case, we use the Monte Carlo algorithm, which estimates values from sampled experience and so can find optimal policies without any knowledge of the environment's dynamics.
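To make the idea of learning without a model concrete, here is a minimal sketch, not a full RL algorithm. It assumes a hypothetical toy environment, `sample_episode`, whose dynamics are hidden from the learner, and a hypothetical helper, `mc_value_estimate`, that averages the returns of sampled episodes to estimate the value of the start state. Neither name comes from a library.

```python
import random


def sample_episode(rng):
    """Hypothetical black-box environment (an assumption for illustration).

    Every step pays a reward of 1, and after each step the episode ends
    with probability 0.5. The learner never reads these numbers; it only
    sees the list of rewards that comes back.
    """
    rewards = []
    while True:
        rewards.append(1.0)
        if rng.random() < 0.5:
            break
    return rewards


def mc_value_estimate(num_episodes, gamma=1.0, seed=0):
    """Estimate the start-state value as the mean of sampled returns."""
    rng = random.Random(seed)
    total_return = 0.0
    for _ in range(num_episodes):
        rewards = sample_episode(rng)
        # Discounted return of one episode: sum of gamma^t * r_t.
        episode_return = sum(gamma ** t * r for t, r in enumerate(rewards))
        total_return += episode_return
    return total_return / num_episodes


estimate = mc_value_estimate(10_000)
print(f"Estimated value of the start state: {estimate:.3f}")
```

For this toy environment the true expected return is 2, so the estimate should land close to that value as the number of episodes grows, even though the code never uses the termination probability directly.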

In this chapter, you will learn about the following:

  • Monte Carlo methods
  • Monte Carlo prediction
  • Playing Blackjack with Monte Carlo
  • Monte Carlo control
  • Monte Carlo exploration starts
  • On-policy Monte Carlo control
  • Off-policy Monte Carlo control