January 14th, 2019

Day 26: Monte Carlo Policy Evaluation in a Noisy Environment

Today, I wrote code that takes the noisy behavior of the environment into account.

 

The main difference from the noiseless Monte Carlo code is that in a noisy environment the action actually taken can deviate from the one the policy chose. Thus, at each step, we take the action given by the policy with some probability and pick one of the other actions with the remaining probability.

Check the following block of code for more details:

We can see that the agent plays the policy's action a with probability 0.5 and one of the other actions with the remaining probability of 0.5.
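To make the 0.5 / 0.5 split concrete, here is a minimal sketch of such a noisy action choice. It is not the code from the repository: the function name noisy_action, the four-action set U/D/L/R, the p_follow parameter, and the even split of the remaining probability among the other actions are all assumptions made for illustration.

```python
import random

# Hypothetical action set for a grid world; the post does not list the actions.
ACTIONS = ("U", "D", "L", "R")


def noisy_action(a, actions=ACTIONS, p_follow=0.5, rng=random):
    """Return `a` with probability p_follow, otherwise one of the other actions."""
    if rng.random() < p_follow:
        return a
    # Assumption: the remaining probability is split evenly among the other actions.
    others = [x for x in actions if x != a]
    return rng.choice(others)


# Empirical check: sample many times and look at how often each action comes up.
rng = random.Random(0)  # seeded so the run is repeatable
n = 10_000
counts = {x: 0 for x in ACTIONS}
for _ in range(n):
    counts[noisy_action("U", rng=rng)] += 1
freq = {x: counts[x] / n for x in ACTIONS}
print(freq)
```

Under these assumptions, the policy's action should come up in about half of the samples and each of the other three in about one sixth. When generating an episode, the agent would pass the policy's action through a function like this before taking the step.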

 

I coded along with Lazy Programmer: https://www.udemy.com/artificial-intelligence-reinforcement-learning-in-python

 

Find my code here: https://github.com/AidinFerdowsi/Monte-Carlo