Today I started implementing an n-step Q-learning algorithm to find the optimal solution to the mountain climb problem. I also use a one-layer neural network as the function approximator for estimating the Q function.
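
As a rough illustration of the target that an n-step method bootstraps from, here is a minimal sketch. The function name `n_step_target` and its parameters are hypothetical and not taken from the post's code; `bootstrap_value` is assumed to be the largest estimated Q value at the state reached after n steps.

```python
def n_step_target(rewards, bootstrap_value, gamma):
    """Compute r_0 + gamma*r_1 + ... + gamma^(n-1)*r_(n-1) + gamma^n * bootstrap_value.

    rewards         -- the n rewards observed after taking the action (assumed list of floats)
    bootstrap_value -- assumed to be max_a Q(s_n, a) from the current approximator
    gamma           -- discount factor in [0, 1]
    """
    target = bootstrap_value
    # Fold the rewards in from the last step back to the first,
    # so each earlier reward is discounted one step less.
    for r in reversed(rewards):
        target = r + gamma * target
    return target
```

With `n = 1` this reduces to the usual one-step Q-learning target `r + gamma * max_a Q(s', a)`.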

Plugging neural networks into RL architectures poses some challenges, which are summarized in this blog post.

Today I looked into the Theano library to compare it with TensorFlow.

I finished coding Q-learning with RBF function approximation for the mountain climb problem in OpenAI Gym.
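
For readers unfamiliar with the idea, a linear Q estimate over Gaussian RBF features might look roughly like the sketch below. All names here (`rbf_features`, `q_value`, `q_update`, `centers`, `width`) are hypothetical, and the sketch uses plain Python rather than whatever library the actual code relies on.

```python
import math


def rbf_features(state, centers, width):
    """Gaussian RBF activation of `state` with respect to each center."""
    feats = []
    for center in centers:
        sq_dist = sum((s - c) ** 2 for s, c in zip(state, center))
        feats.append(math.exp(-sq_dist / (2.0 * width ** 2)))
    return feats


def q_value(weights, feats):
    """Linear estimate: dot product of weights and features (one weight vector per action)."""
    return sum(w * f for w, f in zip(weights, feats))


def q_update(weights, feats, target, alpha):
    """One gradient step moving the estimate toward `target`; returns new weights."""
    error = target - q_value(weights, feats)
    return [w + alpha * error * f for w, f in zip(weights, feats)]
```

In this sketch each action would keep its own weight vector, and `target` would be the Q-learning target `r + gamma * max_a Q(s', a)`.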


Today I wrote two classes as part of the solution for the mountain climb problem. One class evaluates a state, updates its value, and applies the epsilon-greedy method. The other class simulates a random episode and applies Q-learning to it.
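
A minimal sketch of what the first of these classes could look like is shown below. The class name `StateValues`, its methods, and the tabular per-state storage are assumptions made for illustration, not the actual classes from the post.

```python
import random


class StateValues:
    """Holds action values for one state and picks actions epsilon-greedily (hypothetical)."""

    def __init__(self, n_actions, epsilon, rng=None):
        self.q = [0.0] * n_actions          # one value estimate per action
        self.epsilon = epsilon              # probability of exploring
        self.rng = rng or random.Random()   # injectable for reproducibility

    def best_action(self):
        """Index of the action with the highest current estimate."""
        return max(range(len(self.q)), key=lambda a: self.q[a])

    def choose_action(self):
        """With probability epsilon pick a random action, otherwise the greedy one."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q))
        return self.best_action()

    def update(self, action, target, alpha):
        """Move the estimate for `action` a fraction `alpha` toward `target`."""
        self.q[action] += alpha * (target - self.q[action])
```

The episode class would then loop over steps, call `choose_action`, and pass `reward + gamma * max(next_state.q)` as the `target` to `update`.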