ShuvenduRoy/Grid_world

Reinforcement Learning algorithm simulation with Grid world

Optimal value function

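Find the optimal value of each grid cell, i.e. the best return an agent can collect when it starts from that cell and acts optimally afterwards.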

Deterministic
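
A minimal sketch of value iteration for the deterministic case, where every action moves the agent exactly as intended. The 3x4 layout, the goal/trap/wall positions, the rewards, the discount factor and all names (`move`, `value_iteration`, `GAMMA`, ...) are assumptions made for illustration and are not taken from the repository code.

```python
# Hypothetical 3x4 layout for illustration; the repository's grid may differ.
ROWS, COLS = 3, 4
GOAL, TRAP, WALL = (0, 3), (1, 3), (1, 1)
TERMINAL = {GOAL: 1.0, TRAP: -1.0}  # reward received on entering the cell
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
GAMMA = 0.9  # assumed discount factor
STATES = [(r, c) for r in range(ROWS) for c in range(COLS) if (r, c) != WALL]


def move(state, action):
    """Deterministic transition: hitting the border or the wall leaves the agent in place."""
    dr, dc = ACTIONS[action]
    nxt = (state[0] + dr, state[1] + dc)
    return nxt if nxt in STATES else state


def value_iteration(tol=1e-6):
    """Repeat V(s) <- max_a [R(s') + GAMMA * V(s')] until the largest change is below tol."""
    V = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            if s in TERMINAL:  # the episode ends here, so the value stays 0
                continue
            best = max(TERMINAL.get(move(s, a), 0.0) + GAMMA * V[move(s, a)] for a in ACTIONS)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


if __name__ == "__main__":
    V = value_iteration()
    for r in range(ROWS):
        print(" ".join(f"{V.get((r, c), float('nan')):6.2f}" for c in range(COLS)))
```

In this sketch the cell next to the goal gets value 1.0, and each further step away multiplies the value by the discount factor.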

Stochastic
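
A minimal sketch of the stochastic case, assuming a common "slippery" model: the intended move happens with probability 0.8 and the agent slips to either perpendicular direction with probability 0.1 each. The noise model, the 3x4 layout, the rewards and all names are illustrative assumptions; the repository may use a different transition model.

```python
# Hypothetical 3x4 layout and noise model for illustration; the repository's may differ.
ROWS, COLS = 3, 4
GOAL, TRAP, WALL = (0, 3), (1, 3), (1, 1)
TERMINAL = {GOAL: 1.0, TRAP: -1.0}  # reward received on entering the cell
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
SIDEWAYS = {"up": ("left", "right"), "down": ("left", "right"),
            "left": ("up", "down"), "right": ("up", "down")}
GAMMA = 0.9  # assumed discount factor
STATES = [(r, c) for r in range(ROWS) for c in range(COLS) if (r, c) != WALL]


def move(state, action):
    """Apply one move; hitting the border or the wall leaves the agent in place."""
    dr, dc = ACTIONS[action]
    nxt = (state[0] + dr, state[1] + dc)
    return nxt if nxt in STATES else state


def value_iteration(p_intended=0.8, tol=1e-6):
    """V(s) <- max_a sum_s' P(s'|s,a) * [R(s') + GAMMA * V(s')]."""
    p_slip = (1.0 - p_intended) / 2
    V = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            if s in TERMINAL:  # the episode ends here, so the value stays 0
                continue
            best = float("-inf")
            for a in ACTIONS:
                # Intended move plus the two perpendicular slips.
                outcomes = [(p_intended, move(s, a))]
                outcomes += [(p_slip, move(s, b)) for b in SIDEWAYS[a]]
                q = sum(p * (TERMINAL.get(s2, 0.0) + GAMMA * V[s2]) for p, s2 in outcomes)
                best = max(best, q)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


if __name__ == "__main__":
    V = value_iteration()
    for r in range(ROWS):
        print(" ".join(f"{V.get((r, c), float('nan')):6.2f}" for c in range(COLS)))
```

Calling `value_iteration(p_intended=1.0)` removes the noise and reduces the sketch to the deterministic case, which makes the two settings easy to compare.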

Q value iteration

Here we store a value for each state-action pair. It measures how good an action is: the reward for taking the action plus how much we can get from the state we land in.
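
A minimal sketch of Q value iteration on a deterministic grid. The table `Q` holds one entry per (state, action) pair, and each update adds the reward for the move to the discounted best Q value of the landing state. The 3x4 layout, the rewards, the discount factor and all names are assumptions for illustration, not the repository's code.

```python
# Hypothetical 3x4 layout for illustration; the repository's grid may differ.
ROWS, COLS = 3, 4
GOAL, TRAP, WALL = (0, 3), (1, 3), (1, 1)
TERMINAL = {GOAL: 1.0, TRAP: -1.0}  # reward received on entering the cell
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
GAMMA = 0.9  # assumed discount factor
STATES = [(r, c) for r in range(ROWS) for c in range(COLS) if (r, c) != WALL]


def move(state, action):
    """Deterministic transition: hitting the border or the wall leaves the agent in place."""
    dr, dc = ACTIONS[action]
    nxt = (state[0] + dr, state[1] + dc)
    return nxt if nxt in STATES else state


def q_value_iteration(tol=1e-6):
    """Q(s, a) <- R(s') + GAMMA * max_a' Q(s', a'), with s' = move(s, a)."""
    Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    while True:
        delta = 0.0
        for s in STATES:
            if s in TERMINAL:  # no action is taken from a terminal cell
                continue
            for a in ACTIONS:
                s2 = move(s, a)
                new_q = TERMINAL.get(s2, 0.0) + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
                delta = max(delta, abs(new_q - Q[(s, a)]))
                Q[(s, a)] = new_q
        if delta < tol:
            return Q


def greedy_action(Q, state):
    """Pick the action with the highest Q value in the given state."""
    return max(ACTIONS, key=lambda a: Q[(state, a)])


if __name__ == "__main__":
    Q = q_value_iteration()
    print(greedy_action(Q, (1, 2)), Q[((1, 2), "right")])
```

Storing Q values instead of state values means the best action can be read directly from the table, without looking one step ahead through the transition model.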

Policy evaluation

In this case, instead of taking the max value over all actions, we take the value of the action chosen by the defined policy.
So the only difference in the equation is the absence of max().

Run it with: python policy_evaluation.py
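
A minimal sketch of policy evaluation on a deterministic grid. It is not the content of policy_evaluation.py: the 3x4 layout, the rewards, the discount factor, the "always go right" policy and all names are assumptions for illustration. The update uses the action the policy prescribes, with no max over actions.

```python
# Hypothetical 3x4 layout for illustration; the repository's grid may differ.
ROWS, COLS = 3, 4
GOAL, TRAP, WALL = (0, 3), (1, 3), (1, 1)
TERMINAL = {GOAL: 1.0, TRAP: -1.0}  # reward received on entering the cell
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
GAMMA = 0.9  # assumed discount factor
STATES = [(r, c) for r in range(ROWS) for c in range(COLS) if (r, c) != WALL]

# Hypothetical fixed policy: always try to move right.
POLICY = {s: "right" for s in STATES}


def move(state, action):
    """Deterministic transition: hitting the border or the wall leaves the agent in place."""
    dr, dc = ACTIONS[action]
    nxt = (state[0] + dr, state[1] + dc)
    return nxt if nxt in STATES else state


def policy_evaluation(policy, tol=1e-6):
    """V(s) <- R(s') + GAMMA * V(s') with s' = move(s, policy[s]); no max over actions."""
    V = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            if s in TERMINAL:  # the episode ends here, so the value stays 0
                continue
            s2 = move(s, policy[s])
            new_v = TERMINAL.get(s2, 0.0) + GAMMA * V[s2]
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < tol:
            return V


if __name__ == "__main__":
    V = policy_evaluation(POLICY)
    for r in range(ROWS):
        print(" ".join(f"{V.get((r, c), float('nan')):6.2f}" for c in range(COLS)))
```

Under this policy the top row still reaches the goal, the cell left of the trap gets -1, and cells where "right" is blocked by the wall or the border stay at 0, so the values reflect the policy being followed rather than the best possible behaviour.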

About

Reinforcement learning implementation test with grid world visual interface
