I implemented Q-learning, policy iteration, and value iteration for an MDP environment. The algorithms follow the versions given in the book "Reinforcement Learning: An Introduction" by Sutton and Barto.
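As a rough illustration of two of the algorithms named above, here is a minimal sketch of value iteration and the tabular Q-learning update. It is not the code from this repository: the array layout (`P[s, a, s']`, `R[s, a]`), the function names, the hyperparameter defaults, and the two-state toy MDP are all assumptions made for the example.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, theta=1e-8):
    """Value iteration on a tabular MDP (hypothetical array layout).

    P[s, a, s2] is the probability of reaching s2 from s under action a.
    R[s, a] is the expected immediate reward for taking a in s.
    Returns the state values and a greedy policy.
    """
    n_states = P.shape[0]
    V = np.zeros(n_states)
    while True:
        # Bellman optimality backup for every (state, action) pair.
        Q = R + gamma * (P @ V)          # shape: (n_states, n_actions)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < theta:
            return V_new, Q.argmax(axis=1)
        V = V_new


def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step (off-policy TD control), in place."""
    td_target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q


# Toy MDP (an assumption for this sketch): state 1 is absorbing.
# In state 0, action 0 stays put (reward 0), action 1 moves to state 1 (reward 1).
P = np.zeros((2, 2, 2))
P[0, 0, 0] = 1.0
P[0, 1, 1] = 1.0
P[1, :, 1] = 1.0
R = np.array([[0.0, 1.0],
              [0.0, 0.0]])

V, policy = value_iteration(P, R)

# A single Q-learning update from the same transition (s=0, a=1, r=1, s'=1).
Q_table = q_learning_update(np.zeros((2, 2)), s=0, a=1, r=1.0, s_next=1)
```

Value iteration needs the full model (`P` and `R`), while the Q-learning update only needs one sampled transition at a time, which is the main practical difference between the two approaches.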
w1nte/reinforcement-learning-presentation
About
Example code for a presentation about reinforcement learning (RL).