
Numpy & Keras based re-implementation of basic RL-algorithms: DP, VI, PI, SARSA, Q-Learning, DQN


tvjoseph/RL-algorithms

 
 


Reinforcement Learning Algorithms Implementations

Coding assignments for KTH Reinforcement Learning (EL2805), 2019. Like all my other repos, this is more an exercise for me to understand the algorithms than useful code. Hope it also helps you!

LAB 1

Dynamic Programming in a finite-horizon, fully observable stochastic MDP

Agent (green) escaping (blue) a maze with walls (black) while a monster (red), which follows a uniform random walk and can cross walls, roams the maze: code
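As a rough illustration of the idea, here is a minimal sketch of finite-horizon dynamic programming (backward induction) on a generic tabular MDP. It is not the code from this repo: the function name `backward_induction`, the array layouts `P[a, s, s']` and `R[s, a]`, and the two-state toy MDP are all assumptions made for the example.

```python
import numpy as np


def backward_induction(P, R, horizon):
    """Finite-horizon DP on a tabular MDP (hypothetical helper).

    P: array (n_actions, n_states, n_states), P[a, s, s2] = Pr(s2 | s, a)
    R: array (n_states, n_actions), immediate reward for taking a in s
    Returns V of shape (horizon + 1, n_states) and a time-dependent
    greedy policy of shape (horizon, n_states).
    """
    n_actions, n_states, _ = P.shape
    V = np.zeros((horizon + 1, n_states))  # V[horizon] = 0: nothing left to collect
    policy = np.zeros((horizon, n_states), dtype=int)
    for t in range(horizon - 1, -1, -1):
        # Q[s, a] = R[s, a] + sum over s2 of P[a, s, s2] * V[t + 1, s2]
        Q = R + np.einsum("ast,t->sa", P, V[t + 1])
        V[t] = Q.max(axis=1)
        policy[t] = Q.argmax(axis=1)
    return V, policy


# Toy MDP (an assumption, unrelated to the maze): state 1 pays 1 per step
# and is absorbing; from state 0, action 1 moves to state 1, action 0 stays.
P_toy = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay put
    [[0.0, 1.0], [0.0, 1.0]],  # action 1: go to (or stay in) state 1
])
R_toy = np.array([[0.0, 0.0], [1.0, 1.0]])

V_toy, policy_toy = backward_induction(P_toy, R_toy, horizon=3)
print(V_toy[0], policy_toy[0])
```

The policy depends on the time step because the value of a state changes as the remaining horizon shrinks, which is the main difference from the infinite-horizon methods further down.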

Value Iteration in an infinite-horizon, fully observable stochastic MDP

Agent (green) robbing banks (blue) while evading the police (red), which follows a random walk that never moves away from the agent: code
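For reference, discounted value iteration can be sketched as below. This is an illustrative version under assumptions, not the repo's implementation: the name `value_iteration`, the array layouts, the discount `gamma=0.9`, the stopping tolerance, and the toy MDP are all made up for the example.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Discounted value iteration on a tabular MDP (hypothetical helper).

    P: array (n_actions, n_states, n_states), P[a, s, s2] = Pr(s2 | s, a)
    R: array (n_states, n_actions), immediate reward for taking a in s
    Returns the value estimate V and a greedy stationary policy.
    """
    n_states = P.shape[1]
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Bellman optimality backup for every (state, action) pair
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    return V, Q.argmax(axis=1)


# Toy MDP (an assumption, unrelated to the bank scenario): state 1 pays 1
# per step and is absorbing; from state 0, action 1 moves to state 1.
P_toy = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay put
    [[0.0, 1.0], [0.0, 1.0]],  # action 1: go to (or stay in) state 1
])
R_toy = np.array([[0.0, 0.0], [1.0, 1.0]])

V_toy, policy_toy = value_iteration(P_toy, R_toy, gamma=0.9)
print(V_toy, policy_toy)
```

In this toy case the fixed point can be checked by hand: staying in state 1 forever is worth 1 / (1 - 0.9) = 10, and state 0 is worth 0.9 * 10 = 9.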

SARSA (following an epsilon-greedy policy) in an infinite-horizon, non-observable stochastic MDP

Policy learned by the agent for every Police (red) position: code
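A minimal sketch of tabular SARSA with an epsilon-greedy behaviour policy might look as follows. Everything here is an assumption for illustration: the helper names, the `step(state, action, rng)` interface, the constant step size `alpha`, the other hyperparameters, and the two-state toy environment are not taken from this repo.

```python
import numpy as np


def epsilon_greedy(Q, state, epsilon, rng):
    """With probability epsilon pick a uniform random action, else the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(Q.shape[1]))
    return int(np.argmax(Q[state]))


def sarsa(step, n_states, n_actions, n_steps=20_000,
          alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """On-policy TD control on a continuing task (hypothetical helper).

    step(state, action, rng) must return (next_state, reward).
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    state = 0
    action = epsilon_greedy(Q, state, epsilon, rng)
    for _ in range(n_steps):
        next_state, reward = step(state, action, rng)
        # SARSA bootstraps from the action that will actually be taken next
        next_action = epsilon_greedy(Q, next_state, epsilon, rng)
        td_target = reward + gamma * Q[next_state, next_action]
        Q[state, action] += alpha * (td_target - Q[state, action])
        state, action = next_state, next_action
    return Q


def toy_step(state, action, rng):
    """Toy dynamics (an assumption): the action chosen is the next state,
    and staying in state 1 (action 1 from state 1) pays a reward of 1."""
    reward = 1.0 if (state == 1 and action == 1) else 0.0
    return action, reward


Q_toy = sarsa(toy_step, n_states=2, n_actions=2)
print(Q_toy)
print(Q_toy.argmax(axis=1))  # greedy policy read off the learned Q-table
```

Because the update uses the action the epsilon-greedy policy really selects, the learned values reflect the cost of exploration, so they would typically sit below the optimal values for the same problem.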

Q-Learning (from a uniform policy) in an infinite-horizon, non-observable stochastic MDP

Agent (green) again robbing banks (blue) while evading the police (red), which follows a random walk: code
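Off-policy Q-learning driven by a uniformly random behaviour policy can be sketched as below. As before, this is only an illustration under assumptions: the function names, the `step(state, action, rng)` interface, the constant step size, the number of steps, and the toy environment are hypothetical and not the repo's code.

```python
import numpy as np


def q_learning_uniform(step, n_states, n_actions, n_steps=20_000,
                       alpha=0.1, gamma=0.9, seed=0):
    """Tabular Q-learning with a uniform behaviour policy (hypothetical helper).

    step(state, action, rng) must return (next_state, reward).
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    state = 0
    for _ in range(n_steps):
        # Behaviour policy: every action is equally likely, whatever Q says
        action = int(rng.integers(n_actions))
        next_state, reward = step(state, action, rng)
        # Target policy is greedy: bootstrap from the best next action
        td_target = reward + gamma * Q[next_state].max()
        Q[state, action] += alpha * (td_target - Q[state, action])
        state = next_state
    return Q


def toy_step(state, action, rng):
    """Toy dynamics (an assumption): the action chosen is the next state,
    and staying in state 1 (action 1 from state 1) pays a reward of 1."""
    reward = 1.0 if (state == 1 and action == 1) else 0.0
    return action, reward


Q_toy = q_learning_uniform(toy_step, n_states=2, n_actions=2)
print(Q_toy)
print(Q_toy.argmax(axis=1))  # greedy policy read off the learned Q-table
```

Since the behaviour policy never consults the Q-table, exploration is guaranteed, while the `max` in the target lets the table approach the optimal action values. In this deterministic toy problem those can be worked out by hand as 8.1 and 9 for state 0, and 8.1 and 10 for state 1.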

