Least Squares Policy Iteration in Python
stober/lspi
Author: Jeremy Stober
Contact: stober@gmail.com
Version: 0.1

This is a Python implementation of Least Squares Policy Iteration (LSPI) from Lagoudakis and Parr, JMLR (2003).

The code requires an environment that provides a function phi for generating features from state-action pairs and a function linear_policy for evaluating the policy. The gridworld package (https://github.com/stober/gridworld) provides example environments.

Both lspi.py and lstdq.py contain example code using a simple chainworld environment from the original paper (included in the gridworld package).
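To make the roles of phi and linear_policy concrete, here is a minimal, self-contained sketch of the LSPI loop on a tiny chain. It is not this repository's code or API. The signatures phi(s, a) and linear_policy(w, s), the helper names step, lstdq and lspi, the constants, and the deterministic 4-state chain with a reward for reaching the rightmost state are all assumptions made for illustration. The chain in the original paper is noisy and uses different rewards.

```python
import numpy as np

# Hypothetical problem constants (not taken from the repository).
N_STATES, N_ACTIONS, GAMMA = 4, 2, 0.9


def step(s, a):
    """Assumed deterministic chain: action 0 moves left, action 1 moves right."""
    s_next = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == N_STATES - 1 else 0.0
    return s_next, reward


def phi(s, a):
    """One-hot features over (state, action) pairs (assumed signature)."""
    f = np.zeros(N_STATES * N_ACTIONS)
    f[s * N_ACTIONS + a] = 1.0
    return f


def linear_policy(w, s):
    """Greedy action under Q(s, a) = w . phi(s, a) (assumed signature)."""
    return int(np.argmax([w @ phi(s, a) for a in range(N_ACTIONS)]))


def lstdq(samples, w):
    """Evaluate the policy that is greedy with respect to w by solving A w' = b."""
    k = N_STATES * N_ACTIONS
    A = np.zeros((k, k))
    b = np.zeros(k)
    for s, a, r, s_next in samples:
        f = phi(s, a)
        f_next = phi(s_next, linear_policy(w, s_next))
        A += np.outer(f, f - GAMMA * f_next)
        b += r * f
    # Least-squares solve, which also tolerates a rank-deficient A.
    return np.linalg.lstsq(A, b, rcond=None)[0]


def lspi(samples, max_iter=20, tol=1e-6):
    """Alternate policy evaluation (lstdq) and greedy improvement until w settles."""
    w = np.zeros(N_STATES * N_ACTIONS)
    for _ in range(max_iter):
        w_new = lstdq(samples, w)
        if np.linalg.norm(w_new - w) < tol:
            return w_new
        w = w_new
    return w


# One sample (s, a, r, s_next) for every state-action pair.
samples = []
for s in range(N_STATES):
    for a in range(N_ACTIONS):
        s_next, r = step(s, a)
        samples.append((s, a, r, s_next))

weights = lspi(samples)
policy = [linear_policy(weights, s) for s in range(N_STATES)]
print(policy)
```

Because the features here are one-hot and every state-action pair is sampled once, the sketch reduces to exact tabular policy iteration, so the greedy policy should settle on moving right in every state. With coarser features (for example, polynomial features of the state), the same loop yields only an approximation of the value function.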