
iFQI

iFQI is a toolkit for solving Reinforcement Learning problems using Fitted Q-Iteration.


Installation

You can perform a minimal install of ifqi with:

git clone https://github.com/teopir/ifqi.git
cd ifqi
pip install -e .

Installing everything

To install the whole set of features, you will need additional packages installed. You can install everything by running pip install -e '.[all]'.

What's new

  • 2016-XX-YY: Initial release

How to set up and run an experiment

Prepare a configuration file

First of all, you need a JSON file that describes how the experiment will be performed. Running the experiment again with the same JSON configuration file gives exactly the same outcome. In the configuration file you define:

  1. The environment.
  2. Which regressors to use (you can define several in the same experiment, with different parameters).
  3. The number of datasets used for learning. The datasets are generated from the selected environment with a random policy.
  4. The dataset sizes: a size is the number of episodes used to compose a dataset. If you specify more than one size, the experiment generates n datasets of size1, n datasets of size2, and so on.
  5. The number of repetitions: with 1 repetition, each regressor runs the learning procedure only once. Since fitting a regressor is often stochastic, you can run the learning procedure several times over the same dataset and collect all the outcomes.
  6. How often to evaluate the learning procedure (that is, every how many iterations the policy found so far is run and its scores are collected), and how many episodes to run at each evaluation. If the environment is deterministic, one episode is enough; if it is stochastic, it is a good idea to evaluate the policy more than once.
  7. The number of FQI iterations (this depends on how many iterations you believe are needed to find a good policy). With many iterations (say 100), it is wise not to evaluate the policy at every iteration but, for example, once every 5 iterations, because evaluating a policy is expensive.
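
The choices above determine how much work a single experiment does. As a rough sketch, the snippet below counts the learning runs and the evaluated iterations for one configuration. The dictionary keys are hypothetical names chosen for illustration (the real schema is whatever examples/jsonWriter.py writes), and it assumes evaluation happens at iterations 5, 10, 15 and so on.

```python
# Hypothetical configuration: these key names are illustrative assumptions,
# not the schema produced by examples/jsonWriter.py.
config = {
    "regressors": ["regressorA", "regressorB"],  # placeholder names
    "n_datasets": 3,               # datasets generated per size
    "dataset_sizes": [100, 500],   # episodes per dataset
    "repetitions": 2,              # learning runs per (regressor, dataset)
    "fqi_iterations": 100,
    "evaluate_every": 5,           # evaluate once every 5 iterations
    "evaluation_episodes": 10,     # episodes run at each evaluation
}


def count_runs(cfg):
    """Number of independent learning runs in one experiment."""
    return (len(cfg["regressors"]) * cfg["n_datasets"]
            * len(cfg["dataset_sizes"]) * cfg["repetitions"])


def evaluated_iterations(cfg):
    """1-based iterations at which the policy is evaluated (assumed schedule)."""
    step = cfg["evaluate_every"]
    return list(range(step, cfg["fqi_iterations"] + 1, step))


runs = count_runs(config)
evals = evaluated_iterations(config)
print(runs)        # 2 * 3 * 2 * 2 = 24 learning runs
print(len(evals))  # 20 evaluations per run
print(runs * len(evals) * config["evaluation_episodes"])  # 4800 evaluation episodes
```

With these numbers, evaluating at every iteration instead of every fifth would raise the evaluation episodes from 4800 to 24000, which is why a sparse evaluation schedule is suggested for long runs.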

To generate the configuration file, just call

python examples/jsonWriter.py

and follow the instructions. You will then be asked for the name of the file: just type the name you wish, for example "configurations/myConfFile.json". The folder "configurations" in this case does not need to exist: the program creates it automatically.
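
For readers who write configuration files from their own scripts, the same "create the folder if it is missing" behavior can be sketched as follows. This is not the code of jsonWriter.py; save_config is a hypothetical helper, and the configuration content is a placeholder.

```python
import json
import os


def save_config(config, path):
    """Write config as JSON, creating the parent folder if it is missing.

    Hypothetical helper for illustration; requires Python 3 (exist_ok).
    """
    folder = os.path.dirname(path)
    if folder:  # empty string when the path has no directory component
        os.makedirs(folder, exist_ok=True)
    with open(path, "w") as f:
        json.dump(config, f, indent=2)


if __name__ == "__main__":
    # Placeholder content; the real keys come from examples/jsonWriter.py.
    save_config({"placeholder": True}, "configurations/myConfFile.json")
```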

Run the experiment

All you have to do is prepare the JSON file as described above, and then call

python examples/experimentThreadManager.py experimentName configFilename threadNumber -d -l

experimentName is the name that you give to your experiment; it is also the name of the folder where the outcome of the experiment is saved. It is a good idea to use the same name as the JSON file to avoid confusion (a copy of the configuration file is saved inside the folder in any case). configFilename is the name of the configuration file described above.

As its name suggests, ExperimentThreadManager can run several threads to exploit thread-level parallelism. Set threadNumber to 1 if you do not want parallelism, or to more than one to use several threads (ideally, set this parameter to the number of cores you would like to use).

The parameter -d or --diary is optional. If you set it, your experiment is tracked in a special JSON file, "diary.json". If you only want to run an experiment as a trial, do not set -d; if you want to keep track of it and then see it on a nice HTML page, set -d.

The parameter -l or -addLast has a meaning only if -d is present: -l merges the current experiment with the last experiment you performed. This makes it possible to produce plots that compare performance on different environments, even though a single experiment runs only one environment. If you set -l, the outcome of your experiment is saved in experimentName, but you will not see experimentName in your diary, because the experiment is merged with the previous one (you can merge more than two experiments).
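
The command-line interface described above can be summarized with an argparse sketch. This is an illustration of the documented arguments, not the actual parser in experimentThreadManager.py: the long option spelling --addLast and the choice to reject -l without -d are assumptions.

```python
import argparse


def build_parser():
    # Sketch only: mirrors the arguments described in this README.
    parser = argparse.ArgumentParser(prog="experimentThreadManager.py")
    parser.add_argument("experimentName",
                        help="experiment name, also used as the output folder")
    parser.add_argument("configFilename",
                        help="path to the JSON configuration file")
    parser.add_argument("threadNumber", type=int,
                        help="number of threads (1 means no parallelism)")
    parser.add_argument("-d", "--diary", action="store_true",
                        help="track the experiment in diary.json")
    # Assumption: the long form is spelled --addLast here.
    parser.add_argument("-l", "--addLast", action="store_true",
                        help="merge with the last experiment (needs -d)")
    return parser


def parse(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: treat -l without -d as an error, since -l only has
    # a meaning when the diary is enabled.
    if args.addLast and not args.diary:
        parser.error("-l has a meaning only if -d is present")
    return args


args = parse(["myConfFile", "configurations/myConfFile.json", "4", "-d"])
print(args.threadNumber, args.diary, args.addLast)  # 4 True False
```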

You can visualize the progress of your current experiment in real time by typing in the console

python examples/monitor.py

monitor.py asks only for the experimentName. Remember to run this script after your experiment has completed the first FQI iteration. This issue will be fixed, and a legend and confidence intervals will be added to the plot in the future.

How to plot the outcomes, add them to the diary, and generate diary.html

Your experiment has finished and you would like to generate some plots. The class

python examples/variableLoadSave.py

will help you retrieve your data and plot the graphs you wish. If you want a very generic plot (or want to see how to make a plot), run

python examples/plot.py

plot.py asks for the name of your experiment and generates a very generic plot. It also asks whether you want to add the plot to the diary; if so, you are asked to enter a short comment. If you would like to add a plot or picture of your own to the diary, open diary.json. It will look something like this:

[{"jsonFile": ["configuration/Config1.json"], "name": "Experiment1", 
"importance": "1", "images": [], "postComment": "", "date": "12-10-2016 15:31:44", "description": "boh1"},
{"jsonFile": ["configuration/Config2.json"], "name": "Experiment2", 
"importance": "1", "images": [], "postComment": "", "date": "12-10-2016 15:31:44", "description": "boh1"}]

If you need to add a picture to Experiment2, save your picture as plot/Experiment2/picture.jpg and modify the JSON as follows:

[{"jsonFile": ["configuration/Config1.json"], "name": "Experiment1", 
"importance": "1", "images": [], "postComment": "", "date": "12-10-2016 15:31:44", "description": "boh1"},
{"jsonFile": ["configuration/Config2.json"], "name": "Experiment2", 
"importance": "1", "images": [{"dir":"plot/Experiment2/picture.jpg","description":"picture1 ect","title":"picture1"}], "postComment": "", 
"date": "12-10-2016 15:31:44", "description": "boh1"}]

As you can see, you are not forced to save your picture in plot/ExperimentName/name.jpg, but we strongly suggest doing so, and naming the file after the title reported in your JSON.
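
Instead of editing diary.json by hand, the same change can be made with a few lines of Python. The sketch below relies only on the keys visible in the example above ("name", "images", "dir", "description", "title"); add_image and add_image_to_file are hypothetical helpers, not part of iFQI.

```python
import json


def add_image(diary, experiment_name, path, title, description=""):
    """Append an image entry to the named experiment; return True if found."""
    for entry in diary:
        if entry["name"] == experiment_name:
            entry["images"].append(
                {"dir": path, "description": description, "title": title})
            return True
    return False


def add_image_to_file(diary_path, experiment_name, path, title,
                      description=""):
    """Load diary.json, add the image, and write the file back if it changed."""
    with open(diary_path) as f:
        diary = json.load(f)
    found = add_image(diary, experiment_name, path, title, description)
    if found:
        with open(diary_path, "w") as f:
            json.dump(diary, f)
    return found


# In-memory example with the same structure as diary.json
diary = [
    {"name": "Experiment1", "images": []},
    {"name": "Experiment2", "images": []},
]
add_image(diary, "Experiment2", "plot/Experiment2/picture.jpg",
          title="picture1", description="first picture")
print(diary[1]["images"][0]["dir"])  # plot/Experiment2/picture.jpg
```

To update the real file, call add_image_to_file("diary.json", "Experiment2", "plot/Experiment2/picture.jpg", "picture1") from the folder that contains diary.json.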

To generate diary.html, just type:

python examples/doc.py

provided that diary.json is in your folder.
