This repository contains the code associated with the Deakin SIT215 Investigating Reinforcement Learning Project.
Three OpenAI Gym environments were used:

- Taxi
- Cartpole
- Frozen Lake

Three agents have been developed:

- Random Agent
- QLearner Agent
- TDLearner Agent
Temporal difference (TD) learning is a family of reinforcement learning methods that includes Q-Learning. The TDLearner in this project is an implementation of the SARSA algorithm.
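For context, here is a minimal sketch of the two update rules (illustrative variable names, not the project's code). SARSA is on-policy: it bootstraps from the action the policy actually takes next. Q-Learning is off-policy: it bootstraps from the greedy next action.

```python
# Illustrative sketch of the SARSA and Q-Learning updates (not the
# project's code). Q maps (state, action) -> value, e.g. a
# collections.defaultdict(float); alpha is the learning rate and
# gamma the discount factor.

def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma):
    # On-policy: use the value of the action actually taken in s_next.
    td_target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (td_target - Q[(s, a)])

def q_learning_update(Q, s, a, r, s_next, actions, alpha, gamma):
    # Off-policy: use the value of the greedy action in s_next.
    td_target = r + gamma * max(Q[(s_next, a_)] for a_ in actions)
    Q[(s, a)] += alpha * (td_target - Q[(s, a)])
```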
Training and evaluation runs for combinations of agent and environment are marshalled through a driver; a generic sketch of such a driver loop follows the list of runs below.
The following runs are available:
- Taxi, Random Agent
- Taxi, QLearner Agent
- Cartpole, Random Agent
- Cartpole, QLearner Agent
- Cartpole, TDLearner Agent
- Frozen Lake, Random Agent
- Frozen Lake, QLearner Agent
- Frozen Lake, TDLearner Agent
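As mentioned above, a driver of this kind typically trains the agent over many episodes and then evaluates the learned policy. The sketch below shows the general shape of such a loop; the `run` function and the `agent.act`/`agent.learn` interface are hypothetical, and the step API matches the gym 0.10.x release pinned in this project, so the repository's actual driver will differ in detail.

```python
# Hypothetical sketch of a train-then-evaluate driver (not the repository's
# actual code). Assumes the gym 0.10.x API: step() returns a 4-tuple.
import gym

def run(env_name, agent, train_episodes=10000, eval_episodes=100):
    env = gym.make(env_name)
    # Training phase: the agent updates its estimates after every step.
    for _ in range(train_episodes):
        state, done = env.reset(), False
        while not done:
            action = agent.act(state)  # e.g. epsilon-greedy
            next_state, reward, done, _ = env.step(action)
            agent.learn(state, action, reward, next_state, done)
            state = next_state
    # Evaluation phase: run the learned policy without further updates.
    totals = []
    for _ in range(eval_episodes):
        state, done, total = env.reset(), False, 0.0
        while not done:
            state, reward, done, _ = env.step(agent.act(state))
            total += reward
        totals.append(total)
    return totals
```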
This project was built on macOS in Python 3. Dependency management was simplified using pipenv.
Follow the instructions below to set up this project locally:
- Clone or download this repository to your local machine using the green Clone or download button on GitHub
- Install Python 3 using `brew install python3`
- Install pipenv using `brew install pipenv`
- Install all remaining dependencies using `pipenv install`
A list of project dependencies is contained in the `Pipfile`.
If you are on another platform, refer to the Python download instructions for your OS.
Further assistance is available in this guide.
Pip3 is likely installed with Python, depending on your platform. Note, however, that pip3 cannot read a Pipfile directly: its `-r` flag expects the requirements.txt format, so `pip3 install -r Pipfile` will not work. If you want to use pip3, first export the dependencies with pipenv (`pipenv requirements > requirements.txt` on recent pipenv releases, or `pipenv lock -r > requirements.txt` on older ones), then run `pip3 install -r requirements.txt`.
If you prefer to install dependencies individually, the specific version of each required package is listed in the `Pipfile.lock`, e.g.:

```json
"gym": {
    "hashes": [
        "sha256:6baf3f3b163e237869d92a64daeaa88f14f62bb1105863e45312505a19dbd652"
    ],
    "index": "pypi",
    "version": "==0.10.5"
}
```

Pip3 can install individual dependencies using `pip3 install gym==0.10.5`.
Alternatively, if you are using a packaged distribution such as Anaconda, use your package management tool to install the relevant version of each required dependency, referring to the Pipfile and Pipfile.lock.
To verify installation, a test tool has been provided. To run it using pipenv, from the root of the project use `pipenv run python3 demos/environment_test.py`. To run it without pipenv, use `python3 demos/environment_test.py`.
If your environment is set up correctly, you will see a window with a random agent operating the cartpole environment and observations from the environment printed to the console.
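For reference, the verification amounts to a random rollout of the cartpole environment. A minimal equivalent, assuming the gym 0.10.x API pinned in the Pipfile.lock (the actual `demos/environment_test.py` may differ), looks like this:

```python
# Minimal smoke test: a random agent in CartPole, rendering each frame and
# printing observations. Not necessarily identical to demos/environment_test.py.
import gym

env = gym.make('CartPole-v0')
observation = env.reset()
for _ in range(200):
    env.render()                          # opens the cartpole window
    action = env.action_space.sample()    # random agent
    observation, reward, done, info = env.step(action)
    print(observation)                    # observations printed to console
    if done:
        observation = env.reset()
env.close()
```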
To use this project, first choose which combination of environment and agent you would like to see training and evaluation results for. To set this, edit the `run.py` script in the root of the project: uncomment the function that triggers the run you are interested in.

Then, to begin the run using pipenv, call `pipenv run python3 run.py`. Or, without pipenv, `python3 run.py`.
You will see a progress percentage printed to the console as the agent trains.
Once the agent has finished training, training results are presented alongside evaluation results in two graphs. The graphs plot the total reward accumulated per episode over time, reflecting the agent's progress as training advances.
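For reference, plots of this kind take only a few lines of matplotlib; the sketch below is illustrative and is not the project's actual plotting code.

```python
# Illustrative sketch (not the project's code): plot total reward per
# episode for the training and evaluation phases side by side.
import matplotlib.pyplot as plt

def plot_rewards(training_rewards, evaluation_rewards):
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(training_rewards)
    ax1.set(title='Training', xlabel='Episode', ylabel='Total reward')
    ax2.plot(evaluation_rewards)
    ax2.set(title='Evaluation', xlabel='Episode', ylabel='Total reward')
    plt.show()
```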
Once you close this window, a demonstration of the trained agent will run. How this looks depends on the particular environment's render method.
You can either close this window when you are done, or enter `Y` at the command prompt to see another demo. Enter any other character to exit.
An interactive shell is also available (if using pipenv) with dependencies preloaded: `pipenv shell`.