Overview

In this project you had to train two agents of tennis players, that try to keep the ball in the game as long as possible.

The base for this project and algorithm was layed down by the Udacity team from the DRLND, which we were allowed to use and to adapt. I followed the tip of the team to use the DDPG algorithm with a shared ReplayBuffer and a shared Actor Network.

Setup

To start, you need Python 3.6, PyTorch, Numpy, Matplotlib, and the Unity ML-Agents Toolkit.

With Anaconda or virtualenv you can create your python environment like: $conda create -n drlnd python=3.6 pytorch matplotlib numpy

For the Unity ML-Agent you need to download the Toolkit (https://github.com/Unity-Technologies/ml-agents) go to the ml-agents/python directory and install it via: $pip install ml-agents/python

Instructions

Watch the trained agent with: $ python watch.py

and let the agent train again with: $ python train.py

Environment

The observation space for each agent consists of 8 values and the action space for each agent consists of 2 values, which are both continuous. The Environment is considered solved, after taking the maximum score of each player for each episode and then taking the mean of the last 100 scores reaches 0.5 or higher for one episode.

'Graph showing me solving the environment'

This algorithm even succeeded 0.7 over 100 consecutive episodes once and then bounced around the 0.5 mark. The environment was first considered solved at around 700 episodes.

Architecture and Algorithms

Can be viewed in the "REPORT.md" file next to this file.

Future

The agent so far is pretty good in a very short learning cycle. Although additional improvements can be made, like the Generalized Advantage Estimation(GAE) to improve the prediction of the expected return.

Name		Name	Last commit message	Last commit date
Latest commit History 19 Commits
Tennis_Linux		Tennis_Linux
checkpoints		checkpoints
drl		drl
.gitignore		.gitignore
README.md		README.md
REPORT.md		REPORT.md
train.py		train.py
watch.py		watch.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tennis_Linux

Tennis_Linux

checkpoints

checkpoints

drl

drl

.gitignore

.gitignore

README.md

README.md

REPORT.md

REPORT.md

train.py

train.py

watch.py

watch.py

Repository files navigation

Overview

Setup

Instructions

Environment

Architecture and Algorithms

Future

About

Releases

Packages

Languages

AndreHenkel/DRLND_Tennis

Folders and files

Latest commit

History

Repository files navigation

Overview

Setup

Instructions

Environment

Architecture and Algorithms

Future

About

Resources

Stars

Watchers

Forks

Languages