flow_RL

Reinforcement Learning Practice for Multi- and Single-Agent Autonomous Vehicles

Requirements (Installation)

Documentation for Flow

- English version: https://drive.google.com/file/d/1NQRoCFgfIh34IJh4p0GqqOWagZh543X2/view?usp=sharing

- Korean version: https://drive.google.com/file/d/1BUStOlq8LRypEmwXfRLD-_Xd04wnmCwL/view?usp=sharing

How to Install the Requirements

Anaconda (Python 3) installation:

  • Prerequisites:
    sudo apt-get install libgl1-mesa-glx libegl1-mesa libxrandr2 libxss1 libxcursor1 libxcomposite1 libasound2 libxi6 libxtst6
  • Installation (for x86 systems): In your browser, download the Anaconda installer for Linux (from https://anaconda.com/ ), then run the installer script:
bash ~/Downloads/Anaconda3-2020.02-Linux-x86_64.sh

We recommend answering 'yes' when the installer asks whether to run conda init.
After installation is done, close and open your terminal again.
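To verify that conda was installed (an optional sanity check; any recent version string confirms it):

conda --version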

Flow installation

Download the Flow GitHub repository:

    git clone https://github.com/flow-project/flow.git
    cd flow

Create a conda environment and install Flow and its dependencies within the environment:

    conda env create -f environment.yml
    conda activate flow
    python setup.py develop

For Ubuntu 18.04, the following script installs SUMO for simulation:

scripts/setup_sumo_ubuntu1804.sh

To check the SUMO installation, run:

    which sumo
    sumo --version
    sumo-gui

(If SUMO is installed correctly, a simulation window pops up when sumo-gui runs.)

  • Testing your SUMO and Flow installation
    conda activate flow
    python simulate.py ring

PyTorch installation

Install torch 1.6.0 or later (torchvision: 0.7.0):

conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
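To confirm the installed versions (an optional check):

python -c "import torch, torchvision; print(torch.__version__, torchvision.__version__)"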

Ray RLlib installation

Install Ray 0.8.6 or later (0.8.7 recommended):

pip install -U ray==0.8.7
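
You can check the installed Ray version the same way (optional):

python -c "import ray; print(ray.__version__)"
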
  • Testing RLlib installation
    conda activate flow
    python train_rllib.py singleagent_ring

If RLlib is installed correctly, training begins and progress is printed to the terminal; once the iteration counter ("iter") shows 1, the installation works and you can close the terminal.

Visualizing with Tensorboard

To visualize the training progress:

tensorboard --logdir=~/ray_results/singleagent_ring

If TensorBoard is not installed, you can install it with pip: pip install tensorboard.

Download the flow-autonomous-driving repository

Download the files needed for training and visualization:

cd ~
git clone https://github.com/3neutronstar/flow-autonomous-driving.git

How to Use

RL examples

RLlib (for multiagent and single agent)

For the PPO (Proximal Policy Optimization) and DDPG (Deep Deterministic Policy Gradient) algorithms:

python train_rllib.py EXP_CONFIG --algorithm [DDPG or PPO]

where EXP_CONFIG is the name of the experiment configuration file, as located in the directory exp_configs/rl/singleagent.
For [DDPG or PPO], choose either the DDPG or the PPO algorithm (default: PPO).
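For example, to train the single-agent ring scenario used earlier with the DDPG algorithm:

python train_rllib.py singleagent_ring --algorithm DDPG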

Visualizing Training Results

If you want to visualize the results after training with RLlib (Ray):

  • First, run conda activate flow to activate the flow environment.
  • Second, run:
    python ~/flow-autonomous-driving/visualizer_rllib.py ~/ray_results/EXP_CONFIG/experiment_name number_of_checkpoints

experiment_name: the name of the folder created when training started.
number_of_checkpoints: the number of the checkpoint folder created inside the experiment_name folder; enter the number of the checkpoint you want to visualize.
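For example, assuming a hypothetical run folder PPO_ring_0 under the singleagent_ring results that contains a checkpoint_50 folder:

    python ~/flow-autonomous-driving/visualizer_rllib.py ~/ray_results/singleagent_ring/PPO_ring_0 50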

Training Results for the Ring Network and Figure-Eight Network

PPO (Proximal Policy Optimization)

  • Ring Network (ring length 220-270 for training)
    Mean velocity in a system of 22 non-AVs: 4.22 m/s (ring length: 260)
    Mean velocity in a system of 1 AV and 21 non-AVs: 4.67 m/s; mean cumulative reward: 2,350 (ring length: 260)
    Exploration method: stochastic sampling
    The reward converges to about 2,300, so we regard this experiment as successful.
  • Figure-Eight Network
    Mean velocity in a system of 14 non-AVs: 4.019 m/s (total length: 402)
    Mean velocity in a system of 1 AV and 13 non-AVs: 6.67 m/s (total length: 402)
    Exploration method: Gaussian noise
    The reward converges to about 19,000, so we regard this experiment as successful.
    The back-and-forth pattern in the reward graph is expected: it is caused by failures, and each failure gives the autonomous vehicle a penalty.

DDPG (Deep Deterministic Policy Gradient)

  • Ring Network (ring length 220-270 for training)
    Mean velocity in a system of 22 non-AVs: 4.22 m/s (ring length: 260)
    Mean velocity in a system of 1 AV and 21 non-AVs: 4.78 m/s; mean cumulative reward: 2,410 (ring length: 260)
    Exploration method: Ornstein-Uhlenbeck noise

  • Figure-Eight Network: results will be added.

non-RL examples

python simulate.py EXP_CONFIG

where EXP_CONFIG is the name of the experiment configuration file, as located in the directory exp_configs/non_rl.

If you want to run with options, use:

python simulate.py EXP_CONFIG --num_runs n --no_render --gen_emission
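For example, to run the ring scenario used earlier twice, without GUI rendering, while generating an emission output file:

python simulate.py ring --num_runs 2 --no_render --gen_emission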

[Images: Ring Network (left) and Figure-Eight Network (right)]

OSM Output (OpenStreetMap)

[Image: combined OSM network output]

OpenStreetMap: https://www.openstreetmap.org/

If you want to build a network from an OSM file, download a .osm file from OpenStreetMap and use it to replace the file of the same name (map.osm) in the 'Network' directory.
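A minimal sketch of that copy step, assuming the downloaded file is at ~/Downloads/map.osm and the 'Network' directory sits at the root of the flow-autonomous-driving checkout (both paths are assumptions):

cp ~/Downloads/map.osm ~/flow-autonomous-driving/Network/map.osm

To see the result, run: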

python simulate.py osm_test

After that, if you want to inspect the output (XML) files, you can find them at ~/flow/flow/core/kernel/debug/cfg/.net.cfg.

Contributors

Minsoo Kang, Gihong Lee, Hyeonju Lim
