Algorithms:
- DDDQN
- DDPG
- TD3 (DDPG with twin critics, delayed policy updates, and noise added to the target action; it is usually easier to use TD3 than to tune DDPG. See the sketch after this list)
- DQfD / DDPGfD
- APEX-DQN / DDPG / TD3 (asynchronous training) implemented with Ray
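
For reference, the three TD3 tricks reduce to the target computation below. This is a minimal NumPy sketch, not this repo's code; `actor_target` / `critic1_target` / `critic2_target` are hypothetical stand-ins for the target networks.

```python
import numpy as np

action_dim, action_bound = 7, 1.0

# Hypothetical target networks (stubs here; in TD3 these are slowly
# updated copies of the actor and of the two critics).
actor_target = lambda s: np.zeros(action_dim)
critic1_target = lambda s, a: 0.0
critic2_target = lambda s, a: 0.0

def td3_target(reward, next_state, done, gamma=0.99,
               noise_std=0.2, noise_clip=0.5):
    # Trick 1, target policy smoothing: clipped noise on the target action.
    noise = np.clip(np.random.normal(0.0, noise_std, action_dim),
                    -noise_clip, noise_clip)
    next_action = np.clip(actor_target(next_state) + noise,
                          -action_bound, action_bound)
    # Trick 2, twin critics: taking the min of both target Q-values
    # counteracts overestimation.
    q_next = min(critic1_target(next_state, next_action),
                 critic2_target(next_state, next_action))
    return reward + gamma * (1.0 - done) * q_next

# Trick 3, delayed updates: the actor and all target networks are updated
# only once every few critic updates (every 2 in the TD3 paper).
```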
Usage scenarios:
- offpolicy_train.py - trains a model in RozumEnv with any algorithm that inherits from the TDPolicy class
- bc_train.py - trains DQfD / DDPGfD; differs from offpolicy_train.py only in the pretraining phase (TODO: refactor inheritance of behaviour-cloning classes)
- apex_train.py / apex_bc.py - distributed versions of offpolicy_train.py and bc_train.py. They run Learner, Actor and Evaluator processes and support specifying resources for workers (see the sketch after this list)
- data_generate.py - distributed data collection
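
As an illustration of the resource definition, Ray lets each worker class declare the CPUs/GPUs it needs. The Learner/Actor classes below are simplified stand-ins, not the actual classes from apex_train.py:

```python
import ray

ray.init()

@ray.remote(num_gpus=1)   # pin the learner to a GPU (requires a visible GPU)
class Learner:
    def train_step(self):
        return "weights"        # placeholder for updated network weights

@ray.remote(num_cpus=1)   # one CPU per rollout actor
class Actor:
    def rollout(self, weights):
        return "transitions"    # placeholder for collected transitions

learner = Learner.remote()
actors = [Actor.remote() for _ in range(4)]
weights = ray.get(learner.train_step.remote())
batches = ray.get([a.rollout.remote(weights) for a in actors])
```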
Implementation details:
- Two buffer back-ends can be used in DQN and DDPG (a cpprb usage sketch follows the table):

| cpprb (C++ implementation of replay buffers) | stable_baselines |
|---|---|
| Fast | Slow |
| High memory usage unless memory-compression features (next_of, stack_compress, etc.) are enabled; in some cases these features can cause bugs | Memory-efficient due to LazyFrames in frame_stack and no duplication of state and (next_/n_)state |
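
A hedged sketch of the cpprb memory-compression features mentioned in the table (keys and sizes here are arbitrary): next_of stores next_obs as a view into obs instead of a full copy, and stack_compress shares memory between the overlapping frames of stacked observations.

```python
import numpy as np
from cpprb import ReplayBuffer

rb = ReplayBuffer(
    100_000,
    env_dict={"obs": {"shape": (64, 64, 4), "dtype": np.uint8},
              "act": {"shape": (7,)},
              "rew": {},
              "done": {}},
    next_of="obs",          # store next_obs as a view into obs
    stack_compress="obs")   # deduplicate overlapping stacked frames

obs = np.zeros((64, 64, 4), dtype=np.uint8)
rb.add(obs=obs, act=np.zeros(7), rew=0.0, next_obs=obs, done=0)
batch = rb.sample(32)       # dict of numpy arrays, incl. batch["next_obs"]
```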
- The wrapper for cpprb and stable_baselines buffers supports dict observation spaces such as:

```python
state = {'pov': {'shape': (64, 64, 3), 'dtype': 'uint8'},
         'angles': {'shape': (7,), 'dtype': 'float'}}
```
- The V-REP environment for the Rozum robot uses the PyRep API (instead of the original V-REP API)
- The observation type is an environment option ('pov', ('pov', 'angles'), 'angles', etc.)
- If dtype_dict is specified, sampling in the algorithms is wrapped with tf.data.Dataset.from_generator, which improves the update frequency (see the sketch below)
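
A minimal sketch of that wrapping, assuming a buffer whose sample() returns a dict of numpy arrays keyed like dtype_dict (the DummyBuffer below is a stand-in for a real replay buffer):

```python
import numpy as np
import tensorflow as tf

batch_size = 32
dtype_dict = {'state': 'float32', 'reward': 'float32'}

class DummyBuffer:               # stand-in for a real replay buffer
    def sample(self, n):
        return {'state': np.random.rand(n, 7).astype('float32'),
                'reward': np.random.rand(n, 1).astype('float32')}

buffer = DummyBuffer()

def sample_generator():
    while True:                  # endless stream of sampled batches
        yield buffer.sample(batch_size)

ds = tf.data.Dataset.from_generator(
    sample_generator,
    output_types={k: tf.as_dtype(v) for k, v in dtype_dict.items()})
ds = ds.prefetch(tf.data.AUTOTUNE)   # sampling overlaps with gradient steps

for batch in ds.take(3):             # in training this feeds the update loop
    print(batch['state'].shape)      # (32, 7)
```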
- There are several make_model functions in algorithms/model.py; they can be accessed through the get_network_builder(name) function. The *_uni functions can work with different combinations of observation spaces in RozumEnv: depending on the space, they build CNN and/or MLP blocks and concatenate them (a sketch follows).
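
A hedged sketch of what such a *_uni builder can look like (layer sizes are illustrative; the real builders live in algorithms/model.py):

```python
import tensorflow as tf

def make_uni_model(obs_space):
    """Build a CNN branch per image key and an MLP branch per vector key,
    then concatenate the resulting features (illustrative sketch)."""
    inputs, branches = {}, []
    for name, spec in obs_space.items():
        inp = tf.keras.Input(shape=spec['shape'], name=name)
        inputs[name] = inp
        if len(spec['shape']) == 3:   # image-like observation, e.g. 'pov'
            x = tf.keras.layers.Conv2D(32, 8, strides=4, activation='relu')(inp)
            x = tf.keras.layers.Conv2D(64, 4, strides=2, activation='relu')(x)
            x = tf.keras.layers.Flatten()(x)
        else:                         # vector observation, e.g. 'angles'
            x = tf.keras.layers.Dense(64, activation='relu')(inp)
        branches.append(x)
    features = (tf.keras.layers.Concatenate()(branches)
                if len(branches) > 1 else branches[0])
    return tf.keras.Model(inputs=inputs, outputs=features)

model = make_uni_model({'pov': {'shape': (64, 64, 3)},
                        'angles': {'shape': (7,)}})
model.summary()
```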