- 0856071 謝秉瑾
- 0856108 謝宗祐
- 0856160 洪鈺恆
In this project we will train multiple reinforcement learning agents in parallel to play Taiko.
Most reinforcement learning work uses the OpenAI Gym Atari environments. Because OpenAI Gym already handles each environment well, all we have to do there is collect data from each environment in parallel. In the real case, however, we have to control each environment in parallel ourselves and prevent problems such as data races, which makes it more complex than using the OpenAI Gym Atari environments.
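The idea of collecting data from several environments at once without data races can be sketched as follows. This is only a minimal illustration, not the project's actual code: `DummyEnv`, `ReplayBuffer`, `collect`, and `run_workers` are hypothetical names, the environment is a stand-in rather than the real taiko-web interface, and the policy is random. Each worker thread owns its own environment, and a lock guards the one structure that all threads share.

```python
import random
import threading
from collections import deque


class DummyEnv:
    """Hypothetical stand-in for one game instance (not the taiko-web API)."""

    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.steps = 0

    def reset(self):
        self.steps = 0
        return 0.0

    def step(self, action):
        self.steps += 1
        next_state = self.rng.random()
        reward = 1.0 if action == 1 else 0.0
        done = self.steps >= 10  # every episode lasts exactly 10 steps
        return next_state, reward, done


class ReplayBuffer:
    """Shared buffer; the lock serializes access from the worker threads."""

    def __init__(self, capacity):
        self._data = deque(maxlen=capacity)
        self._lock = threading.Lock()

    def add(self, transition):
        with self._lock:
            self._data.append(transition)

    def __len__(self):
        with self._lock:
            return len(self._data)


def collect(env, buffer, episodes):
    """Play `episodes` episodes in one environment with a random policy."""
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            action = env.rng.randint(0, 1)
            next_state, reward, done = env.step(action)
            buffer.add((state, action, reward, next_state, done))
            state = next_state


def run_workers(num_workers=4, episodes=3):
    """Start one thread per environment and wait for all of them."""
    buffer = ReplayBuffer(capacity=10_000)
    threads = [
        threading.Thread(target=collect, args=(DummyEnv(i), buffer, episodes))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return buffer


if __name__ == "__main__":
    # 4 workers * 3 episodes * 10 steps = 120 transitions
    print(len(run_workers()))
```

Because no environment is shared between threads, the buffer is the only place where synchronization is needed in this sketch. A real setup would also have to coordinate access to the model being trained.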
- Python threading
- TensorFlow
Use deep Q-learning (as used for playing Atari) and proximal policy optimization (PPO).
We can train agents in parallel and have them play Taiko.
- 10/29~11/13 Research taiko-web and reinforcement learning methods.
- 11/13~11/20 Choose a reinforcement learning method and implement a single-threaded version of the project.
- 11/20~11/27 Identify which parts of the code can be parallelized.
- 11/27~12/10 Implement parallel training for Taiko.
- 12/10~ Train agents.
- https://github.com/bui/taiko-web
- https://github.com/openai/baselines
- pip3 install -r requirements.txt
- Offline game
- Single-threaded run (4500 episodes, time to converge: 15:31~17:32)
- Multi-threaded run (3500 episodes, time to converge: 14:29~15:25)
- Play on taiko-web
- Final report
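
The two convergence times listed above can be compared with a little arithmetic. The sketch below assumes that the reported values are same-day wall-clock start and end times (HH:MM); the helper name `minutes_between` is hypothetical. Since the two runs used different episode counts, the wall-clock ratio and the episodes-per-minute ratio are shown separately.

```python
from datetime import datetime


def minutes_between(start, end):
    """Elapsed minutes between two same-day HH:MM clock times."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 60


# Assumption: the reported ranges are start and end clock times.
single_minutes = minutes_between("15:31", "17:32")  # 121.0
multi_minutes = minutes_between("14:29", "15:25")   # 56.0

# Ratio of wall-clock time until convergence.
wall_clock_ratio = single_minutes / multi_minutes

# Episodes processed per minute in each run.
single_rate = 4500 / single_minutes
multi_rate = 3500 / multi_minutes

print(single_minutes, multi_minutes)
print(round(wall_clock_ratio, 2))
print(round(multi_rate / single_rate, 2))
```

Under that assumption, the single-threaded run took 121 minutes and the multi-threaded run took 56 minutes.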