bandits

Solutions to the multi-armed bandit problems described in the book *Reinforcement Learning: An Introduction* by Sutton and Barto.
The following algorithms for learning optimal strategies are included (see Agents.py):

- epsilon-greedy algorithm
- softmax algorithm
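
As a rough illustration of the two action-selection rules listed above, here is a minimal sketch in plain Python. It is not taken from Agents.py: the function names, the `temperature` parameter, and the sample-average update are assumptions chosen for illustration, following the general textbook description of these methods.

```python
import math
import random


def epsilon_greedy_action(q_values, epsilon, rng=random):
    """With probability epsilon explore uniformly; otherwise pick a greedy arm."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    best = max(q_values)
    # Break ties among equally good arms at random.
    return rng.choice([a for a, q in enumerate(q_values) if q == best])


def softmax_probabilities(q_values, temperature):
    """Gibbs/Boltzmann distribution over arms; lower temperature is greedier."""
    top = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - top) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def softmax_action(q_values, temperature, rng=random):
    """Sample an arm in proportion to its softmax probability."""
    probs = softmax_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def update_estimate(q_values, counts, action, reward):
    """Incremental sample-average update: Q <- Q + (R - Q) / N."""
    counts[action] += 1
    q_values[action] += (reward - q_values[action]) / counts[action]
```

In a loop, an agent would call one of the selection functions to choose an arm, observe a reward from the bandit, and then call `update_estimate` to refine its value estimate for that arm. Setting `epsilon` to 0 gives a purely greedy agent, and a very large `temperature` makes the softmax choice close to uniform.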

Results:

![/graphs/nArmedBanditAvgRewardsComparison.png](/graphs/nArmedBanditAvgRewardsComparison.png?raw=true "varying epsilon in epsilon greedy: avg. reward vs iterations") ![/graphs/eGreedyvsSoftmax.png](/graphs/eGreedyvsSoftmax.png?raw=true "Epsilon-Greedy vs Softmax performance")

