MAB

This is a fork of Sub-Sampling-Dueling-Algorithms-Neurips20.

How to run experiments

The __main__.py file contains several blocks of code that can be executed directly. It relies on xp_helpers.py, which provides functions for running two types of experiments:

Frequentist experiments: the user defines a bandit model and performs a number of runs of each algorithm on that particular model.
Bayesian experiments: the user defines a prior distribution over bandit models and draws a number of problems from this distribution. Each bandit algorithm is then run once on each of these problems.
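
To make the difference between the two experiment types concrete, here is a minimal sketch. It does not use the functions in xp_helpers.py: the names run_uniform_policy, frequentist_experiment and bayesian_experiment are hypothetical, the policy is a placeholder that pulls arms uniformly at random, and the uniform prior on the means is an assumption chosen for illustration.

```python
import numpy as np

def run_uniform_policy(means, horizon, rng):
    """Placeholder policy: pull arms uniformly at random and return the
    cumulative (pseudo-)regret at the end of the horizon."""
    arms = rng.integers(len(means), size=horizon)
    return float(np.sum(np.max(means) - means[arms]))

def frequentist_experiment(means, n_runs, horizon, seed=0):
    """Fixed bandit model, many independent runs on that same model."""
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    return np.array([run_uniform_policy(means, horizon, rng)
                     for _ in range(n_runs)])

def bayesian_experiment(n_arms, n_problems, horizon, seed=0):
    """Many bandit models drawn from a prior, one run on each model."""
    rng = np.random.default_rng(seed)
    regrets = np.empty(n_problems)
    for i in range(n_problems):
        # Assumed prior for this sketch: arm means i.i.d. uniform on [0, 1].
        means = rng.uniform(0.0, 1.0, size=n_arms)
        regrets[i] = run_uniform_policy(means, horizon, rng)
    return regrets

freq_regrets = frequentist_experiment([0.2, 0.8], n_runs=5, horizon=100)
bayes_regrets = bayesian_experiment(n_arms=3, n_problems=5, horizon=100)
```

In the frequentist case the spread of the results reflects only the randomness of the algorithm and the rewards, while in the Bayesian case it also reflects the variability of the problems drawn from the prior.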

The file is divided into three blocks. The xp_type parameter selects which block to run. Several examples are provided in each block.

Code Structure

Bandit algorithms

Our implementation of the multi-armed bandit problem is centered on the MAB.py file. The bandit is initialized using the arms.py file, which defines objects representing the arms and their properties (mean, how the rewards are sampled, etc.).
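
As an illustration of what an arm object can look like, here is a minimal sketch. The class names ToyBernoulliArm and ToyGaussianArm and their attributes are hypothetical and are not taken from arms.py; they only show the idea of an object that exposes a mean and a way to sample rewards.

```python
import numpy as np

class ToyBernoulliArm:
    """Arm with rewards in {0, 1}; the mean equals the success probability."""

    def __init__(self, p, rng=None):
        self.p = p
        self.mean = p
        self.rng = np.random.default_rng() if rng is None else rng

    def sample(self):
        # Reward is 1.0 with probability p and 0.0 otherwise.
        return float(self.rng.random() < self.p)

class ToyGaussianArm:
    """Arm with normally distributed rewards."""

    def __init__(self, mu, sigma, rng=None):
        self.mu = mu
        self.sigma = sigma
        self.mean = mu
        self.rng = np.random.default_rng() if rng is None else rng

    def sample(self):
        return float(self.rng.normal(self.mu, self.sigma))

rng = np.random.default_rng(0)
arms = [ToyBernoulliArm(0.3, rng), ToyGaussianArm(1.0, 0.5, rng)]
rewards = [arm.sample() for arm in arms]
best_mean = max(arm.mean for arm in arms)
```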

The GenericMAB object is designed as a parent class for any bandit model. Algorithms that do not need to be calibrated for specific distributions are implemented directly in this class. The function MC_regret runs a single bandit algorithm for a given number of runs and time horizon and returns the regret.
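
The idea of a Monte Carlo regret estimate can be sketched as follows. This is not the MC_regret function from MAB.py, whose signature is not shown here: the functions ucb1_run and mc_regret below are hypothetical, and the sketch assumes Bernoulli rewards and the standard UCB1 index as the algorithm under test.

```python
import numpy as np

def ucb1_run(means, horizon, rng):
    """Run UCB1 once on Bernoulli arms; return the cumulative regret curve."""
    k = len(means)
    counts = np.zeros(k)
    sums = np.zeros(k)
    regret = np.zeros(horizon)
    best = means.max()
    total = 0.0
    for t in range(horizon):
        if t < k:
            arm = t  # pull each arm once to initialize the statistics
        else:
            index = sums / counts + np.sqrt(2.0 * np.log(t + 1) / counts)
            arm = int(np.argmax(index))
        reward = float(rng.random() < means[arm])
        counts[arm] += 1
        sums[arm] += reward
        total += best - means[arm]  # pseudo-regret uses the true means
        regret[t] = total
    return regret

def mc_regret(means, n_runs, horizon, seed=0):
    """Average the cumulative regret curve over n_runs independent runs."""
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    runs = np.stack([ucb1_run(means, horizon, rng) for _ in range(n_runs)])
    return runs.mean(axis=0)

curve = mc_regret([0.1, 0.9], n_runs=10, horizon=200)
```

The returned array holds one averaged regret value per time step, so it can be plotted directly against the horizon.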

The classes in BernoulliMAB.py, GaussianMAB.py, ExponentialMAB.py and TruncatedGaussianMAB.py inherit from GenericMAB and adapt it to the Bernoulli, Gaussian, Exponential and Truncated Gaussian distributions. In particular, they contain the algorithms that are specific to the distribution family of the arms, as well as optimized versions of algorithms that are already in GenericMAB (for instance in BernoulliMAB).
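
The inheritance pattern described above can be sketched as follows. The classes ToyGenericMAB and ToyBernoulliMAB are hypothetical stand-ins, not the repository's classes. As an example of a distribution-specific algorithm, the subclass adds Thompson Sampling with a Beta(1, 1) prior, which is a standard choice for Bernoulli rewards and an assumption of this sketch.

```python
import numpy as np

class ToyGenericMAB:
    """Parent class: holds what is common to every bandit model."""

    def __init__(self, means, seed=0):
        self.means = np.asarray(means, dtype=float)
        self.n_arms = len(self.means)
        self.rng = np.random.default_rng(seed)

    def sample(self, arm):
        raise NotImplementedError("defined by distribution-specific subclasses")

    def regret(self, pulls):
        """Cumulative pseudo-regret of a sequence of pulled arms."""
        pulls = np.asarray(pulls, dtype=int)
        return float(np.sum(self.means.max() - self.means[pulls]))

    def random_play(self, horizon):
        """Distribution-free baseline: pull arms uniformly at random."""
        pulls = []
        for _ in range(horizon):
            arm = int(self.rng.integers(self.n_arms))
            self.sample(arm)
            pulls.append(arm)
        return pulls

class ToyBernoulliMAB(ToyGenericMAB):
    """Child class: adds Bernoulli sampling and a Bernoulli-specific algorithm."""

    def sample(self, arm):
        return float(self.rng.random() < self.means[arm])

    def thompson_sampling(self, horizon):
        successes = np.zeros(self.n_arms)
        failures = np.zeros(self.n_arms)
        pulls = []
        for _ in range(horizon):
            # Draw one sample per arm from its Beta posterior, play the largest.
            theta = self.rng.beta(successes + 1.0, failures + 1.0)
            arm = int(np.argmax(theta))
            reward = self.sample(arm)
            successes[arm] += reward
            failures[arm] += 1.0 - reward
            pulls.append(arm)
        return pulls

bandit = ToyBernoulliMAB([0.2, 0.9])
ts_pulls = bandit.thompson_sampling(300)
ts_regret = bandit.regret(ts_pulls)
random_pulls = bandit.random_play(300)
```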

Helpers

The Tracker2 object defined in tracker.py is used in all of our bandit algorithms to store experiment data during the runs. In particular, it can store the number of pulls, the cumulative regret and the reward history of each arm.
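
A tracker of this kind can be sketched as follows. The class ToyTracker, its attribute names and its update method are hypothetical and do not reproduce the interface of Tracker2. The sketch only shows the bookkeeping an algorithm would delegate to such an object.

```python
import numpy as np

class ToyTracker:
    """Stores per-arm statistics and the cumulative regret of one run."""

    def __init__(self, means, horizon):
        self.means = np.asarray(means, dtype=float)
        self.n_pulls = np.zeros(len(self.means), dtype=int)
        self.cum_rewards = np.zeros(len(self.means))
        self.reward_history = [[] for _ in self.means]
        self.regret = np.zeros(horizon)
        self.t = 0  # number of rounds recorded so far

    def update(self, arm, reward):
        """Record that `arm` was pulled and yielded `reward`."""
        self.n_pulls[arm] += 1
        self.cum_rewards[arm] += reward
        self.reward_history[arm].append(reward)
        gap = self.means.max() - self.means[arm]
        previous = self.regret[self.t - 1] if self.t > 0 else 0.0
        self.regret[self.t] = previous + gap
        self.t += 1

tracker = ToyTracker(means=[0.5, 1.0], horizon=3)
tracker.update(0, 1.0)
tracker.update(1, 0.0)
tracker.update(0, 0.0)
```

After these three updates, the tracker has recorded two pulls of the first arm and one pull of the second.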

utils.py contains several functions that are useful in the bandit algorithms. Some of these functions use the numba package for faster computation.

Finally, xp_helpers.py provides functions for running large-scale experiments in the frequentist and Bayesian settings. Some of these functions use libraries that enable multiprocessing for parallel computation.

