High-Dimensional Bayesian Optimization via Tree-Structured Additive Models (AAAI 2021)

Authors:

  1. Eric Han
  2. Ishank Arora
  3. Jonathan Scarlett

This repository accompanies the AAAI 2021 publication; an arXiv pre-print is also available. Cite using our AAAI 2021 article:

@article{Han_Arora_Scarlett_2021,
  title={High-Dimensional Bayesian Optimization via Tree-Structured Additive Models},
  volume={35},
  url={https://ojs.aaai.org/index.php/AAAI/article/view/16933},
  abstractNote={Bayesian Optimization (BO) has shown significant success in tackling expensive low-dimensional black-box optimization problems. Many optimization problems of interest are high-dimensional, and scaling BO to such settings remains an important challenge. In this paper, we consider generalized additive models in which low-dimensional functions with overlapping subsets of variables are composed to model a high-dimensional target function. Our goal is to lower the computational resources required and facilitate faster model learning by reducing the model complexity while retaining the sample-efficiency of existing methods. Specifically, we constrain the underlying dependency graphs to tree structures in order to facilitate both the structure learning and optimization of the acquisition function. For the former, we propose a hybrid graph learning algorithm based on Gibbs sampling and mutation. In addition, we propose a novel zooming-based algorithm that permits generalized additive models to be employed more efficiently in the case of continuous domains. We demonstrate and discuss the efficacy of our approach via a range of experiments on synthetic functions and real-world datasets.},
  number={9},
  journal={Proceedings of the AAAI Conference on Artificial Intelligence},
  author={Eric Han and Ishank Arora and Jonathan Scarlett},
  year={2021},
  month={May},
  pages={7630-7638}
}
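
In brief, the paper models the high-dimensional target as a generalized additive model whose dependency graph is constrained to a tree, so that each low-dimensional component depends on an overlapping pair of variables. Schematically (our notation, not the paper's exact formulation):

$$f(\mathbf{x}) = \sum_{(i,j) \in E(T)} f_{ij}(x_i, x_j)$$

where $T$ is a tree over the input variables. The tree constraint is what makes both the graph learning (via Gibbs sampling and mutation) and the optimization of the acquisition function tractable.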

Acknowledgements

  1. The code in this repository is derived from the code base of High-Dimensional Bayesian Optimization via Additive Models with Overlapping Groups, supplied by Paul Rolland.
  2. The code included in hdbo/febo is taken from LineBO. The paper accompanying the code is Adaptive and Safe Bayesian Optimization in High Dimensions via One-Dimensional Subspaces.
  3. The code included in hdbo/boattack is taken from BayesOpt_Attack. The paper accompanying the code is BayesOpt Adversarial Attack.
  4. The NAS-Bench-101 datasets included in data/fcnet are taken from nas_benchmarks. The paper accompanying the code is NAS-Bench-101: Towards Reproducible Neural Architecture Search.
  5. The lpsolve datasets included in data/mps are taken from the benchmark set in MIPLIB 2017.

Repository Structure

The code repository is structured as follows:

  • data: Folder containing the data files from NAS-Bench-101 and lpsolve.
  • hdbo: Folder containing our code.
  • sample: Folder containing sample configurations of the experiments, listed below; note that we ran many configurations with different random seeds, algorithms, and datasets (see the sketch after this list).
    • default.yml: Sample experiment to check if your environment is installed correctly.
    • syn-add-d.yml: Sample synthetic additive (discrete) experiment
    • syn-add-c.yml: Sample synthetic additive (continuous) experiment
    • syn-fn.yml: Sample synthetic function experiment
    • nas.yml: Sample NAS-Bench-101 experiment
    • lp.yml: Sample lpsolve experiment
    • ba-addgp.yml: Sample BayesOpt Attack experiment
  • test: Folder related to some of our unit tests
  • MLproject: MLflow project file; we use MLflow to organize our experiments.
  • README.md: This readme.
  • hdbo.yml: Conda environment file used internally by the MLflow project.
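
For instance, you can inspect one of the sample configurations before launching it. A minimal sketch, assuming PyYAML is installed; the exact schema of the keys is defined by the code in hdbo and is not assumed here:

```python
import yaml  # PyYAML

# Pretty-print a sample experiment configuration. The exact keys
# (algorithm, dataset, seeds, ...) are defined by the code in hdbo/;
# we only load and display whatever the file contains.
with open("sample/default.yml") as f:
    config = yaml.safe_load(f)

for key, value in config.items():
    print(f"{key}: {value}")
```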

Setup

We implemented all algorithms in Python 3.8.3. Python environments are managed using Conda, and experiments are managed using MLflow, which allows convenient organization of experiments. We have included code and data from external sources in our repository for ease of setup.

Minimum System requirements:

  • Linux 5.4.12-100
  • 5GB of space (due to the NAS-Bench-101 and lpsolve datasets; only about 100MB otherwise)

Prepare your environment:

  1. Run data/setup.sh; it downloads the data from NAS Benchmarks into your home directory. You may skip this step if you are not running the lpsolve or NAS-Bench-101 datasets.
  2. Install Conda.
  3. Install MLflow, either on your system or in the base environment of Conda: pip install mlflow (a quick sanity check follows this list).
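
A minimal sketch to confirm that MLflow is importable from the environment you will launch experiments from:

```python
# Confirm that MLflow is importable and print the installed version.
import mlflow

print(mlflow.__version__)
```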

Running

Check your installation by running mlflow run . in the base directory; this runs the test configuration default.yml. You should see output similar to the following:

2020/02/06 15:16:12 INFO mlflow.projects: === Created directory /tmp/tmpya1pguie for downloading remote URIs passed to arguments of type 'path' ===
2020/02/06 15:16:12 INFO mlflow.projects: === Running command 'source activate mlflow-69104b8ea3ca24a4e45a47bc7581b825cfe12300 && LOGGING_TYPE=local python hdbo/main.py /home/X/Workspace/HD-BO-Additive-Models/sample/default.yml' in run with ID '9ca30276ec574aefbae92a2befc53d60' === 
06/02/2020 15:16:13 [INFO    ] [main.py:22] Platform: uname_result(system='Linux', node='X', release='5.4.12-100.fc30.x86_64', version='#1 SMP Wed Jan 15 00:38:53 UTC 2020', machine='x86_64', processor='x86_64')
06/02/2020 15:16:13 [INFO    ] [main.py:23] Processor: x86_64
06/02/2020 15:16:13 [INFO    ] [main.py:24] Python: 3.6.8/CPython
06/02/2020 15:16:13 [INFO    ] [main.py:27] Blas Library: ['blas', 'cblas', 'lapack', 'blas', 'cblas', 'lapack', 'blas', 'cblas', 'lapack']
06/02/2020 15:16:13 [INFO    ] [main.py:28] Lapack Library: ['blas', 'cblas', 'lapack', 'blas', 'cblas', 'lapack', 'blas', 'cblas', 'lapack']
06/02/2020 15:16:13 [INFO    ] [main.py:31] temp_dir: /tmp/tmph4iceyxt
06/02/2020 15:16:13 [INFO    ] [main.py:32] Host Name: X
06/02/2020 15:16:13 [INFO    ] [datasets.py:44] Load loader[Synthetic].
06/02/2020 15:16:13 [INFO    ] [datasets.py:70] Using synthetic dataset loader with graph_type[StarGraph].
06/02/2020 15:16:13 [INFO    ] [datasets.py:493] Graph Edges: [(0, 1), (0, 2), (0, 3)]
06/02/2020 15:16:13 [INFO    ] [datasets.py:321] Loading pre-computed function at /home/X/cache/2525f51218fc4e2827eab0aec1776a94abf895ca.pkl.
06/02/2020 15:16:13 [INFO    ] [datasets.py:325] Checking consistency of pre-compute.
06/02/2020 15:16:13 [INFO    ] [main.py:71] f_min = -5.562785863192289
06/02/2020 15:16:13 [INFO    ] [main.py:72] x_best = [[0.56 0.   0.16 0.74]]
06/02/2020 15:16:13 [INFO    ] [main.py:73] cost = 7803
06/02/2020 15:16:13 [INFO    ] [algorithms.py:41] Load algorithm loader[Algorithm].
06/02/2020 15:16:13 [INFO    ] [algorithms.py:54] Using algorithm with algorithm_id[Tree].
06/02/2020 15:16:13 [INFO    ] [gp.py:49] initializing Y
06/02/2020 15:16:13 [INFO    ] [gp.py:98] initializing inference method
06/02/2020 15:16:13 [INFO    ] [gp.py:107] adding kernel and likelihood as parameters
06/02/2020 15:16:17 [INFO    ] [function_optimizer.py:481] Running Tree
06/02/2020 15:16:18 [INFO    ] [function_optimizer.py:59] New graph : [(1, 3)]
06/02/2020 15:16:22 [INFO    ] [function_optimizer.py:481] Running Tree
06/02/2020 15:16:23 [INFO    ] [function_optimizer.py:59] New graph : [(0, 1), (0, 2), (0, 3)]
06/02/2020 15:16:29 [INFO    ] [function_optimizer.py:481] Running Tree
06/02/2020 15:16:30 [INFO    ] [function_optimizer.py:59] New graph : [(0, 1), (0, 2), (0, 3)]
06/02/2020 15:16:36 [INFO    ] [function_optimizer.py:481] Running Tree
06/02/2020 15:16:36 [INFO    ] [function_optimizer.py:59] New graph : [(0, 1), (0, 2), (0, 3)]
06/02/2020 15:16:43 [INFO    ] [function_optimizer.py:481] Running Tree
06/02/2020 15:16:43 [INFO    ] [function_optimizer.py:59] New graph : [(0, 1), (0, 2), (0, 3)]
06/02/2020 15:16:50 [INFO    ] [function_optimizer.py:481] Running Tree
06/02/2020 15:16:51 [INFO    ] [function_optimizer.py:59] New graph : [(0, 1), (0, 2), (0, 3)]
2020/02/06 15:16:57 INFO mlflow.projects: === Run (ID '9ca30276ec574aefbae92a2befc53d60') succeeded ===

You may run the other configurations by passing the configuration file as a parameter; for example, to run the synthetic additive (continuous) experiment:

mlflow run . -P param_file=sample/syn-add-c.yml

To visualize the experiments, you can run mlflow ui and click on an experiment to view its metrics.
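
The same tracking data can also be read programmatically. A sketch, assuming a recent MLflow release (MlflowClient.search_experiments is not available in older versions, which expose list_experiments instead):

```python
from mlflow.tracking import MlflowClient

# Programmatic alternative to the mlflow ui dashboard: list every
# experiment in the local tracking store, along with its runs and
# their recorded metrics.
client = MlflowClient()
for exp in client.search_experiments():
    for run in client.search_runs([exp.experiment_id]):
        print(exp.name, run.info.run_id, run.data.metrics)
```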
