xgboost-tuner
xgboost-tuner is a Python library for automating the tuning of XGBoost parameters.

Due to XGBoost's large number of parameters and the size of their possible parameter spaces, an ordinary grid search over all of them isn't computationally feasible.

The excellent article Complete Guide to Parameter Tuning in XGBoost offers an alternative approach: tuning the parameters incrementally rather than all at once.

This library offers two strategies to automate this tuning: an incremental grid search as laid out in the article above, and a more computationally efficient randomized search.

In both strategies, the user can configure the parameter space of interest through keyword arguments, as shown in the examples below.

Installing xgboost-tuner

PyPI

To install xgboost-tuner, execute

pip install xgboost-tuner  

Alternatively, you can download the package manually from the Python Package Index at https://pypi.python.org/pypi/xgboost-tuner, unzip it, navigate into the package directory, and run:

python setup.py install
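
Either way, a quick sanity check is to make sure the tuner's entry point imports cleanly (this is the same import used in the examples below):

from xgboost_tuner.tuner import tune_xgb_params

print(tune_xgb_params)  # prints the function object if the installation succeeded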

Examples

Tuning XGBoost parameters through an incremental grid search

from sklearn.datasets import load_svmlight_file
from xgboost_tuner.tuner import tune_xgb_params

# Load the agaricus dataset in libsvm format and densify the sparse feature matrix
train, label = load_svmlight_file('data/agaricus.txt.train')
train = train.toarray()

# Tune the parameters incrementally and limit the range for colsample_bytree and subsample
best_params, history = tune_xgb_params(
    cv_folds=3,
    label=label,
    metric_sklearn='accuracy',
    metric_xgb='error',
    n_jobs=4,
    objective='binary:logistic',
    random_state=2017,
    strategy='incremental',
    train=train,
    colsample_bytree_min=0.8,
    colsample_bytree_max=1.0,
    subsample_min=0.8,
    subsample_max=1.0
)
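
Once tuning completes, the returned best_params dictionary can be passed to an XGBoost estimator to fit a final model. This is a minimal sketch using XGBClassifier from the xgboost package itself (not part of this library), and it assumes best_params contains only standard XGBoost keyword arguments:

from xgboost import XGBClassifier

# Fit a final model on the full training set with the tuned parameters
model = XGBClassifier(**best_params)
model.fit(train, label)
predictions = model.predict(train)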

Tuning XGBoost parameters through randomized search

from sklearn.datasets import load_svmlight_file
from xgboost_tuner.tuner import tune_xgb_params

train, label = load_svmlight_file('data/agaricus.txt.train')
train = train.toarray()

# Tune the parameters in a randomized fashion and control the distributions for colsample_bytree and subsample
best_params, history = tune_xgb_params(
    cv_folds=3,
    label=label,
    metric_sklearn='accuracy',
    metric_xgb='error',
    n_jobs=4,
    objective='binary:logistic',
    random_state=2017,
    strategy='randomized',
    train=train,
    colsample_bytree_loc=0.5,
    colsample_bytree_scale=0.2,
    subsample_loc=0.5,
    subsample_scale=0.2
)
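
The _loc and _scale suffixes follow the scipy.stats convention for parameterizing distributions: with a uniform distribution, loc=0.5 and scale=0.2 means values are drawn from the interval [0.5, 0.7]. (The uniform distribution is an assumption here; check the tuner's source for the exact distributions used.) A small illustration of the convention:

from scipy.stats import uniform

# scipy.stats.uniform(loc, scale) samples from [loc, loc + scale]
dist = uniform(loc=0.5, scale=0.2)
print(dist.rvs(size=5, random_state=2017))  # all values fall within [0.5, 0.7]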
