Python GLMNET

glmnet: Lasso and Elastic-Net Regularized Generalized Linear Models.

This is a Python wrapper for the fortran library used in the R package glmnet. While the library includes linear, logistic, Cox, Poisson, and multiple-response Gaussian, only linear and logistic are implemented in this package.

The API follows the conventions of Scikit-Learn, so it is expected to work with tools from that ecosystem.

Installation

glmnet depends on numpy, scikit-learn and scipy. A working Fortran compiler is also required to build the package, for Mac users, brew install gcc will take care of this requirement.

git clone git@github.com:civisanalytics/python-glmnet.git
cd python-glmnet
python setup.py install

Alternatively, one can install the zipped project directly from GitHub.

pip3 install https://github.com/HIVERY/python-glmnet/archive/master.zip

Usage

General

By default, LogitNet and ElasticNet fit a series of models using the lasso penalty (α = 1) and up to 100 values for λ (determined by the algorithm). In addition, after computing the path of λ values, performance metrics for each value of λ are computed using 3-fold cross validation. The value of λ corresponding to the best performing model is saved as the lambda_max_ attribute and the largest value of λ such that the model performance is within cut_point * standard_error of the best scoring model is saved as the lambda_best_ attribute.

The predict and predict_proba methods accept an optional parameter lamb which is used to select which model(s) will be used to make predictions. If lamb is omitted, lambda_best_ is used.

Both models will accept dense or sparse arrays.

Regularized Logistic Regression

from glmnet import LogitNet

m = LogitNet()
m = m.fit(x, y)

Prediction is similar to Scikit-Learn:

# predict labels
p = m.predict(x)
# or probability estimates
p = m.predict_proba(x)

Regularized Linear Regression

from glmnet import ElasticNet

m = ElasticNet()
m = m.fit(x, y)

Here are some of the features available in glmnet.

import numpy as np
m = ElasticNet(alpha=0.01)  # set the alpha values
m = ElasticNet(fit_intercept=False)  # disable the intercept fit
m = ElasticNet(lambda_path=np.exp(np.linspace(-1, -4, num=20))  # set a decreasing sequence of lambda values for glmnet to use
m = ElasticNet(lower_limits=np.array([0, -np.inf, 0]), upper_limits=np.array([np.inf, 3, np.inf]))  # set limits on the coefficients
m = ElasticNet(n_splits=30)  # set the number of cross-validation splits - note that CV is enabled by default.

Once the model is defined, we can set weights for each observation in the fit method.

m.fit(X=x, y=y, sample_weights=np.array([1, 3, 1])  # the weights get standardised internally

Predict:

p = m.predict(x)

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
LICENSES		LICENSES
glmnet		glmnet
.dockerignore		.dockerignore
.gitattributes		.gitattributes
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
README.md		README.md
circle.yml		circle.yml
requirements.txt		requirements.txt
setup.cfg		setup.cfg
setup.py		setup.py
versioneer.py		versioneer.py

License

HIVERY/python-glmnet

Folders and files

Latest commit

History

Repository files navigation

Python GLMNET

Installation

Usage

General

Regularized Logistic Regression

Regularized Linear Regression

About

Resources

License

Code of conduct

Stars

Watchers

Forks

Languages