scikit-learn-compatible estimators from Civis Analytics
Installation with pip is recommended:
$ pip install civisml-extensions
For development, a few additional dependencies are needed:
$ pip install -r dev-requirements.txt
This package contains scikit-learn-compatible estimators for stacking (StackedClassifier, StackedRegressor), non-negative linear regression (NonNegativeLinearRegression), preprocessing pandas DataFrames (DataFrameETL), and cross-validating hyperparameters with Hyperband (HyperbandSearchCV).
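For readers new to Hyperband, the sketch below computes the bracket schedule from the published Hyperband algorithm: each bracket starts with many configurations on a small budget and repeatedly keeps roughly the best 1/eta of them on eta times the budget. This is only an illustration of the general algorithm, not the code inside HyperbandSearchCV; the function name hyperband_brackets and its parameters max_resource and eta are hypothetical and are not part of this package.

```python
def hyperband_brackets(max_resource, eta=3):
    """Return the Hyperband schedule as a list of brackets.

    Each bracket is a list of (n_configs, resource_per_config) rounds.
    Hypothetical helper for illustration; not part of civismlext.
    """
    if max_resource < 1 or eta < 2:
        raise ValueError("need max_resource >= 1 and eta >= 2")

    # Largest s such that eta**s <= max_resource, found with integer
    # arithmetic to avoid floating-point error in log().
    s_max = 0
    while eta ** (s_max + 1) <= max_resource:
        s_max += 1

    brackets = []
    for s in range(s_max, -1, -1):
        # Initial number of configurations: ceil((s_max + 1) * eta**s / (s + 1))
        numerator = (s_max + 1) * eta ** s
        n = -(-numerator // (s + 1))  # integer ceiling division
        # Initial resource given to each configuration in this bracket
        r = max_resource / eta ** s
        # Successive halving: shrink the pool by eta, grow the budget by eta
        rounds = [(n // eta ** i, r * eta ** i) for i in range(s + 1)]
        brackets.append(rounds)
    return brackets


if __name__ == "__main__":
    for bracket in hyperband_brackets(81, eta=3):
        print(bracket)
```

The most aggressive bracket comes first and the last bracket is a plain random search at full budget, which is how Hyperband hedges between early stopping and full training.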
Usage of these estimators follows the standard sklearn conventions. Here is an example of using the StackedClassifier:
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.ensemble import RandomForestClassifier
>>> from civismlext.stacking import StackedClassifier
>>> # Note that the final estimator 'metalr' is the meta-estimator
>>> estlist = [('rf', RandomForestClassifier()),
...            ('lr', LogisticRegression()),
...            ('metalr', LogisticRegression())]
>>> mysm = StackedClassifier(estlist)
>>> # Set some parameters, if you didn't set them at instantiation
>>> mysm.set_params(rf__random_state=7, lr__random_state=8,
...                 metalr__random_state=9, metalr__C=10**7)
>>> # Fit
>>> mysm.fit(Xtrain, ytrain)
>>> # Predict!
>>> ypred = mysm.predict_proba(Xtest)
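The example above assumes that Xtrain, ytrain, and Xtest already exist. One way to create them for a quick trial is sketched below using synthetic data from scikit-learn. The sample count, feature count, split ratio, and random seeds are arbitrary choices for illustration and do not come from this package.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic binary classification problem (sizes chosen arbitrarily)
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Hold out 25% of the rows for testing
Xtrain, Xtest, ytrain, ytest = train_test_split(
    X, y, test_size=0.25, random_state=0)
```

Run this before the fitting step so that the names used in the example are defined.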
See the docstrings of the individual estimators for more information.
See CONTRIBUTING.md for information about contributing to this project.
BSD-3
See LICENSE.md for details.