a sandbox for studying different methods of coding categorical variables

codeaudit/categorical_encoding

 
 

Categorical Encoding Methods

A set of example problems that examine different encoding methods for categorical variables in classification tasks. Optionally, install the library of encoders as a package and use them directly in your projects. All encoders are available as methods or as scikit-learn compatible transformers.
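To illustrate what the transformer form means in practice, here is a minimal sketch of an encoder that follows the fit/transform convention. The class name `OrdinalSketchEncoder`, its `unknown_value` parameter, and the rule of mapping unseen categories to -1 are all assumptions made for this example and are not taken from the library. The sketch only mirrors the calling convention; a real scikit-learn transformer would typically also subclass `BaseEstimator` and `TransformerMixin` from `sklearn.base`.

```python
class OrdinalSketchEncoder:
    """Hypothetical encoder that mimics the fit/transform convention."""

    def __init__(self, unknown_value=-1):
        # Assumption: categories not seen during fit map to this value.
        self.unknown_value = unknown_value

    def fit(self, X, y=None):
        # X is a list of rows; learn one category -> integer map per column,
        # numbering categories in order of first appearance.
        n_cols = len(X[0]) if X else 0
        self.mappings_ = [{} for _ in range(n_cols)]
        for row in X:
            for mapping, value in zip(self.mappings_, row):
                mapping.setdefault(value, len(mapping))
        return self

    def transform(self, X):
        # Replace every value with the integer learned for its column.
        return [
            [m.get(v, self.unknown_value) for m, v in zip(self.mappings_, row)]
            for row in X
        ]

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)


train = [["low", "2"], ["high", "4"], ["med", "2"]]
test = [["high", "2"], ["vhigh", "4"]]

encoder = OrdinalSketchEncoder().fit(train)
print(encoder.transform(train))
print(encoder.transform(test))
```

Keeping the learned mappings on the fitted object is what lets the same encoding be applied to training and test data.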

Encoding Methods

  • Ordinal
  • One-Hot
  • Binary
  • Helmert Contrast
  • Sum Contrast
  • Polynomial Contrast
  • Backward Difference Contrast
  • Simple Hashing
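As a rough illustration of how several of the methods listed above differ, the following pure-Python sketch encodes a single column in five ways. The function names, the first-appearance ordering of categories, the choice of the last category as the all -1 row in sum coding, and the MD5-based bucket scheme for hashing are assumptions made for this example. They are not necessarily what the library's encoders do.

```python
import hashlib


def ordinal_encode(values):
    """Map each category to an integer, in order of first appearance."""
    mapping = {}
    for v in values:
        mapping.setdefault(v, len(mapping))
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """One indicator column per category."""
    codes, mapping = ordinal_encode(values)
    width = len(mapping)
    return [[1 if i == c else 0 for i in range(width)] for c in codes]


def binary_encode(values):
    """Write each ordinal code in base 2, one column per bit."""
    codes, mapping = ordinal_encode(values)
    n_bits = max(1, (len(mapping) - 1).bit_length())
    return [[(c >> b) & 1 for b in reversed(range(n_bits))] for c in codes]


def sum_contrast_encode(values):
    """Sum (deviation) coding: k - 1 columns, last category is all -1."""
    codes, mapping = ordinal_encode(values)
    k = len(mapping)
    rows = []
    for c in codes:
        if c == k - 1:
            rows.append([-1] * (k - 1))
        else:
            rows.append([1 if i == c else 0 for i in range(k - 1)])
    return rows


def hash_encode(values, n_buckets=4):
    """Hash each category into one of n_buckets indicator columns."""
    rows = []
    for v in values:
        digest = hashlib.md5(str(v).encode("utf-8")).hexdigest()
        bucket = int(digest, 16) % n_buckets
        rows.append([1 if i == bucket else 0 for i in range(n_buckets)])
    return rows


colors = ["red", "green", "blue", "green"]
print(ordinal_encode(colors)[0])    # [0, 1, 2, 1]
print(one_hot_encode(colors))       # [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
print(binary_encode(colors))        # [[0, 0], [0, 1], [1, 0], [0, 1]]
print(sum_contrast_encode(colors))  # [[1, 0], [0, 1], [-1, -1], [0, 1]]
```

The column count is the main practical difference: for k categories, one-hot uses k columns, sum contrast uses k - 1, binary uses roughly log2(k), and hashing uses a fixed number regardless of k, at the cost of possible collisions.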

Usage

Either run the examples in encoding_examples.py, or install the package with:

pip install git+https://github.com/wdm0006/categorical_encoding.git

Datasets

The datasets used in these examples are the car, mushroom, and splice datasets from the UCI dataset repository, found here:

datasets

License

BSD
