A set of example problems comparing different encoding methods for categorical variables in classification tasks. Optionally, install the library of encoders as a package and use them directly in your projects. They are all available as methods or as scikit-learn compatible transformers. The encoders covered are:
- Ordinal
- One-Hot
- Binary
- Helmert Contrast
- Sum Contrast
- Polynomial Contrast
- Backward Difference Contrast
- Simple Hashing
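To give a feel for how a few of these encodings differ, here is a minimal pure-Python sketch of ordinal, one-hot, binary, and simple hashing encoding for a single column. The function names (`ordinal_encode`, `one_hot_encode`, `binary_encode`, `hashing_encode`) and the `n_components` parameter are hypothetical and chosen for illustration only; they are not this library's API, and the library's own encoders may order categories or bits differently.

```python
import hashlib


def ordinal_encode(values):
    """Map each distinct category to an integer, in order of first appearance."""
    mapping = {}
    for v in values:
        if v not in mapping:
            mapping[v] = len(mapping)
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """One column per category; exactly one 1 per row."""
    codes, mapping = ordinal_encode(values)
    width = len(mapping)
    return [[1 if i == c else 0 for i in range(width)] for c in codes]


def binary_encode(values):
    """Write each ordinal code in base 2, most significant bit first."""
    codes, mapping = ordinal_encode(values)
    # Number of bits needed to represent the largest ordinal code.
    width = max(1, (len(mapping) - 1).bit_length())
    return [[(c >> bit) & 1 for bit in reversed(range(width))] for c in codes]


def hashing_encode(values, n_components=8):
    """Hash each category into one of n_components buckets (assumed MD5 here)."""
    rows = []
    for v in values:
        digest = hashlib.md5(str(v).encode("utf-8")).hexdigest()
        row = [0] * n_components
        row[int(digest, 16) % n_components] = 1
        rows.append(row)
    return rows


colors = ["red", "green", "blue", "green"]
print(ordinal_encode(colors)[0])  # [0, 1, 2, 1]
print(one_hot_encode(colors))     # [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
print(binary_encode(colors))      # [[0, 0], [0, 1], [1, 0], [0, 1]]
print(hashing_encode(colors, n_components=4))
```

Note the trade-off in width: one-hot needs one column per category, binary needs only about log2 of that, and hashing uses a fixed number of columns at the cost of possible collisions between categories.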
Either run the examples in encoding_examples.py, or install the package with:
pip install git+https://github.com/wdm0006/categorical_encoding.git
The datasets used in these examples are the car, mushroom, and splice datasets from the UCI dataset repository, found here:
BSD