"Calibration" is the process of turning e.g. machine learning predictions into probabilities. This is particularly useful whenever the predictions are used downstream to estimate expected values and similar.
There are a number of calibration methods. This repository contains a handful that I have decided to implement myself, both to understand how and why they actually work and to make some comparisons between them.
To be added:
Isotonic regression - It turns out the version provided in sklearn has a slight bug. It is not a fatal bug, as it does not affect results to a large degree, but it is still a bug. A minimal usage example follows after this list.
Platt scaling - Basically logistic regression with no bias parameter and no regularization. One situation where a Platt-scaling function separate from the one integrated into most SVM libraries is useful is when you change the sampling rate; in such cases, the built-in version would give wildly misleading results. Also, note that this is not an efficient implementation of Platt scaling. A sketch follows after this list.
ENIR - Ensemble of Near-Isotonic Regression (Naeini et al.). The version provided by the original authors is Matlab code, hence a Python version is needed. Also, the Matlab version sometimes, although rarely, produces probabilities outside of [0, 1]. My version does not include the priors used by the original authors, and hence produces slightly less well-calibrated probabilities; maybe I will fix this one day. This might just be the best calibration method out there. The near-isotonic objective is spelled out after this list.
BBQ - Bayesian Binning by Quantiles. Maybe I won't implement this, as it contains a rather problematic flaw: it turns out that with highly imbalanced datasets, the highest-scoring samples are assigned unreasonably low probabilities.
BEIR - Ensemble of isotonic regression with model weighting. This actually produces slightly better results than plain isotonic regression, but does not reach results as good as ENIR's.
**** - Isotonic regression with quality guarantees. The bins in isotonic regression should actually be characterized by the beta distribution, hence quality guarantees can be assigned to bins. Whenever the quality does not reach some predefined level (i.e. the credible interval is not narrow enough), a bin is merged with its neighboring bin to produce a more peaked beta distribution and hence better quality guarantees; see the sketch at the end of this list. The code for this one is a mess for the moment, though...
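For reference, here is a minimal example of isotonic-regression calibration using the sklearn version mentioned above; the toy scores and labels are made up for illustration:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Made-up model scores and binary labels, ordered by score.
scores = np.array([0.1, 0.3, 0.35, 0.6, 0.8, 0.9])
labels = np.array([0, 0, 1, 0, 1, 1])

# y_min/y_max pin the output to valid probabilities, and
# out_of_bounds="clip" keeps predictions for unseen scores in range.
calibrator = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
calibrator.fit(scores, labels)
probabilities = calibrator.predict(scores)
```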
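Below is a minimal sketch of the Platt-scaling variant described above (a single slope parameter, no bias, no regularization). The function name platt_scale and its signature are my own for illustration, not this repository's API:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def platt_scale(scores, labels):
    """Fit sigmoid(a * score) to binary labels by maximum likelihood.

    No bias term and no regularization, per the variant described
    above. Returns a function mapping scores to probabilities.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)

    def negative_log_likelihood(a):
        p = 1.0 / (1.0 + np.exp(-a * scores))
        eps = 1e-12  # guard against log(0)
        return -np.sum(labels * np.log(p + eps)
                       + (1.0 - labels) * np.log(1.0 - p + eps))

    # The negative log-likelihood is convex in the single parameter a,
    # so a scalar minimizer suffices.
    a = minimize_scalar(negative_log_likelihood).x
    return lambda s: 1.0 / (1.0 + np.exp(-a * np.asarray(s, dtype=float)))

# Usage: calibrate = platt_scale(train_scores, train_labels),
# then calibrate(new_scores) gives calibrated probabilities.
```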
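For reference, "near-isotonic" means that the hard ordering constraint of isotonic regression is relaxed into a penalty on ordering violations; for a penalty weight lambda, the fit solves (my paraphrase of the standard formulation)

    minimize over b:  1/2 * sum_i (y_i - b_i)^2  +  lambda * sum_i max(b_i - b_(i+1), 0)

Larger lambda values push the solution toward a fully isotonic fit, and ENIR combines the fits obtained along the lambda path into an ensemble.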
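Finally, a rough sketch of the bin-merging idea behind the quality-guarantee method above. Everything here (the function merge_bins, its parameters, and the uniform Beta(1, 1) prior) is an assumption for illustration, not the repository's actual code:

```python
from scipy.stats import beta

def merge_bins(bin_pos, bin_neg, max_width=0.3, level=0.95):
    """Merge neighboring bins until every bin's beta credible
    interval is narrower than max_width.

    bin_pos[i]/bin_neg[i] hold the positive/negative counts of bin i,
    with bins ordered by score. A uniform Beta(1, 1) prior is assumed
    here; the actual method may use a different prior.
    """
    pos, neg = list(bin_pos), list(bin_neg)
    i = 0
    while i < len(pos):
        a, b = pos[i] + 1, neg[i] + 1  # beta posterior parameters
        lo = beta.ppf((1 - level) / 2, a, b)
        hi = beta.ppf(1 - (1 - level) / 2, a, b)
        if hi - lo > max_width and len(pos) > 1:
            # Interval too wide: merge with a neighbor to pool counts,
            # giving a more peaked posterior.
            j = i + 1 if i + 1 < len(pos) else i - 1
            k, m = min(i, j), max(i, j)
            pos[k] += pos[m]
            neg[k] += neg[m]
            del pos[m], neg[m]
            i = k  # re-check the merged bin
        else:
            i += 1
    # Calibrated probability for each remaining bin: posterior mean.
    return [(p + 1) / (p + n + 2) for p, n in zip(pos, neg)]

# Example: counts from five score-ordered bins.
print(merge_bins([1, 2, 5, 8, 9], [9, 8, 5, 2, 1]))
```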