terry07/semisup-learn

 
 

Semi-supervised learning frameworks for Python

This project contains Python implementations of semi-supervised learning methods, compatible with scikit-learn, including:

  • Contrastive Pessimistic Likelihood Estimation (CPLE) (based on, but not equivalent to, Loog, 2015), a "safe" framework applicable to any classifier that can yield prediction probabilities ("safe" meaning that a model trained on both labeled and unlabeled data should perform no worse than a model trained on the labeled data alone)

  • Self learning (self training), a naive semi-supervised learning framework applicable to any classifier: iteratively label the unlabeled instances using a trained classifier, then re-train the classifier on the resulting dataset (see e.g. http://pages.cs.wisc.edu/~jerryzhu/pub/sslicml07.pdf )

  • Semi-Supervised Support Vector Machine (S3VM), a simple scikit-learn compatible wrapper for the QN-S3VM code developed by Fabian Gieseke, Antti Airola, Tapio Pahikkala, and Oliver Kramer (see http://www.fabiangieseke.de/index.php/code/qns3vm )

The first method is a novel extension of Loog, 2015 to any discriminative classifier (the differences from the original CPLE are explained below). The last two methods are included for comparison only.
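As a rough illustration of the self-training idea described above, the loop can be sketched with plain scikit-learn. This is an illustrative toy, not the library's SelfLearningModel; the 0.9 confidence threshold and the use of -1 to mark unlabeled points are assumptions of this sketch:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy data: only the first 50 points keep their labels;
# the rest are marked unlabeled with -1 (a common convention).
X, y = make_classification(n_samples=200, random_state=0)
y_partial = y.copy()
y_partial[50:] = -1

clf = LogisticRegression()
for _ in range(10):
    # Re-train on everything currently labeled (original + pseudo-labels).
    labeled = y_partial != -1
    clf.fit(X[labeled], y_partial[labeled])
    unlabeled_idx = np.where(~labeled)[0]
    if unlabeled_idx.size == 0:
        break
    # Pseudo-label only the confident predictions (threshold is arbitrary).
    proba = clf.predict_proba(X[unlabeled_idx])
    confident = proba.max(axis=1) > 0.9
    if not confident.any():
        break  # nothing confident left to label; stop
    y_partial[unlabeled_idx[confident]] = proba[confident].argmax(axis=1)
```

This makes the naivety explicit: early mistakes get locked in as pseudo-labels, which is exactly the failure mode the pessimistic CPLE formulation tries to guard against.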

The advantages of the CPLE framework over other semi-supervised learning approaches include:

  • it is a generally applicable framework (it works with any scikit-learn classifier that accepts per-sample weights),

  • it has low memory requirements (as opposed to, e.g., Label Spreading, which needs O(n^2) memory), and

  • it makes no assumptions beyond those made by the choice of classifier

The main disadvantage is high computational complexity. Note: this is an early-stage research project and a work in progress (it is by no means efficient or well tested)!
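The per-sample-weight requirement mentioned above can be checked programmatically before wrapping a classifier. The helper below (`supports_sample_weight` is a hypothetical name, not part of this library) inspects whether an estimator's fit() accepts per-sample weights:

```python
import inspect
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

def supports_sample_weight(estimator):
    """Return True if the estimator's fit() accepts per-sample weights."""
    return "sample_weight" in inspect.signature(estimator.fit).parameters

print(supports_sample_weight(SVC()))                   # True: SVC.fit accepts sample_weight
print(supports_sample_weight(KNeighborsClassifier()))  # False: k-NN's fit does not
```

Classifiers failing this check (such as k-nearest neighbours) cannot be used as the base model for CPLE, since the framework works by reweighting individual training samples.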

If you need faster results, try the Self Learning framework, which is a naive approach but much faster:

from frameworks.SelfLearning import SelfLearningModel
from sklearn.svm import SVC  # any scikit-learn classifier works as the base model

basemodel = SVC(probability=True)  # probability estimates are used for self labelling
ssmodel = SelfLearningModel(basemodel)
ssmodel.fit(X, y)  # unlabeled entries of y are marked with -1

For details, consult the documentation.
