- Code
- Issues
- License: Mozilla Public License Version 2.0; see LICENSE
- Contributors: see AUTHORS.rst
Run:
$ pip install spicedham
Run:
# Get the code
$ git clone https://github.com/mozilla/spicedham
# Create a virtual environment
$ virtualenv ../venv
$ source ../venv/bin/activate
# Install dependencies
$ pip install -r requirements.txt
# Install the code
$ pip install -e .
Run:
$ nosetests
See docs/installation.rst.
The spicedham API is simple. Classifying spam takes three steps:
Instantiate a SpicedHam object:
    from spicedham import SpicedHam

    spicedham = SpicedHam()
Optionally, you can pass a dictionary of configuration values:
    config = {
        'backend': 'SqlAlchemyWrapper',  # The class name of your backend
        'engine': 'sqlite:///:memory:',  # Needed by SqlAlchemyWrapper
        'tokenizer': 'SplitTokenizer',  # The class name of your tokenizer
    }
    spicedham = SpicedHam(config)
Train on data. The arguments are:
- A string message which can be split up by your chosen tokenizer.
- A boolean indicating whether classifiers should match the message (True for the kind of message you want to detect, such as spam).
    spicedham.train('I love Firefox!', False)
    spicedham.train('SPAMMY NONSENSE AND HATE SPEECH!', True)
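When the training data already lives in a list, the individual train calls generalize to a loop. The following is a minimal sketch and not part of spicedham: train_all is a hypothetical helper, and it assumes only that the classifier exposes train(message, matched) as described above. RecordingClassifier is a stand-in so the snippet runs without spicedham installed; in real use you would pass your SpicedHam object instead.

```python
def train_all(classifier, samples):
    """Feed (message, matched) pairs to classifier.train and return the count.

    Hypothetical helper, not part of spicedham.
    """
    count = 0
    for message, matched in samples:
        classifier.train(message, matched)
        count += 1
    return count


class RecordingClassifier:
    """Stand-in for a SpicedHam object; it only records the calls it receives."""

    def __init__(self):
        self.seen = []

    def train(self, message, matched):
        self.seen.append((message, matched))


# Each pair is (message, whether classifiers should match it).
samples = [
    ('I love Firefox!', False),
    ('SPAMMY NONSENSE AND HATE SPEECH!', True),
]
recorder = RecordingClassifier()
trained = train_all(recorder, samples)
```

Keeping the labeled pairs in one list makes it easier to review what the classifier was trained on.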
Classify some data. chance_matched is the probability that the message is what you're searching for; it will be between 0 and 1 (inclusive):

    chance_matched = spicedham.classify("maybe I'm spam or maybe not")
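Because classify returns a probability rather than a yes/no answer, calling code usually needs a cutoff to act on it. The sketch below shows one way to do that. The is_match helper is hypothetical and not part of spicedham, and the 0.5 default threshold is an arbitrary assumption that you would likely tune for your own data.

```python
def is_match(chance_matched, threshold=0.5):
    """Return True when the probability reaches the threshold.

    Hypothetical helper; the 0.5 default is an arbitrary assumption.
    """
    if not 0.0 <= chance_matched <= 1.0:
        raise ValueError('chance_matched must be between 0 and 1 inclusive')
    return chance_matched >= threshold


# Example probabilities, standing in for values returned by classify().
flagged = is_match(0.92)
cleared = is_match(0.10)
stricter = is_match(0.92, threshold=0.95)
```

A higher threshold trades missed matches for fewer false positives, so the right value depends on which mistake is more costly in your application.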