This repository has been archived by the owner on Jan 22, 2022. It is now read-only.
CLI for training multiclass and multilabel text classification models with BERT.

pmbaumgartner/oatmeal

Oatmeal

[Image: BERT eating Oatmeal]

I needed something to create single-sentence and document classifiers using BERT.

Given a binary prediction problem with a training dataset like:

$ head train.csv
texts,labels
"the sandwiches are good",1
"I found the food disappointing",0
...
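
If your examples live in Python rather than in a file, a minimal sketch like the one below can produce a CSV in the layout shown above. The helper name `write_training_csv` is hypothetical and not part of oatmeal. The only things taken from the example are the `texts,labels` header, quoted text, and 0/1 labels.

```python
import csv
import io


def write_training_csv(rows, handle):
    """Write (text, label) pairs using the `texts,labels` layout shown above.

    Hypothetical helper, not part of oatmeal. Labels are assumed to be 0 or 1
    for the binary case, matching the sample file.
    """
    handle.write("texts,labels\n")
    # QUOTE_NONNUMERIC quotes the text field and leaves integer labels bare.
    writer = csv.writer(handle, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    for text, label in rows:
        if label not in (0, 1):
            raise ValueError(f"binary labels must be 0 or 1, got {label!r}")
        writer.writerow([text, int(label)])


examples = [
    ("the sandwiches are good", 1),
    ("I found the food disappointing", 0),
]
buffer = io.StringIO()
write_training_csv(examples, buffer)
print(buffer.getvalue(), end="")
```

To write a real file instead of an in-memory buffer, pass `open("train.csv", "w", newline="")` as the handle.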

Train a model with:

python oatmeal.py train binary --input-data="train.csv"

Some other helpful arguments:

[--model-path MODEL_PATH] [--model-name MODEL_NAME]

By default, the model is saved to a timestamped directory, with the classification type as the model name.

Predictions

Once you've trained your model, generate predictions with:

python oatmeal.py predict --input-data="test.csv" --model-path="your_model_path" --model-name="your_model_name"

Model Types

You can train binary, multiclass, and multilabel models.

Binary

$ head binary-output.csv
id,label
0,0.2934
1,0.8934
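
To turn scores like these into hard 0/1 decisions, something along these lines may work. This is a sketch, not part of oatmeal: it assumes the `label` column is the score for the positive class (label 1), and both the 0.5 cutoff and the function name `read_binary_predictions` are hypothetical choices.

```python
import csv
import io


def read_binary_predictions(handle, threshold=0.5):
    """Map each id to True/False by thresholding the `label` score.

    Assumes `label` is the positive-class score; the 0.5 default is an
    illustrative choice, not something oatmeal prescribes.
    """
    reader = csv.DictReader(handle)
    return {int(row["id"]): float(row["label"]) >= threshold for row in reader}


sample = "id,label\n0,0.2934\n1,0.8934\n"
decisions = read_binary_predictions(io.StringIO(sample))
print(decisions)
```

A different threshold can be passed if false positives and false negatives carry different costs in your problem.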

Multiclass & Multilabel

For multiclass, the labels in each row sum to 1: sum(labels) == 1

$ head multilabel-output.csv
id,label0,label1,...,labelk
0,0.3943,0.8984,...,0.3481
1,0.8293,0.1883,...,0.9342
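
One way to read an output file of this shape is sketched below, under a few assumptions: the score columns are treated as per-label scores, multilabel rows are decoded with a 0.5 threshold, and multiclass rows are decoded by taking the highest score after checking that the row sums to 1. The function names, the threshold, the tolerance, and the three-label sample data are all hypothetical and not taken from oatmeal.

```python
import csv
import io
import math


def read_scores(handle):
    """Return (label_names, {id: [scores]}) from an id,label0,...,labelk file."""
    reader = csv.reader(handle)
    header = next(reader)
    label_names = header[1:]
    rows = {int(r[0]): [float(v) for v in r[1:]] for r in reader if r}
    return label_names, rows


def decode_multilabel(scores, label_names, threshold=0.5):
    """Keep every label whose score reaches the (assumed) threshold."""
    return [name for name, s in zip(label_names, scores) if s >= threshold]


def decode_multiclass(scores, label_names, tol=1e-3):
    """Pick the single highest-scoring label, checking the row sums to 1."""
    if not math.isclose(sum(scores), 1.0, abs_tol=tol):
        raise ValueError("multiclass scores are expected to sum to 1")
    best = max(range(len(scores)), key=scores.__getitem__)
    return label_names[best]


# Illustrative three-label file; the values are made up for this sketch.
sample = "id,label0,label1,label2\n0,0.3943,0.8984,0.3481\n1,0.8293,0.1883,0.9342\n"
names, rows = read_scores(io.StringIO(sample))
for row_id, scores in rows.items():
    print(row_id, decode_multilabel(scores, names))

print(decode_multiclass([0.1, 0.7, 0.2], names))
```

The sum check is a quick way to catch a multilabel file being decoded as multiclass by mistake.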
