alexandria

alexandria is a high-level machine learning framework that lets users run many kinds of machine learning experiments with very little setup. I'm still developing the project and its wiki pages.

Build

To build from source (currently the only way to install the package), use the Makefile:

$ make

This calls the setup.py script and attempts to install the package onto your system. If you run into any problems, please open an issue and I'll look into it. I haven't done this sort of thing before, so bugs are expected.
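A quick way to check that the install worked is to ask Python whether it can locate the package. This is only a minimal sketch using the standard library; `is_installed` is a helper name I made up for illustration, and `alexandria` is assumed to be the importable package name because that is what the examples below import from.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be located on the import path."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # 'alexandria' is the package name used by the imports in the examples.
    if is_installed("alexandria"):
        print("alexandria is importable")
    else:
        print("alexandria was not found; re-run `make` and check its output")
```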

Examples

Basic Classification

A basic example of the API:

# examples/demo.py - DataBunch and DataFrame demonstrations
# load_diabetes is used by the regression example further down
from sklearn.datasets import load_iris, load_diabetes

from alexandria.experiment import Experiment

if __name__ == '__main__':
	# Data preprocessing: load the iris dataset as a scikit-learn Bunch
	iris = load_iris()

	# xlabels and ylabels name the Bunch keys holding the features and the target
	experiment = Experiment(
		name='Cross Validation Example #1',
		dataset=iris,
		xlabels='data',
		ylabels='target',
		models=['rf', 'dt', 'knn', 'nb']
	)
	# 10-fold cross validation, then print a summary table of the metrics
	experiment.trainCV(nfolds=10, metrics=['accuracy', 'rec', 'prec', 'auc'])
	experiment.summarizeMetrics()

Output:

name                          Accuracy       Recall         Precision      AUC
----------------------------  -------------  -------------  -------------  -------------
sklearn.random forest         0.9600±0.0442  0.9600±0.0442  0.9644±0.0418  0.9907±0.0147
sklearn.decision tree         0.9600±0.0442  0.9600±0.0442  0.9644±0.0418  0.9700±0.0332
sklearn.k neighbors           0.9667±0.0447  0.9667±0.0447  0.9738±0.0339  0.9873±0.0222
sklearn.naive bayes.Gaussian  0.9533±0.0427  0.9533±0.0427  0.9627±0.0325  0.9947±0.0088
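For comparison, here is roughly what a similar 10-fold experiment looks like when written directly against scikit-learn. This is a sketch, not alexandria's implementation: the mapping from 'rf', 'dt', 'knn' and 'nb' to these estimators, the macro averaging for recall and precision, the one-vs-rest AUC, and the `random_state` values are all my assumptions, so the numbers it prints need not match the table above.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Assumed stand-ins for the 'rf', 'dt', 'knn' and 'nb' shorthand names.
models = {
    "random forest": RandomForestClassifier(random_state=0),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "k neighbors": KNeighborsClassifier(),
    "naive bayes (Gaussian)": GaussianNB(),
}

# Assumed averaging choices for the multiclass metrics.
scoring = {
    "accuracy": "accuracy",
    "recall": "recall_macro",
    "precision": "precision_macro",
    "auc": "roc_auc_ovr",
}

results = {}
for name, model in models.items():
    # For classifiers, cv=10 uses stratified 10-fold splits.
    scores = cross_validate(model, X, y, cv=10, scoring=scoring)
    results[name] = {
        metric: (scores[f"test_{metric}"].mean(), scores[f"test_{metric}"].std())
        for metric in scoring
    }

for name, metrics in results.items():
    row = "  ".join(f"{m}={mean:.4f}±{std:.4f}" for m, (mean, std) in metrics.items())
    print(f"{name:24s} {row}")
```

The point of the comparison is only to show how much boilerplate the Experiment call replaces: one loop over models, one scoring dictionary, and manual aggregation of per-fold means and standard deviations.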

Basic Regression with Pandas DataFrame

from sklearn.datasets import load_diabetes

from alexandria.experiment import Experiment

if __name__ == '__main__':
	# Data preprocessing for dataframe object
	diabetes_df = load_diabetes(as_frame=True).frame
	data_cols = diabetes_df.columns[:-1] # Every column except the last is a feature
	target_col = diabetes_df.columns[-1] # The last column, 'target'

	experiment = Experiment(
		name='Cross Validation Example #2',
		dataset=diabetes_df,
		xlabels=data_cols,
		ylabels=target_col,
		models=['rf', 'dt', 'knn']
	)
	# A single metric can be passed as a string instead of a list
	experiment.trainCV(nfolds=10, metrics='r2')
	experiment.summarizeMetrics()

Output:

Cross Validation Example #2
name                   R2
---------------------  --------------
sklearn.random forest  0.3963±0.1006
sklearn.decision tree  -0.2044±0.2989
sklearn.k neighbors    0.3329±0.1247
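The negative R2 for the decision tree is not a bug in the table: R2 drops below zero whenever a model's squared error is larger than that of always predicting the mean of the targets. The sketch below shows this on a toy example and then runs a 10-fold R2 evaluation in plain scikit-learn. It assumes `DecisionTreeRegressor` with default settings as a stand-in for 'dt'; I don't know which settings alexandria uses, so the printed score may differ from the one above.

```python
from sklearn.datasets import load_diabetes
from sklearn.metrics import r2_score
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Toy case: predictions are the targets in reverse order.
# SS_res = 8 and SS_tot = 2, so R2 = 1 - 8/2 = -3.
toy_r2 = r2_score([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
print("toy R2:", toy_r2)

# 10-fold cross-validated R2 for an unpruned decision tree on the diabetes data.
X, y = load_diabetes(return_X_y=True)
tree_scores = cross_val_score(
    DecisionTreeRegressor(random_state=0), X, y, cv=10, scoring="r2"
)
print(f"decision tree R2: {tree_scores.mean():.4f}±{tree_scores.std():.4f}")
```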

Naive Bayes Flavors Comparison

Code:

from sklearn.datasets import load_iris

from alexandria.experiment import Experiment

if __name__ == '__main__':
	iris = load_iris()

	# Let's run all of the Naive Bayes models and compare their performance.
	# Each entry names a model and the flavor (variant) of it to train.
	models = {
		'sklearn': [
			{'model': 'nb', 'flavor': 'bernoulli'},
			{'model': 'nb', 'flavor': 'Categorical'},
			{'model': 'nb', 'flavor': 'complement'},
			{'model': 'nb', 'flavor': 'gaussian'},
			{'model': 'nb', 'flavor': 'multi'}
		]
	}
	# modellibdict maps a library name to the list of models to run from it
	experiment = Experiment(
		name='Naive Bayes Experiment',
		dataset=iris,
		xlabels='data',
		ylabels='target',
		modellibdict=models
	)
	experiment.trainCV(nfolds=10, metrics=['acc', 'rec', 'prec', 'auc'])
	experiment.summarizeMetrics()

Output:

Naive Bayes Experiment
name                             Accuracy       Recall         Precision      AUC
-------------------------------  -------------  -------------  -------------  -------------
sklearn.naive bayes.Bernoulli    0.3333±0.0000  0.3333±0.0000  0.1111±0.0000  0.5000±0.0000
sklearn.naive bayes.Categorical  0.9267±0.0629  0.9267±0.0629  0.9355±0.0595  0.9847±0.0179
sklearn.naive bayes.Complement   0.6667±0.0000  0.6667±0.0000  0.4926±0.0148  0.9780±0.0181
sklearn.naive bayes.Gaussian     0.9533±0.0427  0.9533±0.0427  0.9627±0.0325  0.9947±0.0088
sklearn.naive bayes.Multinomial  0.9533±0.0670  0.9533±0.0670  0.9599±0.0608  0.9860±0.0256
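The Bernoulli row sits at chance level (0.3333 for three balanced classes), and the most likely reason is scikit-learn's default binarization: `BernoulliNB` maps every feature to 1 if it is greater than 0.0, and every iris measurement is positive, so all samples look identical to the model. The sketch below checks this directly in scikit-learn. The median-threshold variant at the end is purely my own illustration and not something alexandria does, as far as I know.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import BernoulliNB

X, y = load_iris(return_X_y=True)

# With the default binarize=0.0, each feature becomes 1 if value > 0, else 0.
binarized = (X > 0.0).astype(int)
n_distinct_rows = len(np.unique(binarized, axis=0))
print("distinct rows after default binarization:", n_distinct_rows)

default_scores = cross_val_score(BernoulliNB(), X, y, cv=10)
print(f"default threshold accuracy: {default_scores.mean():.4f}")

# Illustrative alternative (an assumption, not an alexandria feature):
# binarize each feature at its median and turn off BernoulliNB's own step.
# The medians come from the full dataset, which is fine for a demo but
# leaks information across folds.
X_median = (X > np.median(X, axis=0)).astype(int)
median_scores = cross_val_score(BernoulliNB(binarize=None), X_median, y, cv=10)
print(f"median threshold accuracy: {median_scores.mean():.4f}")
```

In other words, the poor Bernoulli score says more about the fit between the model and continuous, strictly positive features than about the framework.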
