afcarl/modelforge

Modelforge

This project is the foundation for sharing machine learning models. It helps maintain the registry: the remote storage where all model files are stored in a structured, cataloged way. It defines modelforge.Model, the base class for all models, which can fetch itself automatically from the registry. It also provides an abstraction over loading and saving models on disk.

Each model receives a UUID and carries other metadata. The underlying file format is ASDF.

Currently, only one registry storage backend is supported: Google Cloud Storage.

src-d/ml uses modelforge to make ML on source code accessible for everybody.

Install

pip3 install modelforge

Usage

The project exposes two interfaces: API and command line.

API

The modelforge package contains the most important classes and functions: the Model base class; merge_strings and split_strings, which optimize the serialization of string lists; and disassemble_sparse_matrix and assemble_sparse_matrix, which handle sparse matrices. A "model" here means something that holds data and can be (de)serialized, as in web development.
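The idea behind optimizing the serialization of string lists can be illustrated with a standalone sketch. The code below is not modelforge's implementation: pack_strings and unpack_strings are hypothetical names, and the dictionary layout is an assumption. It only shows how a list of strings can be stored as one concatenated buffer plus a list of lengths, and then restored.

```python
from typing import Dict, List, Union

Packed = Dict[str, Union[bytes, List[int]]]


def pack_strings(strings: List[str]) -> Packed:
    """Concatenate UTF-8 encoded strings and remember each byte length."""
    encoded = [s.encode("utf-8") for s in strings]
    return {
        "blob": b"".join(encoded),            # one contiguous buffer
        "lengths": [len(e) for e in encoded]  # byte length of each string
    }


def unpack_strings(packed: Packed) -> List[str]:
    """Inverse of pack_strings: slice the buffer using the stored lengths."""
    blob = packed["blob"]
    result = []
    offset = 0
    for length in packed["lengths"]:
        result.append(blob[offset:offset + length].decode("utf-8"))
        offset += length
    return result
```

Storing byte lengths rather than character counts keeps the slicing correct for non-ASCII text, and empty strings survive the round trip because they contribute a length of zero.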

Models can be registered with modelforge.register_model(). This is not strictly required, but it is needed for extended model dumps. Typically, you would import all your model classes and register them in a single module.
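To illustrate the general pattern of registering model classes, here is a minimal standalone sketch. It does not use modelforge and does not reproduce its internals: register, class_for, the NAME attribute, and the Embeddings class are all hypothetical. It shows one plausible reason registration helps a generic tool: given only a model name, the tool can look up the class that knows how to describe that model.

```python
from typing import Dict, Optional, Type

# Hypothetical registry that maps a model name to its class.
_REGISTRY: Dict[str, Type] = {}


def register(cls: Type) -> Type:
    """Record cls under its NAME attribute and return it unchanged."""
    _REGISTRY[cls.NAME] = cls
    return cls


def class_for(name: str) -> Optional[Type]:
    """Return the class registered under name, or None if there is none."""
    return _REGISTRY.get(name)


class Embeddings:
    """Hypothetical model class used only for this illustration."""
    NAME = "embeddings"

    def describe(self) -> str:
        return "Embeddings model"


# Registering every model class in one module keeps the lookup table complete.
register(Embeddings)
```

A generic dump tool could call class_for("embeddings") and fall back to a brief summary when the result is None.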

It is possible to register a custom registry storage with modelforge.backends.register_backend().

Command line

python3 -m modelforge --help
  • init initializes an empty registry.
  • publish pushes the specified model file to the registry and updates the index.
  • dump prints brief information about a model. A local path, URL, or UUID must be specified:
modelforge dump https://storage.googleapis.com/models.cdn.sourced.tech/models/<model>/<uuid>.asdf \
    --backend "gcs" --args bucket="models.cdn.sourced.tech"
modelforge dump <uuid> --backend "gcs" --args bucket="models.cdn.sourced.tech"
modelforge dump /path/to/model
  • list lists all the models in the registry.
  • delete deletes a model; its UUID must be specified.

Configuration

It is possible to specify the default backend, the backend's options, and the vendor. To do so, create modelforgecfg.py anywhere in your project tree.
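A configuration file might look like the sketch below. The variable names VENDOR, BACKEND, and BACKEND_ARGS are assumptions that should be checked against the installed modelforge version, and the vendor value is a placeholder. The backend name and bucket simply mirror the values passed to dump on the command line above.

```python
# modelforgecfg.py: a hypothetical sketch, not a verified configuration.
# The three variable names are assumptions; confirm them for your version.
VENDOR = "example-vendor"  # placeholder vendor name
BACKEND = "gcs"  # Google Cloud Storage, the only backend listed above
BACKEND_ARGS = "bucket=models.cdn.sourced.tech"  # same value as --args in the dump examples
```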

Docker image

docker build -t srcd/modelforge .
docker run -it --rm srcd/modelforge --help

Contributions

We use PEP8 with a line length of 99 and double quotes (") for strings. All the tests must pass:

python3 -m unittest discover /path/to/modelforge

License

Apache 2.0.
