
Dialogy

Dialogy is a batteries-included 🔋 opinionated framework to build machine-learning solutions for speech applications.

  • Plugin-based: Makes it easy to import/export components to other projects. 🔌
  • Stack-agnostic: No assumptions made on ML stack; your choice of machine learning library will not be affected by using Dialogy. 👍
  • Progressive: Minimal boilerplate writing to let you focus on your machine learning problems. 🤏

Documentation

Installation

pip install dialogy

Test

make test

Examples

Using dialogy to run a classifier on a new input.

import pickle

from sklearn.feature_extraction.text import TfidfVectorizer

from dialogy.workflow import Workflow
from dialogy.preprocessing import merge_asr_output


def access(workflow):
    return workflow.input


def mutate(workflow, value):
    workflow.input = value


def vectorizer(workflow):
    # In practice, load a vectorizer fitted during training instead of
    # creating an unfitted one here.
    vectorizer = TfidfVectorizer()
    workflow.input = vectorizer.transform(workflow.input)


class TfidfMLPClfWorkflow(Workflow):
    def __init__(self, preprocessors=None, postprocessors=None):
        super().__init__(preprocessors=preprocessors, postprocessors=postprocessors)
        self.model = None

    def load_model(self, model_path):
        with open(model_path, "rb") as binary:
            self.model = pickle.load(binary)

    def inference(self):
        self.output = self.model.predict(self.input)


preprocessors = [merge_asr_output(access=access, mutate=mutate), vectorizer]
workflow = TfidfMLPClfWorkflow(preprocessors=preprocessors, postprocessors=[])
output = workflow.run([[{"transcript": "hello world", "confidence": 0.97}]])  # output -> _greeting_

Refer to the source for merge_asr_output and Plugins to understand this example better.
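The access/mutate convention generalizes to custom plugins: a plugin is a factory that accepts `access` and `mutate` callables and returns a function operating on a workflow. A minimal sketch of that pattern, under stated assumptions: `lowercase_transcripts` is a hypothetical plugin (not part of Dialogy), and `FakeWorkflow` is a stand-in for `dialogy.workflow.Workflow` so the sketch is self-contained.

```python
def lowercase_transcripts(access=None, mutate=None):
    """Hypothetical plugin factory: reads input via `access`,
    writes the transformed value back via `mutate`."""
    def plugin(workflow):
        transcripts = access(workflow)
        mutate(workflow, [t.lower() for t in transcripts])
    return plugin


class FakeWorkflow:
    """Stand-in for dialogy.workflow.Workflow, holding only `input`."""
    def __init__(self):
        self.input = None


wf = FakeWorkflow()
wf.input = ["Hello World", "GOOD MORNING"]
lowercase_transcripts(
    access=lambda w: w.input,
    mutate=lambda w, value: setattr(w, "input", value),
)(wf)
print(wf.input)  # ['hello world', 'good morning']
```

Because plugins only touch the workflow through the `access`/`mutate` callables you supply, the same plugin can be reused across projects with different workflow layouts.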

Note

  • Popular workflow sub-classes will be accepted after code-review.

FAQs

Training boilerplate

❌. This is not an end-to-end automated model-training framework. No one wants to write boilerplate code, but automating training doesn't align with the objectives of this project, and it is hard to accommodate differing needs such as:

  • library dependencies,
  • hardware support,
  • visualizations/reports during/after training.

Any rigidity here would lead to distractions, both for maintainers and users of this project. Plugins and custom Workflow sub-classes are certainly welcome and can take care of recipe-based needs.
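To illustrate what such a recipe-based custom Workflow might look like, here is a hedged sketch: `SimpleClfWorkflow`, `train`, and `save_model` are illustrative names, not part of Dialogy's API, and the majority-class stub stands in for whatever estimator your ML stack provides — the point is the shape of the recipe, not the model.

```python
import pickle
from collections import Counter


class SimpleClfWorkflow:
    """Illustrative workflow-like class bundling a training recipe."""

    def __init__(self):
        self.model = None

    def train(self, texts, labels):
        # "Fit" a majority-class stub; a real workflow would fit a
        # vectorizer + classifier from your ML stack of choice.
        self.model = Counter(labels).most_common(1)[0][0]

    def save_model(self, path):
        # Persist the trained model for later use by load_model/inference.
        with open(path, "wb") as binary:
            pickle.dump(self.model, binary)

    def inference(self, texts):
        return [self.model for _ in texts]


wf = SimpleClfWorkflow()
wf.train(["hi", "hello", "bye"], ["greeting", "greeting", "farewell"])
print(wf.inference(["hey"]))  # ['greeting']
```

Keeping training logic in your own sub-class like this means the framework stays out of decisions about dependencies, hardware, and reporting.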

Common Evaluation Plans

❌. Evaluation of models is hard to standardize. If you have a fairly common need, feel free to contribute your workflow or plugins.

Benefits

  • ✅. This project offers a conduit for an untrained model. This means once a Workflow is ready, you can use it anywhere: evaluation scripts, serving your models via an API, custom training/evaluation scripts, combining it with another workflow, etc.

  • ✅. If your application needs spoken language understanding, you should find little need to write data processing functions.

  • ✅. Little to no learning curve: if you know Python, you know how to use this project.
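As a sketch of the "use it anywhere" point, here is a hypothetical evaluation script. The `run` function below simulates a ready workflow's prediction call so the example is self-contained; in practice you would call your Workflow sub-class's `run()` on each utterance.

```python
def run(utterance):
    # Stand-in for workflow.run(...) on a ready workflow: a toy rule
    # that labels anything containing "hello" as a greeting.
    return "greeting" if "hello" in utterance else "other"


# Tiny labeled dataset of (utterance, expected_label) pairs.
dataset = [("hello there", "greeting"), ("goodbye", "other")]

correct = sum(run(utterance) == label for utterance, label in dataset)
accuracy = correct / len(dataset)
print(accuracy)  # 1.0
```

The same workflow object could back an API handler or a batch script without changes, since all the processing lives inside it.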

Contributions

  • Go through the docs.

  • Name your branch as "purpose/short-description". Examples:

    • "feature/hippos_can_fly"
    • "fix/breathing_water_instead_of_fire"
    • "docs/chapter_on_mighty_sphinx"
    • "refactor/limbs_of_mummified_pharao"
    • "test/my_patience"
  • Make sure tests are added and passing. Run make test lint at the project root. Coverage is important; aim for 100% unless the reviewer agrees otherwise.
