TwentyOne (21)*

*Note: 21 is still in alpha; expect many changes in the next few months.

Introduction

We all know "42" is the answer to the ultimate question of life, the universe, and everything. 42 is the dream solution we all want to achieve. Today we are halfway there. TwentyOne is an engine that takes data, builds models, and answers questions for many different kinds of problems. 21 is a generic tool that can be applied to multiple use cases.

Why use 21?

Solution = AI Expertise + Data -> 42 + Data ~ 21 + Data

The focus of 21 is to get an effective model quickly and reliably; it does not try to produce a state-of-the-art model (though it can achieve best-in-class results). It is meant for non-data scientists who want to build models for their own use cases.

21 is designed to leverage transfer learning as much as possible. For many problems the data requirement for 21 is minimal, which saves a lot of time, effort, and cost in data collection. Model training time is also greatly reduced.

It can also be used in situations where data is private and can't be shared for training. An API call can trigger 21 to start the ML training remotely. The trained model can then be made available for inference using the right set of TBs. This makes it possible to use the intelligence and results derived from private data while maintaining privacy.

Though 21 aims to be an AutoML engine, it can also be used as an augmented ML engine that helps data scientists develop models quickly. This brings the best of both worlds: "Real Intelligence = Artificial Intelligence + Human Intelligence".

Advantages

Greatly reduces development time (days instead of months)
Needs less data (transfer learning and data augmentation)
Development cost is minimal (mainly compute cost; other costs are reduced)
Builds robust models (effectively searches "more space" to get the best model)
Learns best practices and reuses them for similar use cases (leverages the task-pipeline relation)
Data security and privacy (remote training and inference using TBs)
Works on most common use cases (new use cases are added continuously)

Drawbacks

Needs a lot of compute
Can't be used for new kinds of problems or for very complex problems

How to use?

Configuration is done through a config.yaml file. Look at the tests folder to see how to use 21.
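As a rough illustration of what checking such a configuration could look like, here is a minimal Python sketch. The key names (`problem_type`, `data_path`, `target`, `metric`) and the `validate_config` helper are assumptions made for this example and are not taken from 21's actual config.yaml schema; the files in the tests folder remain the authoritative reference. The dictionary stands in for the result of parsing config.yaml with a YAML parser.

```python
from typing import Any, Dict, List

# Hypothetical keys; the real config.yaml schema used by 21 may differ.
REQUIRED_KEYS = ("problem_type", "data_path", "target", "metric")
# Problem types named in the "Main concepts" section of this README.
PROBLEM_TYPES = ("classification", "regression", "seq2seq")


def validate_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means the config looks usable."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    problem_type = config.get("problem_type")
    if problem_type is not None and problem_type not in PROBLEM_TYPES:
        problems.append(f"unknown problem_type: {problem_type}")
    return problems


# Stand-in for a parsed config.yaml (illustrative values only).
example_config = {
    "problem_type": "classification",
    "data_path": "data/train.csv",
    "target": "label",
    "metric": "accuracy",
}

errors = validate_config(example_config)                    # complete config
bad = validate_config({"problem_type": "clustering"})       # incomplete config
print(errors)
print(bad)
```

Returning a list of problems instead of raising on the first one lets a user fix every issue in the file in a single pass.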

Main concepts

Task: includes the problem type (classification, regression, seq2seq), a pointer to the data, and the evaluation metric used to build a model.
Data: holds the raw content and meta information about the type of data (text, images, tabular, etc.) and its characteristics (size, target, names, how to process it, etc.).
Model: a machine learning, time series, or deep learning model that learns the relations in the data.
Model Universe: a collection of models, their hyperparameters, and the tasks for which each model should be considered.
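The four concepts above can be pictured as plain data structures. The following Python sketch is only an illustration under assumed names: the classes `Data`, `Task`, `ModelSpec`, and `ModelUniverse`, their fields, and the `candidates` method are hypothetical and do not reflect the classes in 21's source code. It shows how a model universe might narrow down the models considered for a given task.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Data:
    """Raw content plus meta information about it (hypothetical fields)."""
    path: str      # pointer to the raw content
    kind: str      # e.g. "text", "images", "tabular"
    target: str    # name of the field to predict
    meta: Dict[str, Any] = field(default_factory=dict)  # size, names, processing hints


@dataclass
class Task:
    """Problem type, pointer to the data, and evaluation metric."""
    problem_type: str  # "classification", "regression", or "seq2seq"
    data: Data
    metric: str


@dataclass
class ModelSpec:
    """One entry in the model universe."""
    name: str
    hyperparameters: Dict[str, List[Any]]  # candidate values per hyperparameter
    problem_types: List[str]               # tasks for which the model is considered


@dataclass
class ModelUniverse:
    models: List[ModelSpec]

    def candidates(self, task: Task) -> List[ModelSpec]:
        """Return the models that should be considered for this task."""
        return [m for m in self.models if task.problem_type in m.problem_types]


universe = ModelUniverse([
    ModelSpec("logistic_regression", {"C": [0.1, 1.0]}, ["classification"]),
    ModelSpec("random_forest", {"n_estimators": [100, 300]}, ["classification", "regression"]),
    ModelSpec("linear_regression", {}, ["regression"]),
])
# Illustrative path and column name, not files from this repository.
task = Task("regression", Data("data/example.csv", "tabular", "load"), "rmse")
names = [m.name for m in universe.candidates(task)]
print(names)
```

Keeping the task-to-model mapping inside the universe, not inside each model, means new models can be registered for existing tasks without touching the task definitions.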

Blocks

Top level architecture of the engine

How does it work?

Read the docs to understand how it works. The architecture section in the docs provides a top-level view of how things work. The API documentation gives details on how to use the 21 engine.

Additional Resources

Maintained By

This repo is currently maintained by shivaramkrs and the ML team at Curl Tech. We welcome your contributions.

License

MIT License
