automatminer

An automatic "black-box" yet interpretable prediction engine for materials properties.

Current code coverage:

What is it?

Automatminer is a tool for automatically creating machine learning pipelines. Automatminer's pipelines include automatic featurization with matminer, feature reduction, and AutoML backend handling. Put in a materials dataset, get out a machine that predicts materials properties.

What can it do?

Automatminer can make pipelines to accurately predict the properties from many kinds of materials data:

both computational and experimental data
small (~100 samples) to moderate (~100,000 samples) sized datasets
crystalline datasets
composition-only (i.e., unknown phases) datasets
automatminer is agnostic to the target property, meaning it can be used to predict electronic, mechanical, thermodynamic, or any other kind of property

Automatminer automatically decorates a dataset using hundreds of descriptor techniques from matminer's descriptor library, picks the most useful features for learning, and runs a separate AutoML pipeline using TPOT. Once a pipeline has been fit, it can be examined with skater's interpretability tools, summarized in a text file, saved to disk, or used to make new predictions.

How do I use it?

The easiest (and most automatic) way to use automatminer is through the MatPipe object. First, fit the MatPipe to a dataframe containing materials objects such as chemical compositions (or pymatgen Structures) and some material target property.

from automatminer.pipeline import MatPipe

# Fit a pipeline to training data to predict band gap
pipe = MatPipe()
pipe.fit(train_df, "band gap")

Now use your pipeline to predict the properties of some other data, such as a new composition or structure.

predicted_df = pipe.predict(other_df, "band gap")

You can also use it to benchmark against other machine learning models with the benchmark method of MatPipe, which optimizes the pipeline a training data and returns predictions on a held test set.

pipe = MatPipe()
test_predictions = pipe.benchmark(df, "bulk modulus", test_spec=0.2)

Once a MatPipe has been fit, you can examine it internally to see how it works using pipe.digest(); or pickle it for later with pipe.save().

How do I cite automatminer?

We are in the process of writing a paper for automatminer. In the meantime, please use the citation given in the matminer repo.

Contributing

Interested in contributing? See our contribution guidelines and make a pull request! Please submit questions, issues / bug reports, and all other communication through the matminer Google Group.

Name		Name	Last commit message	Last commit date
Latest commit History 949 Commits
.circleci		.circleci
automatminer		automatminer
examples		examples
.coveragerc		.coveragerc
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
README.md		README.md
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

.circleci

.circleci

automatminer

automatminer

examples

examples

.coveragerc

.coveragerc

.gitignore

.gitignore

CONTRIBUTING.md

CONTRIBUTING.md

README.md

README.md

requirements.txt

requirements.txt

setup.py

setup.py

Repository files navigation

automatminer

Current code coverage:

What is it?

What can it do?

How do I use it?

How do I cite automatminer?

Contributing

About

Releases

Packages

Languages

theiman112860/automatminer

Folders and files

Latest commit

History

Repository files navigation

automatminer

Current code coverage:

What is it?

What can it do?

How do I use it?

How do I cite automatminer?

Contributing

About

Resources

Stars

Watchers

Forks

Languages