faiths-pd-benchmarking

Benchmarking speed and memory for fast calculation of Faith's PD
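
For orientation, Faith's PD for a sample is the total branch length of the phylogenetic tree that connects the observed taxa to the root. The sketch below illustrates that definition on a tiny hand-made tree. The TREE dictionary and the faith_pd function are hypothetical names introduced here for illustration; this is not the scikit-bio or SFPhD implementation benchmarked in this repository.

```python
# Hypothetical toy tree: each node maps to (parent, branch length to parent).
# The root has no entry because it has no parent branch.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 2.0),
    "n1": ("root", 0.5),
    "C": ("root", 3.0),
}


def faith_pd(tree, observed_tips):
    """Sum the lengths of all branches on the paths from observed tips to the root.

    Each branch is counted once, even when several observed tips share it.
    """
    counted = set()
    total = 0.0
    for tip in observed_tips:
        node = tip
        # Walk towards the root, stopping early once we reach a branch
        # that was already counted for a previous tip.
        while node in tree and node not in counted:
            parent, length = tree[node]
            counted.add(node)
            total += length
            node = parent
    return total


if __name__ == "__main__":
    print(faith_pd(TREE, ["A", "B"]))  # branches A, B and n1
    print(faith_pd(TREE, ["A", "C"]))  # branches A, n1 and C
```

This naive walk visits every tip-to-root path separately, which is the kind of cost that faster implementations aim to avoid on large tables.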

Installation

Installing the packages needed for benchmarking requires conda.

conda create --yes -n faith-benchmark python=3.6 pip "numpy>=1.15" -c conda-forge
conda activate faith-benchmark
conda env update --file requirements.yml

Benchmarking

Generate data of desired size

# Make the subsets of the data
./01.01-make_data_array_all.sh

# Generate a file with commands used for benchmarking
./01.02-generate-commands-file.sh

Run Benchmarking

# Benchmark Skbio Faith's PD
./01.03-par_timeout_skbio.sh

# Benchmark SFPhD
./01.04-par_timeout_sfphd.sh

# Aggregate the results
./01.05-process_results.sh

Create benchmarking plot

After the above steps have been completed, the benchmarking plot can be recreated by running the 01.06-create-faiths-pd-benchmarking-figure.ipynb notebook.

Benchmarking large table

To benchmark the large table, run the benchmark/time_stacked_faith.py script with the paths to the table and the tree:

python benchmark/time_stacked_faith.py <path to table> <path to tree>
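
As a rough illustration of what measuring speed and memory for a single call can look like, the sketch below wraps a function with the standard-library time and tracemalloc modules. The measure helper is a hypothetical name, and this is not necessarily how benchmark/time_stacked_faith.py takes its measurements. Note that tracemalloc only tracks memory allocated through Python's allocator, so memory used inside native extensions may be underreported.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Call func once and return (result, elapsed_seconds, peak_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        # Always stop tracing, even if func raises.
        tracemalloc.stop()
    return result, elapsed, peak


if __name__ == "__main__":
    # Stand-in workload; replace with the computation to be benchmarked.
    result, elapsed, peak = measure(sum, range(1_000_000))
    print(f"result={result} elapsed={elapsed:.4f}s peak={peak} bytes")
```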

Power Analysis for FINRISK

The results of the power analysis can be recreated with the notebook: 02.01-power-analysis-figure.ipynb

Phylogenetic Analysis

Faith's PD distributions on metagenomics by Age

The distributions of Faith's PD by age group can be recreated with the notebook: 03.01-plot-alpha-distributions.ipynb

Empress Visualization

The Empress visualization can be created with the notebook 03.02-metagenomic-age-phylo-analysis.ipynb. Note that a working installation of QIIME 2 with Empress is needed to reproduce the visualization.
