Metadamage - Ancient Damage Estimation for Metagenomics


Work in progress. Please contact christianmichelsen@gmail.com for further information.


Personal recommendations for this project:

This project requires Python 3.8 or newer. See the section on setting up a local Python environment below.

Installation:

With Poetry:

$ poetry add metadamage

or, if you prefer regular pip:

$ pip install metadamage

Metadamage CLI:

metadamage provides built-in help for its command-line interface:

$ metadamage --help

Example for fitting a single file using metadamage:

$ metadamage fit --max-fits 10 --max-cores 2 ./data/input/data_ancient.txt

metadamage can also fit multiple files at once:

$ metadamage fit --max-fits 10 --max-cores 2 ./data/input/*.txt
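The `*.txt` pattern above is expanded by the shell, not by metadamage. In POSIX-style shells with default settings, a pattern that matches nothing is passed on literally, so it can help to check the input directory first. The following is a minimal sketch of such a check; `list_inputs` is a hypothetical helper name and is not part of metadamage.

```shell
# Hypothetical helper (not part of metadamage): print every .txt file in a
# directory, one per line, and return a non-zero status if there are none.
list_inputs() {
    input_dir="$1"
    input_found=0
    for input_file in "$input_dir"/*.txt; do
        # An unmatched pattern stays literal, so skip entries that do not exist.
        [ -e "$input_file" ] || continue
        printf '%s\n' "$input_file"
        input_found=1
    done
    [ "$input_found" -eq 1 ]
}
```

It could be used as `list_inputs ./data/input || echo "no input files found"` before starting a fit.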

Dashboard:

To make use of the new, interactive dashboard introduced in version 0.4, run the following command (after having fitted the files):

$ metadamage dashboard

And then open a browser and go to 127.0.0.1:8050 (if it did not open automatically). For more information, use:

$ metadamage dashboard --help

Dashboard on a server:

If you are running metadamage on a server and want to use the dashboard locally, you can set up an SSH tunnel. First, on the server, run metadamage dashboard with the relevant options (and keep it running, e.g. with tmux). Afterwards, on your local machine, run:

$ ssh -L 8050:127.0.0.1:8050 -N user@remote

Now you can open a browser and go to http://127.0.0.1:8050.

In case you're connecting through a jump host, you can use the -J option:

$ ssh -L 8050:127.0.0.1:8050 -N -J user@jumphost user@remote
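In both commands, the `-L` argument has the form local_port:host:remote_port, `-N` tells ssh not to run a remote command, and `-J` names the jump host. The sketch below only prints the command for a given port so the structure is easier to see; `tunnel_cmd` is a hypothetical helper, and using a port other than 8050 assumes the dashboard is actually listening on that port on the server (check `metadamage dashboard --help`).

```shell
# Hypothetical helper (not part of metadamage): print the ssh tunnel command
# for a given port, remote login, and optional jump host.
tunnel_cmd() {
    tunnel_port="$1"
    tunnel_target="$2"
    tunnel_jump="${3:-}"
    if [ -n "$tunnel_jump" ]; then
        # Same port on both ends, routed through the jump host.
        printf 'ssh -L %s:127.0.0.1:%s -N -J %s %s\n' \
            "$tunnel_port" "$tunnel_port" "$tunnel_jump" "$tunnel_target"
    else
        printf 'ssh -L %s:127.0.0.1:%s -N %s\n' \
            "$tunnel_port" "$tunnel_port" "$tunnel_target"
    fi
}

tunnel_cmd 8050 user@remote
tunnel_cmd 8050 user@remote user@jumphost
```

The two calls print the same two commands shown above.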

For an easier method, you can set up your SSH config (usually at ~/.ssh/config) in the following way:

Host jumphost
    User your-jumphost-username-here
    HostName your-jumphost-address-here

Host remote

Host dashboard
    Port 22
    LocalForward 8050 localhost:8050
    RemoteCommand echo "Connecting to dashboard ... CTRL+C to terminate"; sleep infinity

Host remote dashboard
    ProxyJump jumphost
    User your-remote-username-here
    HostName your-remote-address-here

Now if you simply run the following on your own computer (in a new terminal session):

$ ssh dashboard

you can open a browser and go to http://127.0.0.1:8050.

Metadamage CLI fit options:

The metadamage fit CLI has the following options.

  • Output directory

    • --out_dir: The directory in which the fit results are stored. Default location is ./data/out. Do not change unless you know what you are doing.
  • Maximum values

    • --max-fits: Maximum number of fits to do. Default is None, i.e. fit everything.
    • --max-cores: Maximum number of cores to use while fitting. Default is 1.
    • --max-position: Maximum position in the sequence to include. Default is +/- 15 (forward/reverse).
  • Minimum values or cuts/thresholds for plots

    • --min-alignments: Minimum number of alignments ($N_\mathrm{alignments}$) of a single TaxID to be fitted. Default is 10.
    • --min-y-sum: Minimum sum of y of a single TaxID to be fitted. Here y might be e.g. the number of C→T transitions in the forward direction plus the number of G→A transitions in the reverse direction. In that case, it would be: $\mathtt{y} = \sum_{z=1}^{15} \left( N_{\mathrm{CT}}(z) + N_{\mathrm{GA}}(-z) \right)$. Default is 10.
  • Other

    • --substitution-bases-forward: Which substitution to check for damage in the forward region. Do not change this value except for control checks. Default is CT.
    • --substitution-bases-reverse: Which substitution to check for damage in the reverse region. Do not change this value except for control checks. Default is GA.
  • Boolean Flags

    • --forced: Force everything to be redone (both the count data and the fits).
    • --version: Print the current version of the program and exit.
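
As a worked illustration of the --min-y-sum definition in the list above, take a purely hypothetical TaxID whose only non-zero counts are N_CT(1) = 4, N_CT(2) = 3, N_GA(-1) = 2 and N_GA(-2) = 1. These numbers are made up for the example and do not come from real data. The sum then reduces to its first two terms:

```latex
\begin{aligned}
\mathtt{y} &= \sum_{z=1}^{15} \left( N_{\mathrm{CT}}(z) + N_{\mathrm{GA}}(-z) \right) \\
           &= \left( N_{\mathrm{CT}}(1) + N_{\mathrm{GA}}(-1) \right)
            + \left( N_{\mathrm{CT}}(2) + N_{\mathrm{GA}}(-2) \right) + 0 \\
           &= (4 + 2) + (3 + 1) = 10
\end{aligned}
```

This hypothetical TaxID therefore has a y sum of 10, which is equal to the default value of --min-y-sum.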

Setup Local Python Environment:

Make sure you have a local Python environment. Personally, I recommend using Pyenv for installing Python versions and Pyenv-Virtualenv for easy management of virtual environments. See e.g. this for easy installation of both.

Make sure you have a recent enough Python version (>=3.8) installed:

$ pyenv install 3.8.7
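
To confirm that a given Python version meets the >=3.8 requirement, a small version comparison can be used. The following is a minimal sketch; `version_ge` is a hypothetical helper name, and it assumes a `sort` that supports the `-V` (version sort) option, as GNU coreutils does.

```shell
# Hypothetical helper: succeed when version $1 is greater than or equal to
# version $2. Relies on `sort -V` for version ordering, so that for example
# 3.10 is ordered after 3.8.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

For example, `version_ge 3.8.7 3.8` succeeds while `version_ge 3.7.9 3.8` fails. The first argument could be taken from the interpreter itself, e.g. from `python -c 'import platform; print(platform.python_version())'`.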

Now we set up a virtual environment, so that changes you make in this environment do not affect your other Python projects:

$ pyenv virtualenv 3.8.7 metadamage38
$ pyenv activate metadamage38

We now use Poetry to set up a new project which uses metadamage. Follow the interactive guide:

$ poetry new metadamage-folder
$ cd metadamage-folder

Instead of activating the environment manually after every new login, we can tell pyenv to remember it for us:

$ pyenv local metadamage38

We now have a working local, virtual Python environment where the packages are managed by Poetry, so we can now add metadamage to our project:

$ poetry add metadamage

At this point you should log out of your terminal and log in again to reload everything. Now if you just type:

$ metadamage

you should see the following:

$ metadamage
Usage: metadamage [OPTIONS] FILENAMES...
Try 'metadamage --help' for help.

Error: Missing argument 'FILENAMES...'.

which shows that it is working and installed. You can now use metadamage --help for more help (together with the variable explanations above).


Update:

With Poetry:

$ poetry update metadamage

or, if you prefer regular pip:

$ pip install metadamage --upgrade

Development Branch:

You can also use a newer version directly from GitHub:

$ poetry add git+https://github.com/ChristianMichelsen/metadamage.git

or a specific branch (named BRANCH):

$ poetry add git+https://github.com/ChristianMichelsen/metadamage.git#BRANCH

Conda:

If you prefer using Conda, you can also install metadamage (via pip). First create a folder:

$ mkdir metadamage-conda
$ cd metadamage-conda

To install metadamage:

$ wget https://raw.githubusercontent.com/ChristianMichelsen/metadamage/main/environment.yaml
$ conda env create -f environment.yaml

To update it to a new, released version of metadamage:

$ wget https://raw.githubusercontent.com/ChristianMichelsen/metadamage/main/environment.yaml
$ conda env update --file environment.yaml

Finally remember to activate the environment:

$ conda activate metadamage
