ghoresh11/bin_genomes

bin_genomes

Bin and QC a collection of genome sequences

usage: python bin_genomes.py [options] <job_id> <genomes_file>

Bin input genomes according to sequence identity using specified method.

Positional arguments:

<job_id> STR Name used to describe this run

<genomes_file> FILE File with list of genome files (Default format: FASTA, set --gff for GFF)
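The genomes file is a plain list of input paths. The sketch below builds one from a directory of FASTA files. It assumes one path per line, which this README does not state explicitly, and `write_genomes_file` is a hypothetical helper, not part of bin_genomes.py.

```python
from pathlib import Path

# Hypothetical helper: not part of bin_genomes.py.
# Assumption: the list file holds one genome path per line.
FASTA_SUFFIXES = {".fa", ".fasta", ".fna"}


def write_genomes_file(genome_dir, out_path, suffixes=FASTA_SUFFIXES):
    """Write the absolute paths of matching files in genome_dir to out_path.

    Returns the list of paths that were written, in sorted order.
    """
    paths = sorted(
        str(p.resolve())
        for p in Path(genome_dir).iterdir()
        if p.is_file() and p.suffix.lower() in suffixes
    )
    Path(out_path).write_text("".join(f"{p}\n" for p in paths))
    return paths
```

The resulting file can then be passed as the second positional argument, for example `python bin_genomes.py my_run genomes.txt`, where `my_run` and `genomes.txt` are placeholder names.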

Optional arguments:

-h, --help show this help message and exit

--gff Set if input is in GFF format [Default: FASTA]

--species_cutoff FLOAT Maximum distance between species to remove contaminants [0.04]

--distance FLOAT Maximum distance between two genomes to be considered in same bin [0.005]

--max_contigs INT Skip genomes with more than this number of contigs [600]

--min_length FLOAT Skip genomes shorter than this length, in Mbp [4]

--max_length FLOAT Skip genomes longer than this length, in Mbp [6]

--cpu INT Number of CPUs to use [16]

--keep_temp Keep temporary files

--verbose Verbose output while running

--debug Set for debug mode (does not run MASH)

--method STR Method to bin the genomes. Options: MASH [Default: MASH]
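To illustrate what the --max_contigs, --min_length and --max_length filters describe, here is a minimal sketch of such a QC check on a single FASTA file. It is not the code in bin_genomes.py. The function names are hypothetical, and it assumes that 1 Mbp is 1,000,000 bases and that genomes exactly at a limit are kept.

```python
def fasta_stats(path):
    """Return (number_of_contigs, total_length_in_bp) for a FASTA file."""
    contigs = 0
    total_bp = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # Each header line starts a new contig.
                contigs += 1
            else:
                # Sequence lines contribute to the total length.
                total_bp += len(line)
    return contigs, total_bp


def passes_qc(path, max_contigs=600, min_length=4.0, max_length=6.0):
    """Hypothetical QC check using the default values listed above.

    Lengths are in Mbp. Assumption: values equal to a limit pass.
    """
    contigs, total_bp = fasta_stats(path)
    length_mbp = total_bp / 1e6
    return contigs <= max_contigs and min_length <= length_mbp <= max_length
```

With the defaults, a 5 Mbp assembly in 200 contigs would pass, while a 3 Mbp assembly or one split into 1,000 contigs would be skipped.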
