bliebeskind/PhyloPreprocessing

PhyloPreprocessing

Collection of little scripts for wrangling files before a phylogenetic analysis.

| Script | Useful for you? | Description |
| --- | --- | --- |
| aln_to_subtree | yes | Remove taxa from a tree so that it matches the taxa in an alignment. |
| domain_chop | no | Divide a 4-domain protein into its constituent domains. Used in Liebeskind et al. 2013. |
| ensembl_tsv_ids | no | Parse tab-delimited files downloaded from Ensembl after an ortholog search. |
| entrez_xml_mods | no | Functions for handling Entrez XML. Most are broken. |
| fas_to_nex | yes | Convert a FASTA file to NEXUS. |
| fas_to_phy | yes | Convert a FASTA file to PHYLIP (good for PAML). |
| hmmalign_parser | no | Parse a Stockholm alignment output by HMMER. |
| keepers_subAlnTree | yes | Get a subalignment and subtree from a text file listing the taxa you wish to retain. |
| longest_cds | yes | Get the longest transcript for each gene from files of Ensembl genes with multiple transcripts. |
| open_reading_fram | yes | Get the longest open reading frame for each sequence in a FASTA file. Good for NGS data, but it only looks at the forward frame, so it works for RNAseq. |
| phylip_map | maybe | Replace long description lines with short names, and keep track of the changes. |
| remove_redundant | yes | Remove redundant sequences from a sequence file. |
| stockholm_pps | no | Explore posterior probabilities on Stockholm alignments such as those output by HMMER. |
| TPC_chop | no | Divide a 2-domain protein into its constituent domains. Used in Liebeskind et al. 2013. |
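
To give a feel for the kind of task a script like fas_to_phy handles, here is a minimal sketch of a FASTA to PHYLIP conversion. It is not the code from this repository: the function names `read_fasta` and `to_phylip` are hypothetical, the output is assumed to be relaxed sequential PHYLIP, and the two spaces between name and sequence are there because PAML expects at least two.

```python
from typing import Iterable, List, Tuple


def read_fasta(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse FASTA text into (name, sequence) pairs, keeping input order."""
    records: List[Tuple[str, str]] = []
    name = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Assumption: the first whitespace-delimited token is the taxon name.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            chunks = []
        elif name is not None:
            chunks.append(line)  # sequence may span several lines
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def to_phylip(records: List[Tuple[str, str]]) -> str:
    """Format aligned records as relaxed sequential PHYLIP text."""
    if not records:
        raise ValueError("no sequences found")
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    width = max(len(name) for name, _ in records)
    # Header line: number of taxa, then number of alignment columns.
    header = f" {len(records)} {lengths.pop()}"
    # Pad names to a common width and separate from the sequence by two spaces.
    rows = [f"{name.ljust(width)}  {seq}" for name, seq in records]
    return "\n".join([header] + rows) + "\n"


if __name__ == "__main__":
    example = ">taxA some description\nACGT\nAC\n>taxB\nAC-TAC\n"
    print(to_phylip(read_fasta(example.splitlines())), end="")
```

The sketch refuses unaligned input rather than padding it, since a PHYLIP file with ragged rows would fail later in the analysis anyway.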
