PhyloPreprocessing

Collection of little scripts for wrangling files before a phylogenetic analysis.

Script	Useful for you?	Description
aln_to_subtree	yes	remove taxa from a tree to match those in an alignment.
domain_chop	no	divide a 4-domain protein into its constituent domains. Used in Liebeskind et al. 2013
ensembl_tsv_ids	no	for parsing tab-delimited files downloaded from Ensembl after an ortholog search
entrez_xml_mods	no	functions for dealing with Entrez XML. Most are broken
fas_to_nex	yes	convert fasta file to nexus
fas_to_phy	yes	convert fasta file to phylip (good for PAML)
hmmalign_parser	no	parse a Stockholm alignment output by HMMER
keepers_to_subAlnTree	yes	get a subalignment and tree from a text file of taxa you wish to retain
longest_cds	yes	for getting longest transcripts from files of Ensembl genes with multiple transcripts
open_reading_frame	yes	get the longest open reading frame for each sequence in a fasta file (good for NGS data, but it only looks at the forward frames, so it works for RNAseq)
phylip_map	maybe	replace long description lines with short names, and keep track of changes
remove_redundant	yes	remove redundant sequences from a sequence file
stockholm_pps	no	explore posterior probabilities on Stockholm alignments such as those output by HMMER
TPC_chop	no	divide a 2-domain protein into its constituent domains. Used in Liebeskind et al. 2013
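
To give a feel for what a conversion such as fas_to_phy involves, here is a minimal sketch in plain Python. It is not the code from this repository: the function names `parse_fasta` and `fasta_to_phylip` are made up for illustration, and it assumes an already-aligned fasta input and a relaxed sequential PHYLIP layout (name, two spaces, sequence), which is the layout PAML is generally reported to accept.

```python
def parse_fasta(text):
    """Return a list of (name, sequence) pairs from fasta-formatted text.

    Hypothetical helper for illustration; only the first whitespace-delimited
    token of each description line is kept as the name.
    """
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            tokens = line[1:].split()
            name, chunks = (tokens[0] if tokens else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def fasta_to_phylip(text):
    """Convert aligned fasta text to relaxed sequential PHYLIP text.

    Assumption: every sequence has the same length (i.e. the input is an
    alignment). A ValueError is raised otherwise.
    """
    records = parse_fasta(text)
    if not records:
        raise ValueError("no sequences found")
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length; PHYLIP needs an alignment")
    # Pad names to a common width so the sequences line up in a column.
    width = max(len(name) for name, _ in records)
    lines = [f" {len(records)} {lengths.pop()}"]
    for name, seq in records:
        lines.append(f"{name.ljust(width)}  {seq}")
    return "\n".join(lines) + "\n"
```

Calling `fasta_to_phylip(open("aln.fas").read())` on a hypothetical file `aln.fas` would return the PHYLIP text as a string. Strict PHYLIP readers expect names of exactly ten characters, so a renaming step like the one phylip_map describes may be needed first.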


bliebeskind/PhyloPreprocessing