kheath/htrans

Horizontal Gene Transfer project

Collaborative project for Eliot Bush

Goals:

  • Partially reconstruct gene family lineages in Escherichia coli
  • Identify horizontal gene transfer events

superalignv2.py & phylibalign.py

These two Python scripts are very similar: each takes a file containing blocks of genes (.afa) and converts it to FASTA format.

fasta2phylip.sh

A shell script that converts FASTA files to PHYLIP format.
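To illustrate what this conversion involves, here is a minimal Python sketch. It is not the contents of fasta2phylip.sh; the function names `read_fasta` and `fasta_to_phylip` are hypothetical, and the sketch assumes aligned input and writes sequential, relaxed PHYLIP (names are not truncated to 10 characters).

```python
def read_fasta(text):
    """Parse FASTA text into an ordered list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Keep only the first word of the header as the sequence name.
            name, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def fasta_to_phylip(text):
    """Return a sequential, relaxed PHYLIP string for aligned FASTA text."""
    records = read_fasta(text)
    if not records:
        raise ValueError("no sequences found")
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("sequences must be aligned to the same length")
    width = max(len(name) for name, _ in records)
    # Header line: number of sequences, then alignment length.
    lines = [f"{len(records)} {length}"]
    for name, seq in records:
        lines.append(f"{name.ljust(width)}  {seq}")
    return "\n".join(lines) + "\n"
```

Tools that expect strict PHYLIP require names padded or truncated to exactly 10 characters, so the output of this sketch may need adjusting depending on the downstream program.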

mrca.py

mrca(tree, family) returns the most recent common ancestor for a given gene family.
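The idea can be sketched as follows. This is an illustration under assumptions, not the code in mrca.py: the tree is assumed to be nested tuples with species names as leaves, the family is assumed to be given as the set of species that carry it, and `leaves` and `find_mrca` are hypothetical names.

```python
def leaves(tree):
    """Return the set of leaf (species) names under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def find_mrca(tree, species):
    """Return the smallest subtree containing every species in `species`.

    Returns None if `species` is empty or contains a name not in the tree.
    """
    species = set(species)
    if not species or not species <= leaves(tree):
        return None
    node = tree
    while not isinstance(node, str):
        for child in node:
            if species <= leaves(child):
                node = child  # one child holds them all, so descend
                break
        else:
            return node  # species are split across children: this is the MRCA
    return node


# Hypothetical 5-species tree, for illustration only.
example_tree = ((("A", "B"), "C"), ("D", "E"))
```

For example, `find_mrca(example_tree, {"A", "C"})` returns the subtree `(("A", "B"), "C")`, while a family present in both "A" and "E" maps to the root.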

testATree

The correct phylogenetic tree for our 5 sample species. I typed it out by hand from a tree generated by RAxML; no script was directly involved in making it.

processFamGenes

This is a collection of scripts that generates the information needed to run dupDel. It's a good idea to run it once and keep the output files around. Note that the input files are produced once by other scripts that I haven't looked into.

Input:

  • fam.out (SiLiX results)
  • geneSpeciesMap.txt
  • dbList.txt (file with sample species)
  • geneOrder.txt

Output:

  • famGenes.txt
  • famInfoResult.txt
  • adjacencyInfo.txt

Sample command: python processFamGenes.py -f fam.out -m geneSpeciesMap.txt -d dbList.txt -g geneOrder.txt
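As a rough picture of the kind of bookkeeping this step involves, the sketch below groups genes by family and counts copies per species. It is not processFamGenes.py. It assumes fam.out holds one whitespace-separated "family_id gene_id" pair per line and that the gene-to-species map is already loaded as a dictionary; `group_families` and `copies_per_species` are hypothetical names, and the actual formats of the files listed above may differ.

```python
from collections import Counter, defaultdict


def group_families(fam_text):
    """Map each family id to the list of gene ids assigned to it."""
    families = defaultdict(list)
    for line in fam_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        family_id, gene = fields[0], fields[1]
        families[family_id].append(gene)
    return dict(families)


def copies_per_species(families, gene_to_species):
    """For each family, count how many of its genes each species carries.

    Genes missing from `gene_to_species` are ignored.
    """
    counts = {}
    for family_id, genes in families.items():
        tally = Counter(gene_to_species[g] for g in genes if g in gene_to_species)
        counts[family_id] = dict(tally)
    return counts
```

Per-species copy counts of this kind are a natural input for a duplication/deletion cost calculation over a species tree.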

dupDel

This script calculates the minimum cost and associated duplications/deletions for every gene family.

Input:

  • testATree
  • famInfoResult.txt

Output:

  • dupDelAll.txt

Sample command: python dupDel.py -t testATree -f famInfoResult.txt -d 3 -c 5 -n 1
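To give a feel for this kind of calculation, here is a small sketch of a Sankoff-style parsimony over gene copy numbers. It is an assumption-laden illustration, not the algorithm in dupDel.py: the tree is nested tuples of species names, `counts` gives the observed copy number per species, each extra copy along a branch costs `dup_cost`, each lost copy costs `del_cost`, and a lineage with zero copies cannot regain the gene by duplication. The name `min_dup_del_cost` and the cost values in the usage note are hypothetical.

```python
import math


def min_dup_del_cost(tree, counts, dup_cost, del_cost):
    """Minimum total duplication/deletion cost explaining leaf copy counts."""
    max_copies = max(counts.values())
    states = range(max_copies + 1)  # candidate copy numbers at any node

    def branch_cost(parent, child):
        if child > parent:
            if parent == 0:
                return math.inf  # nothing to duplicate from
            return dup_cost * (child - parent)
        return del_cost * (parent - child)

    def node_costs(node):
        # Returns a list: best subtree cost for each copy number at `node`.
        if isinstance(node, str):
            observed = counts.get(node, 0)
            return [0 if k == observed else math.inf for k in states]
        total = [0] * len(states)
        for child in node:
            child_costs = node_costs(child)
            for k in states:
                total[k] += min(
                    child_costs[j] + branch_cost(k, j) for j in states
                )
        return total

    return min(node_costs(tree))
```

With the tree `(("A", "B"), "C")`, a duplication cost of 5 and a deletion cost of 3, copy counts of A=2, B=1, C=1 are explained most cheaply by one duplication on the branch to A, for a total cost of 5. Recovering the events themselves would require a traceback over the same table, which this sketch omits.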

htrans.py

This is a wrapper for the main pipeline. Make sure you've completed all the preprocessing steps before running it (preprocessing is not included in this repository).

Example: python htrans.py -f siLiX_families -m gene<->species_map -d list_of_species_of_interest -g gene_order -t phylogenetic_tree -b deletion_cost -c duplication_cost -s #_of_species -o full_species_list

Authors & Contributors

Kevin Heath & Zunyan Wang

Support or Contact

Email me at kevin.n.heath@gmail.com
