A pipeline to characterize and cluster plant NBS-LRR (NB-LRR) genes, with phylogenetic analysis and visualization
This package depends on the following open-source packages:
- Required packages
- Perl5
- Python 2.7
- HMMER v3.1
- ClustalO v1.1.0
- GNU Parallel
- MUSCLE
- PhyML
- Most of these packages are available on MSI and can be checked and loaded by:
module show perl/python2/hmmer/clustalo/parallel/muscle/phyml
module load perl/python2/hmmer/clustalo/parallel/muscle/phyml
- Required perl modules
- Bioperl
- Data::Table
- List::MoreUtils
- Time::HiRes
- Data::Dumper
The BioPerl module has been installed on MSI and can be loaded by:
module load bioperl
Note that the $PERL5LIB environment variable needs to be set to include the source folder:
export PERL5LIB=$PATH_TO_rgeneclust:$PERL5LIB
where $PATH_TO_rgeneclust is the absolute path of the source directory
- Required python packages:
Both Python packages (pyfasta and numpy) can be installed with pip:
pip install pyfasta
pip install numpy
By default pip installs packages into /usr/local, but without root access you will probably need to install into your user directory instead:
pip install --user pyfasta
usage: rosar.py [-h] [--cpu NCPU] cfgfile outdir
Identify, cluster and characterize plant NBS-LRR genes
positional arguments:
cfgfile config file (a text file with species identifier followed by the
absolute path of CDS fasta in each line)
outdir output directory
optional arguments:
-h, --help show this help message and exit
--cpu NCPU number processors to use (default: all/24)
Config file (test.csv): A text file with species identifier followed by the absolute path of CDS fasta in each line, for example:
Fv,/home/zhoup/test/fvesca_v1.0_genemark_hybrid.fna
Md,/home/zhoup/test/Malus_x_domestica.v1.0.consensus_CDS.fa
Pp,/home/zhoup/test/Prunus_persica_v1.0_CDS.fa
with paths of the strawberry (Fragaria vesca), apple (Malus x domestica) and peach (Prunus persica) CDS sequences.
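
Since a malformed config line or a wrong path would only surface once the pipeline is running, it can help to check the config file beforehand. The following is a minimal sketch, assuming the format described above (one `identifier,absolute_path` pair per line). The helper names `read_config` and `missing_files` are hypothetical and are not part of rosar.py; how rosar.py itself parses the file is not shown here.

```python
from __future__ import print_function  # keeps the sketch usable on Python 2.7

import os


def read_config(cfgfile):
    """Return a list of (species_id, fasta_path) tuples read from cfgfile.

    Hypothetical helper: assumes one 'ID,/absolute/path' pair per line.
    """
    entries = []
    with open(cfgfile) as handle:
        for lineno, line in enumerate(handle, 1):
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            fields = [field.strip() for field in line.split(",", 1)]
            if len(fields) != 2 or not fields[0] or not fields[1]:
                raise ValueError(
                    "line %d: expected 'identifier,path'" % lineno)
            species, path = fields
            if not os.path.isabs(path):
                raise ValueError(
                    "line %d: path is not absolute: %s" % (lineno, path))
            entries.append((species, path))
    return entries


def missing_files(entries):
    """Return the FASTA paths from entries that do not exist on disk."""
    return [path for _, path in entries if not os.path.isfile(path)]
```

For example, `missing_files(read_config("test.csv"))` returns an empty list when every CDS FASTA listed in the config file is present.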