Skip to content

kumar-physics/eppic-precal

 
 

Repository files navigation

This folder contains scripts to do EPPIC pre-computation 

Author : Kumaran Baskaran
Date : 05-07-2013


EPPIC pre-computation is usually carried out in two parts. 

Part 1: to be done on a local machine

1. download the necessary files: DonwloadFiles
2. upload to mysql database: UploadToDatabase
3. update blastdb files: UpdateBlastDB
4. create unique fast seq: CreateUniqueFasta

all the above steps can be done manually using the corresponding scripts (or) eppic-precomp-local script will run all those jobs in sequence and transfer the necessary files to Merlin.

Hint: the best way to do it is to use the “eppic-precomp-local” script. If there is a problem in the middle, comment out the finished step and rerun “eppic-precomp-local” 

At the end of this local computing part, a folder named “eppic-precomp-yyyy-mm-dd” will be created and transferred to the home folder of Merlin (if you have passwd-free access to merlin, otherwise one has to manually transfer the folder to merlin). 

Part 2: cluster computing 

This part consists of two steps:

step 1: create blast cache files
Simply edit the paths in the eppic-precomp-merlin script and run it. This will generate a qsub script to create blast cache files.

step 2: EPPIC pre computation:
1. rsync the local pdb database to latest one
2. set up the right UniProt version in the eppic.conf file
3. go to eppic_precomp folder inside eppic-precomp-yyyy-mm-dd and run ./configure_jobs 0

Note: 0 indicates initial run
It will generate folders named input, output, qsubscripts

4. go to qsubscripts and submit the jobs one by one

5. once the first chunk is finished you can run ./configure_jobs 1 1
first number indicates the rerun count
second number indicates the chunk number

this will generate the new files for the missing entries 

similarly for the second chunk ./configure_jobs 1 2

in the rerun the amount of RAM is increased to 16G

after the first rerun if still some files are missing then carry out a second rerun ./configure_jobs 2 1
In this case one has to manually edit the RAM value in qsubscripts/eppic_chunk1run_2.sh

About

EPPIC pre-computation pipeline

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Shell 54.9%
  • Python 45.1%