ancient DNA pipeline.

Processes ancient DNA data generated by next-generation sequencing (NGS).

This pipeline takes a best-practices approach to processing raw FASTQ files and produces a number of useful output files.

Installation

Before running ./install.sh, make sure the INSTALL_DIR variable in that script points to a directory that is on your PATH.
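As a quick sanity check before installing, the sketch below tests whether a directory is listed in PATH. The function name dir_on_path and the fallback value $HOME/bin are illustrative assumptions, not part of the pipeline; use whatever INSTALL_DIR is set to in install.sh.

```shell
#!/usr/bin/env bash
# Return 0 if the given directory appears as an entry in PATH, 1 otherwise.
dir_on_path() {
    local dir="${1%/}"   # ignore a single trailing slash
    case ":${PATH}:" in
        *":${dir}:"*|*":${dir}/:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical default; replace with the value used in install.sh.
INSTALL_DIR="${INSTALL_DIR:-$HOME/bin}"

if dir_on_path "$INSTALL_DIR"; then
    echo "OK: $INSTALL_DIR is on PATH"
else
    echo "WARNING: $INSTALL_DIR is not on PATH"
fi
```

The check is a plain string comparison against PATH entries, so it does not resolve symlinks or relative paths.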

You will also need to change the hardcoded path in rscripts/coverage_script.R so that it points to your copy of rscripts/coverage_plot.R.
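To find the line that needs editing, something like the following may help. It is only a sketch: find_path_refs is a hypothetical helper, and it assumes the hardcoded path mentions the file name coverage_plot.R literally.

```shell
#!/usr/bin/env bash
# Print the line number and text of every line in FILE that mentions NAME.
# -F treats NAME as a fixed string, so the dot in "coverage_plot.R" is literal.
find_path_refs() {
    local file="$1" name="$2"
    grep -n -F -- "$name" "$file"
}

# Usage, from the repository root:
#   find_path_refs rscripts/coverage_script.R coverage_plot.R
```

Each reported line can then be edited by hand to point at your local copy.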

Python

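Python dependencies can be installed by running the command below. The file is listed in the repository as requirments.txt, so adjust the file name if pip cannot find requirements.txt.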

pip install -r requirements.txt

This will install pysam, PyVCF, pyfasta, and Biopython.
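After installation, a short import check can confirm that the packages are visible to your interpreter. This is a sketch rather than part of the pipeline: missing_py_modules and the PYTHON override variable are assumptions introduced here. Note that the import names differ from the package names for PyVCF (vcf) and Biopython (Bio).

```shell
#!/usr/bin/env bash
# Print every module that the chosen interpreter cannot import;
# return 1 if any import fails.
# PYTHON is a hypothetical override variable (defaults to "python").
missing_py_modules() {
    local py="${PYTHON:-python}" rc=0 mod
    for mod in "$@"; do
        if ! "$py" -c "import $mod" >/dev/null 2>&1; then
            echo "$mod"
            rc=1
        fi
    done
    return "$rc"
}

# Usage:
#   missing_py_modules pysam vcf pyfasta Bio
```

An empty result means all four modules were imported successfully.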

mapDamage

Navigate to src/mapDamage/ and follow all installation instructions at http://ginolhac.github.io/mapDamage/.

SeqMagick

Navigate to src/seqmagick/ and run python setup.py install

AdapterRemoval

Navigate to src/AdapterRemoval/ and run the following commands:

tar xvzf AdapterRemoval-1.5.4.tar.gz
make

Then ensure that the executable AdapterRemoval is on your path.

External dependencies.

The Unix tools realpath and zcat.

All of the following executables must be installed and accessible from your PATH.

Muscle (http://www.drive5.com/muscle/)
R programming language (https://www.r-project.org/) with the packages ggplot2, Biostrings, and getopt
bwa (http://bio-bwa.sourceforge.net/)
samtools (http://samtools.github.io/)
bcftools (https://samtools.github.io/bcftools/)
parallel (https://www.gnu.org/software/parallel/)
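One way to confirm that everything is reachable is to loop over the expected executables with command -v. The sketch below is an assumption-laden convenience, not part of the pipeline: missing_tools is a hypothetical helper, and the executable names (for example muscle and Rscript) may differ on your system.

```shell
#!/usr/bin/env bash
# Print each command that cannot be found on PATH; return 1 if any is missing.
missing_tools() {
    local rc=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Executable names are assumptions; adjust them to match your installation.
tools="realpath zcat muscle Rscript bwa samtools bcftools parallel AdapterRemoval"

# $tools is intentionally unquoted so that it splits into separate arguments.
if missing=$(missing_tools $tools); then
    echo "All tools found"
else
    echo "Missing tools:"
    echo "$missing"
fi
```

This only checks that each executable exists on PATH; it does not verify versions or the R packages.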

Example Run of the pipeline.

Navigate to the tests/test_data/ directory and run the following commands:

# Generate pipeline_setup.txt.
create_pipeline_setup.sh . raw
# Run the pipeline for human mtDNA. Replace the path after -r with the path to your reference file.
ancient_pipeline.sh -C "gi|251831106|ref|NC_012920.1|" \
-r ~/Programming/OpenSource/MyGitHub/ancient_dna_pipeline/ref/contamination.fa \
-S "human" -P 1

theboocock/ancient_dna_pipeline
