This repository hosts the source code and instructions to reproduce the analysis and results from our study on low-mappability regions and copy number variation.
If you just want the CNV catalog, you can find it in the FigShare repository, or directly here.
To review the code and the resulting graphs/numbers, have a look at the R-markdown reports in the reports folder and the scripts in the src folder.
To rerun the analysis, follow these steps:
The necessary data has been deposited on FigShare. Depending on the analysis, you might not need to download all the data.
Still, the easiest way is to download all the data and unzip it in the data folder.
mkdir -p data
cd data
wget 'https://ndownloader.figshare.com/articles/2007630?private_link=8fd3007ebb0fbad09b6d' -O figshare.zip
unzip figshare.zip
tar -xzvf PopSV-NA12878-lowmap.tar.gz
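If the download is interrupted or has to be repeated, the steps above can be made a little more defensive. The following is only a sketch: `fetch_once` and `safe_unzip` are hypothetical helper names that are not part of this repository, and it assumes `wget` and `unzip` are installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository):
# download a URL to a file unless a non-empty copy already exists.
fetch_once() {
    url=$1
    dest=$2
    if [ -s "$dest" ]; then
        echo "skip: $dest already present"
        return 0
    fi
    wget "$url" -O "$dest"
}

# Hypothetical helper (not part of the repository):
# extract a zip archive only if it passes unzip's integrity test.
safe_unzip() {
    archive=$1
    if unzip -tq "$archive" >/dev/null 2>&1; then
        unzip -o "$archive"
    else
        echo "error: $archive is missing or corrupted" >&2
        return 1
    fi
}

# Usage, from the repository root:
#   mkdir -p data && cd data
#   fetch_once 'https://ndownloader.figshare.com/articles/2007630?private_link=8fd3007ebb0fbad09b6d' figshare.zip
#   safe_unzip figshare.zip
```

The URL is quoted in the usage lines because it contains a `?`, which some shells would otherwise treat as a wildcard.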
Many different packages are used throughout the analysis. The commands to install them are written in installDependencies.R.
To install all the necessary packages, open R and run source("installDependencies.R").
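On a machine without an interactive R session (a Docker build, for instance), the same script can probably be run from the shell. This is a sketch under assumptions: `install_deps` is a hypothetical wrapper, `Rscript` must be on the `PATH`, and the script must not rely on interactive prompts. In particular, `install.packages` fails in a non-interactive session if no CRAN mirror has been set.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of the repository):
# run the dependency script non-interactively with Rscript.
install_deps() {
    script=${1:-installDependencies.R}
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "error: Rscript not found; install R first" >&2
        return 1
    fi
    if [ ! -f "$script" ]; then
        echo "error: $script not found; run this from the folder that contains it" >&2
        return 1
    fi
    # Assumption: running the file with Rscript is equivalent to source() in R
    Rscript "$script"
}

# Usage:
#   install_deps                          # uses ./installDependencies.R
#   install_deps path/to/installDependencies.R
```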
The raw R-markdown reports are located in the src folder. To recompile them, simply run:
library(rmarkdown)
render("XXX.Rmd")
Note that the annotation-download-format.Rmd report must be compiled before any other report because it downloads and formats public annotations.
Alternatively, you can compile all the reports using:
sh compileAll.sh
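To compile the reports by hand while respecting the ordering constraint, something along the following lines may help. It is a minimal sketch and not the content of compileAll.sh: `ordered_reports` and `render_all` are hypothetical names, and it assumes `Rscript` and the rmarkdown package are available.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository):
# print the .Rmd files of a folder, one per line, annotation report first.
ordered_reports() {
    dir=${1:-.}
    first="$dir/annotation-download-format.Rmd"
    if [ -f "$first" ]; then
        echo "$first"
    fi
    for f in "$dir"/*.Rmd; do
        # Skip the literal pattern when the folder has no .Rmd file
        if [ -f "$f" ] && [ "$f" != "$first" ]; then
            echo "$f"
        fi
    done
}

# Hypothetical helper: render each report in that order, stopping at the first failure.
render_all() {
    ordered_reports "${1:-.}" | while IFS= read -r rmd; do
        # The file name is passed as a trailing argument rather than pasted into the R code
        Rscript -e 'rmarkdown::render(commandArgs(trailingOnly = TRUE)[1])' "$rmd" </dev/null || exit 1
    done
}

# Usage, from the repository root:
#   render_all src
```

Stopping at the first failure matters here because the later reports depend on the annotations prepared by the first one.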
The reports already produced by these scripts are available in the reports folder.
The code was tested on a fresh dockerized Ubuntu with R 3.3.1. Windows is not recommended because its support for the parallel package is limited (fork-based functions such as mclapply cannot run in parallel there).