rapvis: a tool for RNAseq processing and visualization

Dependencies

Required python version:

python >= 3.6

rapvis depends on several external programs:

trimmomatic
STAR
hisat2
stringtie
bwa
samtools
featureCounts
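
Before running the pipeline, it can help to confirm that these programs are reachable on PATH. The following is a minimal sketch and is not part of rapvis. It assumes the executables are installed under exactly the names listed above (some installations provide trimmomatic only as a jar file, in which case this check would report it as missing). The function name find_missing is hypothetical.

```python
import shutil

# Executable names are assumed to match the list above; adjust them to your installation.
EXTERNAL_TOOLS = [
    "trimmomatic",
    "STAR",
    "hisat2",
    "stringtie",
    "bwa",
    "samtools",
    "featureCounts",
]


def find_missing(tools):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing(EXTERNAL_TOOLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All external tools were found.")
```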

Mandatory

pandas >= 1.1.2
numpy
matplotlib
seaborn
GSEApy
rpy2

Installation

Installing from github

# Clone remote repository
$ git clone https://github.com/liuwell/rapvis.git
  
# Install the required Python packages
$ cd rapvis
$ pip install -r requirements.txt
  
# Add rapvis to the executable search path
# Replace current_dir with the current directory, which the shell command "pwd" prints
$ echo 'export PATH=$PATH:current_dir/rapvis' >> ~/.bashrc
$ source ~/.bashrc

# Run with the -h option to check whether the installation succeeded.
# Output like the following means rapvis is installed correctly
$ rapvis_run.py -h

usage: rapvis_run.py [-h] -i INPUT [-o OUTPUT] [-p THREADS] [-lib path]
                     [-m {STAR,hisat2}] [-a ADAPTER] [-minlen N] [-trim5 N]
                     [--counts] [--rRNA] [-v]

A tool for RNAseq processing and visualization

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        the input data
  -o OUTPUT, --output OUTPUT
                        output directory (default: processed_data)
  -p THREADS, --threads THREADS
                        number of threads (CPUs) to use (default: 5)
  -lib path, --libraryPath path
                        choose reference species for mapping and annotaion
  -m {STAR,hisat2}, --mapper {STAR,hisat2}
                        choose the mapping program (default: STAR)
  -a ADAPTER, --adapter ADAPTER
                        choose illumina adaptor (default: universal), choices
                        {universal, nextera, pAAAAA}
  -minlen N             discard reads shorter than N (default: 35)
  -trim5 N              remove N bases from the begining of each read
                        (default:0)
  --counts              Get gene counts
  --rRNA                whether mapping to rRNA(Human)
  -v, --version         show program's version number and exit

Build genome index

You can download the genome sequence and the annotation GTF file from GENCODE, which is strongly recommended for mouse and human (use the files marked with PRI): https://www.gencodegenes.org/.

For other species, the files can be downloaded from ENSEMBL. For example, for zebrafish:
genome sequences: ftp://ftp.ensembl.org/pub/release-101/fasta/danio_rerio/dna/Danio_rerio.GRCz11.dna.primary_assembly.fa.gz
GTF file: ftp://ftp.ensembl.org/pub/release-101/gtf/danio_rerio/Danio_rerio.GRCz11.101.gtf.gz

rapvis supports STAR and hisat2 for mapping.

1. build STAR index

$ rapvis_build.py -mapper STAR -genome GRCh38.primary_assembly.genome.fa.gz -gtf gencode.v35.primary_assembly.annotation.gtf.gz

2. build hisat2 index

$ rapvis_build.py -mapper hisat2 -genome GRCh38.primary_assembly.genome.fa.gz -gtf gencode.v35.primary_assembly.annotation.gtf.gz

Usage

1. Run locally

$ rapvis_run.py -i tests/data1/ -o TestsResult -p 5 -lib STAR_index -m STAR

2. Submit the tasks to a cluster

$ rapvis_submit.py -i tests/data1/ -o TestsResult -lib STAR_index -m STAR -p 5 -t 2

3. Calculate differentially expressed genes

rapvis can calculate differentially expressed genes, based on the R package limma:

$ rapvis_DE.py -i input_TPM.txt -wt 0:3 -ko 3:6 -p output

Gene ontology enrichment analysis can be performed with the -go option; the -s option is also needed to specify the species:

$ rapvis_DE.py -i input_TPM.txt -wt 0:3 -ko 3:6 -p output -go -s Human

If the input gene matrix has not been normalized, use the -norm option to normalize it; this is based on limma voom:

$ rapvis_DE.py -i input_counts.txt -wt 0:3 -ko 3:6 -p output -norm
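
The -wt and -ko values in the commands above look like column ranges. As an illustration only, the sketch below assumes they are zero-based ranges of the form start:end with the end excluded, applied to the sample columns in the same way as Python slices. This interpretation is an assumption and is not confirmed by the rapvis help text; the helper parse_range and the sample names are hypothetical.

```python
def parse_range(spec):
    """Turn a 'start:end' string into a slice (assumed zero-based, end excluded)."""
    start, end = (int(part) for part in spec.split(":"))
    if start < 0 or end <= start:
        raise ValueError(f"invalid range: {spec!r}")
    return slice(start, end)


# Hypothetical sample columns of an expression matrix (gene ID column excluded).
samples = ["wt_1", "wt_2", "wt_3", "ko_1", "ko_2", "ko_3"]

# Under the stated assumption, 0:3 selects the first three samples
# and 3:6 selects the last three.
wt_samples = samples[parse_range("0:3")]
ko_samples = samples[parse_range("3:6")]

print("wild type:", wt_samples)
print("knockout: ", ko_samples)
```

If this reading is right, the two groups in the example commands do not overlap, and each contains three replicates.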

4. Correlation coefficient between samples

The following command produces a heatmap of the correlation coefficients of gene expression between samples:

$ rapvis_corr.py -i input_gene_TPM.txt
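
To show what a sample-to-sample correlation matrix contains, here is a minimal sketch in plain Python. It assumes the Pearson coefficient; which coefficient rapvis_corr.py actually uses is not stated here. The expression values are made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up TPM values for three samples across four genes.
expression = {
    "sample_A": [10.0, 20.0, 30.0, 40.0],
    "sample_B": [12.0, 19.0, 33.0, 41.0],
    "sample_C": [40.0, 30.0, 20.0, 10.0],
}

# Pairwise correlation matrix; rows and columns follow the order of `names`.
names = list(expression)
matrix = [[pearson(expression[a], expression[b]) for b in names] for a in names]

for name, row in zip(names, matrix):
    print(name, " ".join(f"{value:6.3f}" for value in row))
```

A heatmap is then a color-coded view of this matrix. The sketch does not handle a sample whose values are all identical, because its variance is zero.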

Output

The output directory includes several files:

merge_gene_TPM.txt
the gene expression profiles for all samples, normalized by TPM
merge_qc_percent.pdf
a barplot of quality control details from trimmomatic
merge_mapping_percent.pdf
a barplot of the mapping details in each sample
merge_gene_TPM_species_type.pdf
a summary of the detected gene species in each sample, grouped by gene type
merge_gene_TPM_species_EI.pdf
a summary of the detected gene species in each sample, grouped by expression interval
merge_gene_TPM_density.pdf
a density plot for gene expression distribution
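
For readers unfamiliar with TPM (transcripts per million), the sketch below applies the standard definition to made-up numbers. It only illustrates the unit used in merge_gene_TPM.txt; how rapvis obtains its TPM values internally is not described here, and the function name counts_to_tpm is hypothetical.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts to TPM, given gene lengths in base pairs."""
    # Reads per kilobase: divide each count by the gene length in kilobases.
    rpk = [count / (length / 1000.0) for count, length in zip(counts, lengths_bp)]
    # Scale so that the values of one sample sum to one million.
    scale = sum(rpk) / 1_000_000.0
    return [value / scale for value in rpk]


# Made-up counts and gene lengths for three genes in one sample.
counts = [100, 200, 300]
lengths_bp = [1000, 2000, 3000]

tpm = counts_to_tpm(counts, lengths_bp)
print([round(value, 2) for value in tpm])
```

In this example the three genes receive the same TPM, because each has the same number of reads per kilobase, even though their raw counts differ.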
