GitHub - Rinoahu/pangenome: pangenome methods for large-scale sequencing data

Introduction

This is a graph-based method for finding the frequency of conserved regions across species. It consists of the following steps:

1. Build a de Bruijn graph (dBG).
2. Build the reduced dBG (rdBG) by removing nodes with indegree <= 1 and outdegree <= 1 from the dBG.
3. Convert each sequence to a compressed path according to step 2.
4. Remove weak edges in the rdBG and index the connected components in the rdBG.
5. Label each sequence again.
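
Steps 1 to 3 can be illustrated with a small, pure-Python sketch. This is not the code in this repository: the function names `build_dbg`, `reduced_nodes`, and `compress` are hypothetical, nodes are taken to be k-mers with an edge between consecutive k-mers of a sequence, reverse complements are ignored, and step 2 is read as "keep only nodes whose indegree or outdegree is greater than 1". Weak-edge removal and MCL clustering (step 4) are not shown.

```python
def build_dbg(sequences, k):
    """Build successor/predecessor sets for every k-mer (assumed node definition)."""
    succ, pred = {}, {}
    for seq in sequences:
        kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
        for kmer in kmers:
            # Register every k-mer as a node, even if it has no edges.
            succ.setdefault(kmer, set())
            pred.setdefault(kmer, set())
        for a, b in zip(kmers, kmers[1:]):
            # Consecutive k-mers overlap by k-1 bases and are joined by an edge.
            succ[a].add(b)
            pred[b].add(a)
    return succ, pred


def reduced_nodes(succ, pred):
    """Keep nodes with indegree > 1 or outdegree > 1 (assumed reading of step 2)."""
    return {n for n in succ if len(succ[n]) > 1 or len(pred[n]) > 1}


def compress(seq, k, kept):
    """Rewrite a sequence as the ordered (position, k-mer) pairs that survive reduction."""
    return [(i, seq[i:i + k])
            for i in range(len(seq) - k + 1)
            if seq[i:i + k] in kept]


if __name__ == "__main__":
    seqs = ["ACGTA", "ACGCA"]  # toy input, not real data
    succ, pred = build_dbg(seqs, k=3)
    kept = reduced_nodes(succ, pred)
    print(kept)                        # {'ACG'}: the only branching k-mer
    print(compress(seqs[0], 3, kept))  # [(0, 'ACG')]
```

In this toy input the two sequences share the k-mer `ACG` and then diverge, so `ACG` is the only node with outdegree 2 and the only one kept in the reduced graph.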

Requirement

Make sure that you have the following installed:

Python (3.7 or greater) with the numpy (1.17 or greater), scipy (1.4 or greater), and numba (0.48 or greater) packages. We strongly recommend using Anaconda, which has all the required packages installed.
MCL, a Markov Clustering algorithm.

Download

$ git clone https://github.com/Rinoahu/pangenome

Usage

$ python pangenome/kmer_pypy.py -m -i input.fasta -k 27 > result.tab


-i: genome sequences in FASTA format.

-k: the k-mer size. The k-mer size is currently limited to 27; we will remove this limitation in the future.

Result

The result is a tab-separated file.

The 1st column is the sequence identifier.

Columns 2-4 are the start, end, and strand of the conserved region.

The 5th column is the index of the conserved region.

For example:
Chr1       0       3250    +       340
Chr1       3250    6851    +       41
Chr1       6851    7420    +       18344
Chr1       7420    7661    +       25920
Chr1       7661    7811    +       36243
Chr1       7811    8015    +       15344
Chr1       8015    8071    +       16029
Chr1       8071    8105    +       35682
Chr1       8105    9779    +       49500
Chr1       9779    9806    +       7184
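
A result file in this layout can be read with a few lines of Python. The sketch below is an illustration, not part of this repository: the names `Region`, `parse_result`, and `sequences_per_region` are hypothetical, and it assumes each data line has exactly five whitespace-separated fields as described above. The example rows suggest that coordinates are 0-based with an exclusive end, but that is an inference from the sample, not something the README states.

```python
from collections import defaultdict
from typing import NamedTuple


class Region(NamedTuple):
    seq_id: str   # column 1: sequence identifier
    start: int    # column 2: start of the conserved region
    end: int      # column 3: end of the conserved region
    strand: str   # column 4: '+' or '-'
    index: int    # column 5: index of the conserved region


def parse_result(lines):
    """Yield one Region per well-formed line; skip blank or malformed lines."""
    for line in lines:
        fields = line.split()
        if len(fields) != 5:
            continue
        seq_id, start, end, strand, index = fields
        yield Region(seq_id, int(start), int(end), strand, int(index))


def sequences_per_region(regions):
    """Count how many distinct sequences contain each conserved-region index."""
    seen = defaultdict(set)
    for region in regions:
        seen[region.index].add(region.seq_id)
    return {index: len(ids) for index, ids in seen.items()}


if __name__ == "__main__":
    with open("result.tab") as handle:
        counts = sequences_per_region(parse_result(handle))
    for index, n in sorted(counts.items()):
        print(index, n, sep="\t")
```

Counting distinct sequence identifiers per region index is one simple way to see how widely a conserved region is shared across the input sequences.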

Citation

To cite our work, please refer to:

xxx
