Chanjo

Chanjo is coverage analysis for clinical sequencing. It's implemented in Python with a command line interface that adheres to the UNIX pipeline philosophy.

Installation

Chanjo is distributed through "pip". Install the latest release by running:

$ pip install chanjo

... or locally for development:

$ git clone https://github.com/robinandeer/chanjo.git && cd chanjo
$ pip install --editable .

Do note that Chanjo is built on some kind-of tricky dependencies. If you are experiencing any issues, help is just a click away in the documentation.

Usage

Chanjo exposes a composable command line interface. You can always save intermediary files at any stage and customize every option. However, using a chanjo.toml config and UNIX pipes you can end up with something like:

$ chanjo convert CCDS.sorted.txt | chanjo annotate alignment.bam > coverage.bed

Chanjo Report

A shameless plug for a neat little Chanjo plugin: Chanjo-Report. It allows you to extract metrics from Chanjo databases and generate coverage reports as either HTML or PDF.

After you install it using pip install chanjo-report, you will notice a new subcommand under the Chanjo CLI.

$ chanjo report
#sample_id	group_id	cutoff	avg. coverage	avg. completeness	diagnostic yield	gender
vavaweho	group1	10	155.64825142540616	0.9829187630212934	0.8941083089800483	female
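
Since the report is plain tab-separated text, it is straightforward to consume in downstream scripts. Here is a minimal Python sketch, assuming the output has exactly the header and columns shown above. The `parse_report` helper is hypothetical and not part of Chanjo or Chanjo-Report.

```python
import csv
import io


def parse_report(text):
    """Parse tab-separated report output into a list of dicts (hypothetical helper)."""
    lines = text.strip().splitlines()
    # The header line starts with '#'; strip it to get clean column names.
    header = lines[0].lstrip("#").split("\t")
    reader = csv.DictReader(
        io.StringIO("\n".join(lines[1:])), fieldnames=header, delimiter="\t"
    )
    rows = []
    for row in reader:
        # Convert the numeric columns; identifiers and gender stay as strings.
        row["cutoff"] = int(row["cutoff"])
        for key in ("avg. coverage", "avg. completeness", "diagnostic yield"):
            row[key] = float(row[key])
        rows.append(row)
    return rows


# The sample row from the report output shown above.
SAMPLE = (
    "#sample_id\tgroup_id\tcutoff\tavg. coverage\tavg. completeness"
    "\tdiagnostic yield\tgender\n"
    "vavaweho\tgroup1\t10\t155.64825142540616\t0.9829187630212934"
    "\t0.8941083089800483\tfemale\n"
)

for record in parse_report(SAMPLE):
    print(record["sample_id"], record["avg. coverage"])
```

In practice you would feed the function the captured standard output of the subcommand instead of the inline `SAMPLE` string.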

Documentation

Read the Docs is hosting the official documentation.

I can specifically recommend the fully interactive demo, complete with sample data to get you started right away.

If you are looking to learn more about handling sequence coverage data in clinical sequencing, feel free to download and skim through my own Master's thesis and article references.

Features

What Chanjo does

Chanjo works on BAM alignment files and extracts interesting coverage-related statistics. You use a BED-file to define which regions of the genome you particularly care about. The output takes the shape of an extended BED-file.
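
To make "coverage-related statistics" a bit more concrete, the sketch below computes two metrics for a single interval: mean coverage and completeness. It is not Chanjo's implementation. The `interval_stats` name is hypothetical, read depths are passed in as a plain list rather than read from a BAM file, and completeness is assumed to mean the fraction of bases covered by at least `cutoff` reads.

```python
def interval_stats(depths, cutoff=10):
    """Return (mean coverage, completeness) for one interval's per-base read depths.

    Assumption: completeness is the share of bases with depth >= cutoff.
    """
    if not depths:
        raise ValueError("interval must contain at least one base")
    mean_coverage = sum(depths) / len(depths)
    # Count the bases that reach the cutoff, then normalise by interval length.
    passed = sum(1 for depth in depths if depth >= cutoff)
    completeness = passed / len(depths)
    return mean_coverage, completeness


# Hypothetical per-base depths for a 5-base interval.
mean_cov, completeness = interval_stats([12, 15, 9, 20, 4], cutoff=10)
print(mean_cov, completeness)
```

Metrics like these could then be appended as extra columns to each BED interval, which is roughly what "extended BED-file" suggests.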

An optional final step is to load data into a SQL database. This will aggregate data from exons to transcripts and genes. The database will later work as an API to downstream tools like the Chanjo Coverage Report generator.
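
The aggregation step can be pictured with a small in-memory SQLite example. The table layout, the column names, the gene identifiers and the length-weighted average are all illustrative assumptions. They do not reflect Chanjo's actual schema or queries, and the sketch only rolls exons up to genes, skipping the transcript level.

```python
import sqlite3


def gene_coverage(exons):
    """Aggregate exon-level mean coverage to genes, weighting exons by length.

    `exons` is an iterable of (gene_id, start_pos, end_pos, coverage) tuples
    with half-open coordinates, as in BED. Hypothetical schema.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE exon ("
            "gene_id TEXT, start_pos INTEGER, end_pos INTEGER, coverage REAL)"
        )
        conn.executemany("INSERT INTO exon VALUES (?, ?, ?, ?)", exons)
        # Longer exons contribute proportionally more to the gene-level mean.
        rows = conn.execute(
            "SELECT gene_id, "
            "SUM(coverage * (end_pos - start_pos)) / SUM(end_pos - start_pos) "
            "FROM exon GROUP BY gene_id ORDER BY gene_id"
        ).fetchall()
        return dict(rows)
    finally:
        conn.close()


# Made-up exon records for two made-up genes.
EXONS = [
    ("GENE1", 0, 100, 30.0),
    ("GENE1", 200, 300, 10.0),
    ("GENE2", 0, 50, 40.0),
]
print(gene_coverage(EXONS))
```

Keeping the aggregated values in a database means downstream tools can query them without re-reading the BAM or BED files.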

What Chanjo doesn't

Chanjo is not the right choice if you care about coverage for every base across the entire genome. Detailed histograms are something BEDTools already handles with confidence.

License

MIT. See the LICENSE file for more details.

Contributing

Anyone can help make this project better - read CONTRIBUTION to get started!
