Chanjo is coverage analysis for clinical sequencing. It's implemented in Python with a command line interface that adheres to the UNIX pipeline philosophy.
Hey - exciting things are coming to the new version of Chanjo 😄
The primary change is Sambamba integration. Just run "sambamba depth region" and load the output into Chanjo for further data exploration. Chanjo is now more flexible, accurate, and much easier to install. We have also built in some basic commands to quickly extract statistics from the database right from the command line.
Chanjo is distributed through "pip". Install the latest stable release by running:
$ pip install chanjo
... or locally for development:
$ ansible-galaxy install robinandeer.miniconda
$ git clone https://github.com/robinandeer/chanjo.git && cd chanjo
$ vagrant up
$ vagrant ssh
Chanjo exposes a composable command line interface with a nifty config file implementation.
$ chanjo init --setup
$ chanjo load /path/to/sambamba.output.bed
$ chanjo calculate mean sample1
{"metrics": {"completeness_10": 90.92, "mean_coverage": 193.85}, "sample_id": "sample1"}
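Because the output is JSON, it is easy to pick up in a downstream script. The snippet below is a minimal sketch, assuming each line printed by "chanjo calculate mean" is one JSON object shaped like the example above. The helper name parse_metrics and the 95% completeness cutoff are hypothetical, chosen only for illustration.

```python
import json


def parse_metrics(line):
    """Parse one JSON line and return (sample_id, metrics dict)."""
    record = json.loads(line)
    return record["sample_id"], record["metrics"]


# The example output line shown above, copied verbatim.
line = (
    '{"metrics": {"completeness_10": 90.92, "mean_coverage": 193.85}, '
    '"sample_id": "sample1"}'
)

sample_id, metrics = parse_metrics(line)

# Hypothetical quality gate: require at least 95% completeness at 10x.
MIN_COMPLETENESS = 95.0
passes = metrics["completeness_10"] >= MIN_COMPLETENESS

print(sample_id, metrics["mean_coverage"], "PASS" if passes else "FAIL")
```

In a pipeline, the same function could be applied to every line read from standard input instead of a hard-coded string.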
Read the Docs is hosting the official documentation.
I can specifically recommend the fully interactive demo, complete with sample data to get you started right away.
If you are looking to learn more about handling sequence coverage data in clinical sequencing, feel free to download and skim through my own Master's thesis and article references.
Chanjo leverages Sambamba to annotate coverage and completeness for a general BED-file. The output can then easily be loaded into a SQL database that enables investigation of coverage across regions and samples. The database also works as an API to downstream tools like the Chanjo Coverage Report generator.
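To make the two metrics concrete, here is a small sketch of how mean coverage and completeness can be computed for a single region. It assumes completeness means the percentage of bases in the region covered at or above a threshold (10x in "completeness_10"). This is an illustration of the concept, not Chanjo's or Sambamba's actual implementation; the function name and the depth values are made up.

```python
def summarize_region(depths, threshold=10):
    """Return mean coverage and completeness (%) for per-base read depths."""
    if not depths:
        raise ValueError("region has no bases")
    mean_coverage = sum(depths) / len(depths)
    # Count the bases that reach the required depth.
    covered = sum(1 for depth in depths if depth >= threshold)
    completeness = 100.0 * covered / len(depths)
    return {"mean_coverage": mean_coverage, "completeness": completeness}


# Hypothetical per-base depths for an 8 bp region.
region_depths = [0, 5, 12, 30, 30, 18, 9, 10]
summary = summarize_region(region_depths, threshold=10)
print(summary)
```

Note how the two numbers answer different questions: the mean can look healthy while completeness reveals that several bases fall below the threshold.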
Chanjo is not the right choice if you care about coverage for every base across the entire genome. Detailed histograms are something BEDTools already handles with confidence.
MIT. See the LICENSE file for more details.
Anyone can help make this project better - read CONTRIBUTION to get started!