computation of some statistics used to evaluate a genome assembly.

fw1121/Evaluating-genome-assembly

 
 


Evaluating-genome-assembly

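Computes statistics commonly used to evaluate a genome assembly: N50, L50, the largest, smallest, and mean contig lengths, and genome coverage, plus a histogram and a box plot of the contig lengths.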

Dependencies: numpy, matplotlib

Usage:

    import sys
    from matplotlib import pyplot
    from stats import AssemblyStatistics
    
    # path to the input contig file (FASTA format), taken from the command line
    inputFile = sys.argv[1]
    
    
    out = AssemblyStatistics(inputFile)
    
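    # L50: conventionally, the smallest number of contigs whose combined
    # length reaches at least half of the total assembly length
    l50 = out.L50()
    
    # N50: conventionally, the length of the shortest contig in that set
    n50 = out.N50()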
    
    # size of the largest contig
    largestContig = out.maxContigLength()
    
    # size of the smallest contig
    smallestContig = out.minContigLength()
    
    # mean contig size 
    meanContig = out.meanContigLength()
    
    # genome coverage
    coverage = out.assemblyCoverage()
    
    # histogram of the contig lengths
    out.histogramOfContigLengths()
    
    # box plot of contig lengths
    out.boxplot()
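
For readers unfamiliar with the two headline statistics, here is a minimal standalone sketch of how N50 and L50 are conventionally computed from contig lengths. It is not the code in `stats.py`, and `AssemblyStatistics` may handle edge cases differently. The function names `contig_lengths` and `n50_l50` are hypothetical and exist only for this illustration.

```python
from typing import Iterable, List, Tuple


def contig_lengths(fasta_lines: Iterable[str]) -> List[int]:
    """Return the length of each sequence found in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read; None before the first header
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A header line closes the previous record and opens a new one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50_l50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) using the conventional definitions.

    Contigs are sorted from longest to shortest and accumulated until the
    running sum reaches at least half of the total length. L50 is the number
    of contigs accumulated; N50 is the length of the last contig added.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig is required")
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if 2 * running >= total:  # integer comparison avoids float rounding
            return length, count
    raise AssertionError("unreachable: the full sum always reaches half the total")
```

To try it on a file, pass an open file handle, for example `n50_l50(contig_lengths(open(path)))`, where `path` is the FASTA file to evaluate.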
