yanwu2014/genotyping-matrices

genotyping-matrices

Scripts and Python modules for genotyping cells from perturbation screens followed by scRNA-seq. Currently compatible with the 10X Genomics Chromium V2 chemistry.

Required input:

  • Genotype barcodes amplified from 10X Genomics scRNA-seq cDNA: Read1 contains the cell/molecule barcode and Read2 contains the genotype barcode. Read2 must be in BAM format. You can convert the FASTQ to an unaligned BAM with Picard FastqToSam, or align Read2 to the known genotype scaffold as an initial filtering step.
  • Cell barcodes file generated by 10X Genomics Cell Ranger as part of the sparse matrix output (should be in outs/GRCh38/barcodes.tsv)
  • Text file mapping barcodes to genotypes. These genotypes can be genes, CRISPR gRNAs, or any other genotype (mutations, etc.)
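
As an illustration of the second input, here is a minimal sketch of reading a Cell Ranger barcodes.tsv into a set of cell barcodes. It assumes one barcode per line with an optional "-1" style suffix, which the sketch strips. The function name load_cell_barcodes is hypothetical and is not part of this repository; the scripts here may handle the suffix differently.

```python
def load_cell_barcodes(path):
    """Read a barcodes.tsv file into a set of cell barcodes.

    Assumption: one barcode per line, optionally followed by a
    "-<number>" suffix (for example "-1"), which is dropped here.
    """
    barcodes = set()
    with open(path) as handle:
        for line in handle:
            barcode = line.strip()
            if barcode:
                # Keep only the nucleotide sequence before any "-1" suffix.
                barcodes.add(barcode.split("-")[0])
    return barcodes
```

The resulting set can be used to discard genotype reads whose cell barcode was not called as a cell by Cell Ranger.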

Usage (replace the bracketed placeholders with your own file names and values):

  1. Tag Read2 genotype reads with cell and molecule barcodes using convert_gRNA.py:
     python convert_gRNA.py [read1_cell_molecule_barcodes].fastq [read2_genotype_barcodes].bam
  2. Parse the cell barcodes with parseCellBarcodes.py:
     python parseCellBarcodes.py [/path_to_10x_sparse_matrix]/barcodes.tsv [read2_genotype_barcodes].tagged.bam [barcode_length (integer)] [barcode_upstream_sequence] [barcode_downstream_sequence]
  3. Plot the UMI fraction vs. read fraction distribution to choose cutoffs for filtering chimeric reads:
     python [read2_genotype_barcodes]_cell_barcodes.pickle [min_reads_per_umi]
  4. Generate the genotype-to-cell dictionaries with genotypeCells.py. min_umi_fraction and min_read_fraction are the cutoffs for the fraction of total UMIs and the fraction of total reads, respectively, that a genotype barcode must reach to be included in the genotypes dictionary. These cutoffs filter out chimeric reads. A good range for the ORF overexpression barcodes tends to be 0.1 to 0.25. Setting the cutoffs too high will eliminate dual and triple perturbations.
     python genotypeCells.py [genotype_dictionary].csv [read2_genotype_barcodes]_cell_barcodes.pickle [min_umi_fraction] [min_read_fraction] [min_reads_per_umi]
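
To make the parameters in the steps above concrete, here is a minimal sketch of the two ideas they describe: pulling a fixed-length barcode out of a read using its flanking sequences, and keeping only the genotype barcodes that pass the fraction cutoffs. This is not the code in parseCellBarcodes.py or genotypeCells.py. The function names, the per-cell input layout, and the choice to compute fractions within a single cell after dropping UMIs with fewer than min_reads_per_umi reads are all assumptions made for illustration.

```python
def extract_barcode(read_seq, upstream, downstream, barcode_length):
    """Return the barcode flanked by upstream/downstream, or None.

    Assumption: exact matching of both flanking sequences.
    """
    start = read_seq.find(upstream)
    if start == -1:
        return None
    start += len(upstream)
    end = start + barcode_length
    barcode = read_seq[start:end]
    if len(barcode) != barcode_length:
        return None
    # The downstream sequence must begin right after the barcode.
    if not read_seq.startswith(downstream, end):
        return None
    return barcode


def call_genotypes(umi_reads, min_umi_fraction, min_read_fraction,
                   min_reads_per_umi):
    """Return the genotype barcodes of one cell that pass both cutoffs.

    umi_reads is a hypothetical layout for a single cell:
    {genotype_barcode: {umi: read_count}}.
    """
    umis, reads = {}, {}
    for barcode, per_umi in umi_reads.items():
        # Ignore UMIs supported by too few reads.
        kept = [n for n in per_umi.values() if n >= min_reads_per_umi]
        if kept:
            umis[barcode] = len(kept)
            reads[barcode] = sum(kept)
    total_umis = sum(umis.values())
    total_reads = sum(reads.values())
    return sorted(
        b for b in umis
        if umis[b] / total_umis >= min_umi_fraction
        and reads[b] / total_reads >= min_read_fraction
    )


# Hypothetical counts for one cell.
cell = {
    "BC_A": {"u1": 5, "u2": 4, "u3": 6},
    "BC_B": {"u4": 3, "u5": 1},
    "BC_C": {"u6": 1},
}
strict = call_genotypes(cell, 0.2, 0.2, 2)
lenient = call_genotypes(cell, 0.1, 0.1, 2)
```

In this toy cell, BC_B holds 25% of the retained UMIs but only about 17% of the retained reads, so it passes cutoffs of 0.1 and fails cutoffs of 0.2. This is the trade-off described in step 4: higher cutoffs remove more chimeric signal but can also remove the weaker barcode of a genuine dual perturbation.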
