Given a GTF annotation and a corresponding FASTA file, writes one file containing the processed transcripts and another file containing a window of size 2L-2 around each splice junction of each transcript, where L is the read length.
./isolate_transcripts.py annotation.gtf scaffolds.fasta output_directory [-L LENGTH]
annotation.gtf
Contains annotations for exons, transcripts, and genes.
scaffolds.fasta
FASTA file that corresponds to annotation.gtf.
output_directory
The directory in which the results are to be written.
[-L LENGTH]
Optional. LENGTH is the read length (default 98).
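To make the 2L-2 window size concrete, here is a minimal sketch of how such a window could be cut around one splice junction. It assumes the window is symmetric (L-1 bases on each side of the junction) and that the flanks come only from the two adjacent exons; both are assumptions for illustration, and the function name junction_window is hypothetical rather than taken from isolate_transcripts.py.

```python
def junction_window(upstream_exon: str, downstream_exon: str, read_length: int = 98) -> str:
    """Return up to read_length - 1 bases on each side of a splice junction.

    Assumption (not confirmed by the script): the window is symmetric, so a
    read of length L overlapping the junction by at least one base fits in it.
    """
    if read_length < 2:
        raise ValueError("read_length must be at least 2")
    flank = read_length - 1
    left = upstream_exon[-flank:]     # last L-1 bases before the junction
    right = downstream_exon[:flank]   # first L-1 bases after the junction
    return left + right


# With the default read length of 98, a full window spans 2*98 - 2 = 194 bases.
window = junction_window("A" * 200, "C" * 200)
print(len(window))
```

If an exon is shorter than L-1 bases, this sketch simply returns a shorter window; the actual script may handle that case differently.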
To generate a list of transcripts to capture using bustools capture, you can use the following sed command on the FASTA files generated by isolate_transcripts.py:
sed -n 's/^>//p' foo.fasta > foo.to_capture.fasta
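For readers who would rather stay in Python, the sketch below does roughly what the sed command does: it keeps only the header lines and strips the leading ">". The function names fasta_record_names and write_capture_list are hypothetical helpers for illustration, not part of isolate_transcripts.py or bustools.

```python
def fasta_record_names(lines):
    """Yield each FASTA header with its leading '>' removed.

    Mirrors sed -n 's/^>//p': the rest of the header line is kept as-is,
    and sequence lines are skipped.
    """
    for line in lines:
        if line.startswith(">"):
            yield line[1:].rstrip("\n")


def write_capture_list(fasta_path, out_path):
    """Write one record name per line from fasta_path to out_path."""
    with open(fasta_path) as src, open(out_path, "w") as dst:
        for name in fasta_record_names(src):
            dst.write(name + "\n")
```

For example, write_capture_list("foo.fasta", "foo.to_capture.fasta") would produce the same kind of list as the command above, assuming foo.fasta exists in the current directory.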