String-Search-Related-Bioinformatics-Programs

Programs in Python and BASH for homology comparison and other bioinformatics tasks that involve string searches. Formats used: FASTA, NEXUS, and Clustal aln. The Python code requires the Biopython package.

Note: almost all of the programs depend on Pattern.py and Biopython.

1. download_data.py: Currently prompts for a list of GI numbers and returns a file with the full NCBI information for each protein. Under construction: returning the nucleotide sequence for each protein sequence that has an associated nucleotide sequence.
2. getNames.py: Prompts for a library and outputs a list of organism names (a name is defined as the text enclosed by []).
3. list_gi.sh: Prompts for a library and an output file name, then writes the GI numbers of all records in the library to the output file. Note: this is a BASH script, so to my knowledge it only works on UNIX-like systems. Also make sure to make it executable with chmod +x program_name.
4. makeLibrary_oo.py: Creates a library from 2 hard-coded file dependencies: a FASTA file of subunit sequences (in this case CnaBs), and another file giving the location of the motif in each sequence of the FASTA file (in this case the ebox).
5. makeSpeciesList.sh: Creates a list of species names from a library. Under the hood, it cleans up the output from getNames.py. Note: see #3 for the note about .sh files.
6. makeTree.py: Creates a tree with Biopython. Under construction.
7. makeprimer.py: Given a FASTA file with 1 protein sequence followed by its DNA sequence (forward, 5'-3') and the defined starts of the forward and reverse primers, outputs a set of potential primers with their AT content and estimated melting temperature.
8. trimSeq.py: Used to parse proteins out of large chunks of DNA sequence. Currently it can use start and stop codons as delimiters.
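
Several of the programs above come down to short string searches. The following is a minimal sketch of three of them: pulling bracketed organism names from FASTA headers (as getNames.py is described), computing AT content and a rough melting temperature for a primer (as the primer program is described), and delimiting a sequence by start and stop codons (as trimSeq.py is described). It is not the code from this repository. All function names are hypothetical, only the Python standard library is used instead of Pattern.py or Biopython, and the Wallace rule is an assumed formula for the melting temperature that may differ from what the repository actually uses.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}


def organism_names(fasta_text):
    """Return sorted unique names found inside [] on FASTA header lines."""
    names = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            names.update(re.findall(r"\[([^\[\]]+)\]", line))
    return sorted(names)


def at_content(primer):
    """Fraction of A and T bases in a primer (0.0 for an empty string)."""
    primer = primer.upper()
    if not primer:
        return 0.0
    return (primer.count("A") + primer.count("T")) / len(primer)


def wallace_tm(primer):
    """Rough melting temperature in deg C, assuming the Wallace rule:
    2 * (A + T) + 4 * (G + C). Only reasonable for short oligos."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def first_orf(dna):
    """Return the first stretch that begins with ATG and ends at the first
    in-frame stop codon (stop included), or None if there is none."""
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon in the reading frame set by this ATG.
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        start = dna.find("ATG", start + 1)
    return None
```

For example, `organism_names` applied to a header such as `>gi|123| collagen adhesin [Staphylococcus aureus]` yields the name `Staphylococcus aureus`, and `first_orf("CCATGAAATAGGG")` returns `ATGAAATAG`. The sketch reads only the forward strand; handling the reverse complement is left out to keep it short.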

kpet123/String-Search-Related-Bioinformatics-Programs

Folders and files

Latest commit

History

Repository files navigation

String-Search-Related-Bioinformatics-Programs

About

Resources

Stars

Watchers

Forks

Languages