Python Parsers.genBankToAminoacid Examples

Programming language: Python

Namespace/package name: utils

Class/type: Parsers

Method/function: genBankToAminoacid

Examples at hotexamples.com: 2

Python Parsers.genBankToAminoacid - 2 examples found. These are the top-rated real-world Python examples of utils.Parsers.genBankToAminoacid, extracted from open source projects.
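
The examples on this page only call genBankToAminoacid; its body is not shown. As a rough illustration of what a GenBank-to-amino-acid parser might do, the sketch below pulls the `/translation="..."` qualifiers out of GenBank flat-file text using only the standard library. The function name `genbank_text_to_aminoacid`, its text argument, and the `(list, content)` return shape are assumptions made for illustration (the return shape mirrors how Example #2 unpacks the result); this is not the TOUCAN implementation.

```python
import re

# Matches a /translation="..." qualifier; the value may wrap across lines.
_TRANSLATION = re.compile(r'/translation="([^"]*)"')


def genbank_text_to_aminoacid(text):
    """Hypothetical helper: return (translations, concatenated_content)."""
    translations = []
    for match in _TRANSLATION.finditer(text):
        # Remove the newlines and indentation introduced by line wrapping.
        translations.append(re.sub(r"\s+", "", match.group(1)))
    return translations, "".join(translations)


# A tiny hand-written feature block, not real data.
sample = '''     CDS             1..15
                     /product="demo"
                     /translation="MKV
                     LA"
'''
seqs, content = genbank_text_to_aminoacid(sample)
print(seqs, content)
```

For real GenBank files a dedicated parser such as Biopython's SeqIO is usually a safer choice than a regular expression.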

Frequently used methods

parseDatasetContents(7)

genBankToAminoacid(2)

parseFastaToList(2)

debug_check(1)

genBankToNucleotide(1)

history_check(1)

income_check(1)

new_account_check(1)

normalizeSequence(1)

open_account_check(1)

parseFasta(1)

pay_check(1)

payments_check(1)

sortFasta(1)

withdraw_check(1)

Example #1

    def getDomains(self, sparkContext):

        # recover the species name for using in temp files
        self.species = Utils.getSpecies(self.source_path)
        domainFinder = DomainFinder.DomainFinder()

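        # load source sequences into a single list
        # (named seq_list so the built-in `list` is not shadowed)
        if "fasta" in self.source_type:
            seq_list, file_content = Parsers.parseFastaToList(
                self.source_path, "")
        elif "genbank" in self.source_type:
            # assumption: genBankToAminoacid returns (list, content),
            # matching the unpacking in Example #2, so file_content is
            # defined on this branch too
            seq_list, file_content = Parsers.genBankToAminoacid(
                self.source_path)
        else:
            raise ValueError(
                'Unsupported source type: ' + str(self.source_type))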

        print('Processing domains...')

        # create RDD with source sequences
        sourceRDD = sparkContext.parallelize(file_content, numSlices=2000)

        if ("nucleotide" in self.source_type):
            # execute sixFrame translation for each sequence in RDD
            sourceRDD = sourceRDD.map(lambda x: SixFrameTranslator.main(x))

        # execute Pfam domain prediction for each sixFrame translation in RDD
        domainsRDD = sourceRDD.map(lambda x: domainFinder.main(x[0], x[1]))
        processedRDD = domainsRDD.map(
            lambda x: self.processDomainOutput(x[0], x[1]))

        # recover Pfam domain prediction results from RDD
        result = processedRDD.collectAsMap()

        print('Done!')

        return result
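
Example #1 runs a six-frame translation on nucleotide input before domain prediction, but SixFrameTranslator itself is not shown. The sketch below illustrates the general technique with the standard genetic code: three reading frames on the forward strand and three on the reverse complement. The names `translate` and `six_frame` are hypothetical and are not the project's SixFrameTranslator API; mapping unknown codons to 'X' is also an assumption.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons; codons not in the table become 'X'."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


def six_frame(dna):
    """Return translations for offsets 0-2 on both strands (forward first)."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    return [
        translate(strand[offset:])
        for strand in (dna, reverse)
        for offset in range(3)
    ]


frames = six_frame("ATGGCCTAA")
print(frames)  # ['MA*', 'WP', 'GL', 'LGH', '*A', 'RP']
```

Stop codons are kept as `*` here; a downstream step could split each frame on `*` to obtain candidate open reading frames.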

Example #2

File: CorpusPreprocess.py Project: bioinfoUQAM/TOUCAN

    import os  # needed for os.path.basename below

    def createNegShuffle(self, posPerc):
        files = Utils.listFilesExt(self.source_path, self.ext)
        negPerc = 100 - posPerc
        positives = len(files)
        negativeSize = int((negPerc * positives) / posPerc)
        print('Negative percentage: ' + str(negPerc) + '% \n' +
              'Negative instances: ' + str(negativeSize) + '\n' +
              'Positive percentage: ' + str(posPerc) + '% \n' +
              'Positive instances: ' + str(positives) + '\n' +
              'Total corpus size: ' + str(negativeSize + positives))

        thisDecRatio = 0.0
        count = 0
        ratio = (negativeSize / positives)
        decRatio = ratio - int(ratio)

        print('Generating...')
        for file in files:
            # add up the decimal ratio part
            thisDecRatio += round(decRatio, 2)
            # reset range
            ratioRange = int(negativeSize / positives)

            # check if decimal ratio added up to a duplicate
            if (thisDecRatio >= 1):
                ratioRange = int(ratio + thisDecRatio)
                thisDecRatio = 0

            for i in range(0, ratioRange):
                name = os.path.basename(file)
                result_file = name.split('.')[0] + '_' + str(
                    i) + '.shuffled.negative.fasta'

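                if 'nuc' in self.seqType:
                    content = Parsers.genBankToNucleotide(file)
                elif 'amino' in self.seqType:
                    # only the sequence content is needed; discard the list
                    _, content = Parsers.genBankToAminoacid(file)
                else:
                    raise ValueError(
                        'Unsupported sequence type: ' + str(self.seqType))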
                content = Utils.charGramShuffle(content, 2)
                content = '>' + name + '\n' + content

                count += 1

                Utils.writeFile(self.result_path + result_file, content)

        print('Total generated: ' + str(count) + '. Done!')
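
The sizing arithmetic in createNegShuffle is easier to follow in isolation. The sketch below reproduces only that part: it derives the target number of negatives from the positive percentage, then walks the files and carries the fractional part of the ratio until it adds up to an extra shuffle. The function name `negatives_per_file` and its return shape are hypothetical; the arithmetic is copied from the example above, with no file I/O or parsing.

```python
def negatives_per_file(positives, pos_perc):
    """Return (target_negatives, shuffles_generated_for_each_file)."""
    neg_perc = 100 - pos_perc
    negative_size = int((neg_perc * positives) / pos_perc)
    ratio = negative_size / positives
    dec_ratio = ratio - int(ratio)  # fractional part of the ratio

    carried = 0.0
    counts = []
    for _ in range(positives):
        # accumulate the rounded fractional part, as the example does
        carried += round(dec_ratio, 2)
        n = int(ratio)
        if carried >= 1:
            # the fractions added up to one extra shuffle for this file
            n = int(ratio + carried)
            carried = 0
        counts.append(n)
    return negative_size, counts


target, counts = negatives_per_file(positives=4, pos_perc=40)
print(target, counts)  # 6 [1, 2, 1, 2]

target, counts = negatives_per_file(positives=3, pos_perc=30)
print(target, counts)  # 7 [2, 2, 2]
```

In the second call the fractional part is rounded to 0.33 and never reaches 1 across three files, so six negatives are generated against a target of seven. With this scheme the generated total can fall short of the reported target.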