Python preliminary_filtering Examples

Programming language: Python

Namespace/package name: denovoFilter.preliminary_filtering

Method/function: preliminary_filtering

Examples at hotexamples.com: 3

Python preliminary_filtering - 3 examples found. These are the top-rated real-world Python examples of denovoFilter.preliminary_filtering.preliminary_filtering, extracted from open source projects.

Example #1

File: test_preliminary_filtering.py Project: tianyunwang/denovoFilter

 def test_preliminary_filtering(self):
     ''' test that preliminary_filtering works correctly.
     '''
     
     # for a clean table, all variants should pass
     status = preliminary_filtering(self.variants)
     self.assertTrue(all(status == Series([True, True])))
     
     # when we define a set of samples that fail, their variants will not pass
     status = preliminary_filtering(self.variants, sample_fails=['b'])
     self.assertTrue(all(status == Series([True, False])))
     
     # when we lower the MAF threshold, candidates above it will fail
     status = preliminary_filtering(self.variants, maf_cutoff=0.0001)
     self.assertTrue(all(status == Series([False, True])))
     
     # if we set a parent to have been called, that site will fail
     self.variants['in_father_vcf'] = [1, 0]
     status = preliminary_filtering(self.variants)
     self.assertTrue(all(status == Series([False, True])))
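
The test above implies three checks: the sample is not on a fail list, the allele frequency is below a cutoff, and the variant was not called in a parent's VCF. The following is a minimal sketch of a filter with that behaviour, not the denovoFilter implementation. The function name `sketch_preliminary_filter`, the columns `person_id`, `max_af` and `in_mother_vcf`, and the default cutoff are assumptions made for illustration; only `in_father_vcf`, `sample_fails` and `maf_cutoff` appear in the original.

```python
import pandas as pd


def sketch_preliminary_filter(variants, sample_fails=None, maf_cutoff=0.01):
    """Hypothetical stand-in for preliminary_filtering; returns a boolean Series."""
    if sample_fails is None:
        sample_fails = []

    # Column names other than 'in_father_vcf' are assumptions for illustration.
    sample_ok = ~variants['person_id'].isin(sample_fails)
    rare = variants['max_af'] < maf_cutoff
    not_in_parents = (variants['in_father_vcf'] == 0) & (variants['in_mother_vcf'] == 0)

    # A candidate passes only if every individual check passes.
    return sample_ok & rare & not_in_parents


# Two made-up candidates, one per sample.
variants = pd.DataFrame({
    'person_id': ['a', 'b'],
    'max_af': [0.0005, 0.00001],
    'in_father_vcf': [0, 0],
    'in_mother_vcf': [0, 0],
})

clean = sketch_preliminary_filter(variants)
without_b = sketch_preliminary_filter(variants, sample_fails=['b'])
strict_maf = sketch_preliminary_filter(variants, maf_cutoff=0.0001)
in_father = sketch_preliminary_filter(variants.assign(in_father_vcf=[1, 0]))
```

Each check produces its own boolean Series, so the result is a single elementwise `&` over the masks, which mirrors how the test compares the returned status against a Series of booleans.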

Example #2

File: screen_candidates.py Project: tianyunwang/denovoFilter

def screen_candidates(de_novos_path,
                      fails_path,
                      filter_function,
                      maf=0.01,
                      fix_symbols=True,
                      annotate_only=False,
                      build='grch37'):
    """ load and optionally filter candidate de novo mutations.
    
    Args:
        de_novos_path: path to table of unfiltered candidate DNMs
        fails_path: path to file listing samples which failed QC, and therefore
            all of their candidates need to be excluded.
        filter_function: function for filtering the candidates, either
            filter_denovogear_sites(), or filter_missing_indels().
        maf: MAF threshold for filtering. This is 0.01 for denovogear sites,
            and 0 for the missing indels.
        fix_symbols: whether to annotate HGNC symbols for candidates
            missing these.
        annotate_only: whether to include a column indicating pass status, rather
            than excluding all candidates which fail the filtering.
        build: whether to use the 'grch37' or 'grch38' build to get
            missing symbols.
    
    Returns:
        pandas DataFrame of candidate de novo mutations.
    """

    if de_novos_path is None:
        return None

    # load the datasets
    de_novos = load_candidates(de_novos_path)
    sample_fails = []
    if fails_path is not None:
        # use a context manager so the file handle is closed after reading
        with open(fails_path) as handle:
            sample_fails = [x.strip() for x in handle]

    # run some initial screening
    status = preliminary_filtering(de_novos, sample_fails, maf_cutoff=maf)
    segdup = check_segdups(de_novos)

    if fix_symbols:
        de_novos['symbol'] = fix_missing_gene_symbols(de_novos, build)

    pass_status = filter_function(de_novos, status & segdup) & status & segdup

    if annotate_only:
        de_novos['pass'] = pass_status
    else:
        de_novos = de_novos[pass_status]

    return standardise_columns(de_novos)
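
The way `screen_candidates` combines its masks and then either annotates or subsets the table can be sketched on toy data. This is an illustration under stated assumptions: the `status` and `segdup` Series are made-up stand-ins for the outputs of `preliminary_filtering` and `check_segdups`, and `keep_everything` is a hypothetical `filter_function` that only demonstrates the expected call shape (table plus current mask in, boolean Series out).

```python
import pandas as pd

# Toy table; the gene symbols are placeholders.
de_novos = pd.DataFrame({'symbol': ['GENE1', 'GENE2', 'GENE3']})
status = pd.Series([True, True, False])  # stand-in for preliminary_filtering output
segdup = pd.Series([True, False, True])  # stand-in for check_segdups output


def keep_everything(table, pass_status):
    """Hypothetical filter_function that passes every row."""
    return pd.Series(True, index=table.index)


# Same combination as in screen_candidates: a row must pass all three masks.
pass_status = keep_everything(de_novos, status & segdup) & status & segdup

# annotate_only=True branch: keep every row and record the outcome.
annotated = de_novos.copy()
annotated['pass'] = pass_status

# annotate_only=False branch: keep only the passing rows.
filtered = de_novos[pass_status]
```

Annotating keeps failed candidates available for inspection, while subsetting yields a table that contains only the candidates that passed every filter.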

Example #3

File: screen_candidates.py Project: jeremymcrae/denovoFilter

def screen_candidates(de_novos_path, fails_path, filter_function, maf=0.01,
        fix_symbols=True, annotate_only=False):
    """ load and optionally filter candidate de novo mutations.
    
    Args:
        de_novos_path: path to table of unfiltered candidate DNMs
        fails_path: path to file listing samples which failed QC, and therefore
            all of their candidates need to be excluded.
        filter_function: function for filtering the candidates, either
            filter_denovogear_sites(), or filter_missing_indels().
        maf: MAF threshold for filtering. This is 0.01 for denovogear sites,
            and 0 for the missing indels.
        fix_symbols: whether to annotate HGNC symbols for candidates
            missing these.
        annotate_only: whether to include a column indicating pass status, rather
            than excluding all candidates which fail the filtering.
    
    Returns:
        pandas DataFrame of candidate de novo mutations.
    """
    
    if de_novos_path is None:
        return None
    
    # load the datasets
    de_novos = load_candidates(de_novos_path)
    sample_fails = []
    if fails_path is not None:
        # use a context manager so the file handle is closed after reading
        with open(fails_path) as handle:
            sample_fails = [x.strip() for x in handle]
    
    # run some initial screening
    status = preliminary_filtering(de_novos, sample_fails, maf_cutoff=maf)
    segdup = check_segdups(de_novos)
    
    if fix_symbols:
        de_novos['symbol'] = fix_missing_gene_symbols(de_novos)
    
    pass_status = filter_function(de_novos, status & segdup) & status & segdup
    
    if annotate_only:
        de_novos['pass'] = pass_status
    else:
        de_novos = de_novos[pass_status]
    
    return standardise_columns(de_novos)
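
Both versions of `screen_candidates` read the failed-sample list as one ID per line. A small helper for that step might look like the sketch below. The name `read_sample_fails` is hypothetical, and skipping blank lines is an addition that the original code does not perform.

```python
import os
import tempfile


def read_sample_fails(fails_path):
    """Return sample IDs listed one per line; an unset path means no failures."""
    if fails_path is None:
        return []
    with open(fails_path) as handle:
        # Skipping blank lines is an addition here, not something the original does.
        return [line.strip() for line in handle if line.strip()]


# Write a throwaway file with made-up sample IDs to exercise the helper.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('sample_a\nsample_b\n\n')

ids = read_sample_fails(tmp.name)
os.remove(tmp.name)
```

Returning an empty list for a missing path matches the original default of `sample_fails = []`, so downstream filtering can treat both cases the same way.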