Python get_most_predictive_feature_set Examples

Programming language: Python

Namespace/package name: select_features

Method/function: get_most_predictive_feature_set

Examples at hotexamples.com: 2

Python get_most_predictive_feature_set - 2 examples found. These are top-rated real-world Python examples of select_features.get_most_predictive_feature_set extracted from open source projects.

Example #1


File: feature_explorer.py Project: invinciblejha/kaggle

import select_features  # project-local module from invinciblejha/kaggle

def find_best_features(year, features, sex, age, heavy):
    """Find the most predictive feature set; year=-1 means both years 2 and 3."""
    print('find_best_features(year=%d,features=%s,sex=%s,age=%s,heavy=%s)' % (
        year, features, sex, age, heavy))
    # getXy_by_features is defined elsewhere in feature_explorer.py
    X, y, keys = getXy_by_features(year, features, sex, age)
    title = 'features=%s,sex=%s,age=%s,year=%d' % (features, sex, age, year)
    results, n_samples = select_features.get_most_predictive_feature_set(title, X, y, keys, heavy)
    return results, n_samples, keys

Example #2


File: pcg.py Project: invinciblejha/kaggle

def find_best_features(year, features, sex):
    import select_features
    print('find_best_features(year=%d)' % year)
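    if features == 'pcg':
        X, y, keys = getXy_pcg(year)
    elif features == 'patient':
        X, y, keys = getXy_patient(year)
    elif features == 'all':
        X, y, keys = getXy_all(year)
    else:
        # Without this branch X, y and keys would be undefined below
        raise ValueError('unknown features: %r' % (features,))

    print('keys=%s' % keys)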

    sex_key = None  # index of the 'Sex' column, set only when filtering by sex
    if sex and sex.lower()[0] in 'mf' and 'Sex' in keys:
        # Get male or female population
        sex_key = keys.index('Sex')
        if sex.lower()[0] == 'm':
            p = X[:, sex_key] < 0.5
        else:
            p = X[:, sex_key] > 0.5

        X = X[p, :]
        y = y[p]

    # Remove columns with low counts
    LOW_COUNT_THRESHOLD = 100
    Xtot = X.sum(axis=0)
    significant = Xtot >= LOW_COUNT_THRESHOLD
    # Remove sex too, but only if the population was filtered by sex above
    if sex_key is not None:
        significant[sex_key] = False
    print('Removing keys < %d: %s' % (LOW_COUNT_THRESHOLD,
        [keys[i] for i in range(len(keys)) if not significant[i]]))
    # Remember the sizes before filtering so both can be reported on one line
    before = 'keys=%d X=%s' % (len(keys), X.shape)
    keys = [keys[i] for i in range(len(keys)) if significant[i]]
    X = X[:, significant]
    print('%s => keys=%d X=%s' % (before, len(keys), X.shape))
    
    # Normalize
    means = X.mean(axis=0)
    stds = X.std(axis=0)

    for i in range(X.shape[1]):
        X[:,i] = X[:,i] - means[i]
        if abs(stds[i]) > 1e-6:
            X[:,i] = X[:,i]/stds[i]    
    
    return select_features.get_most_predictive_feature_set(X, y, keys), keys
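
The low-count filtering and normalization steps in Example #2 can also be written with vectorized NumPy operations. The following is a minimal sketch, not code from the kaggle project: the helper name `preprocess` and its parameters `low_count_threshold` and `eps` are hypothetical, and it assumes `X` is a dense 2-D NumPy array whose columns line up with `keys`.

```python
import numpy as np


def preprocess(X, keys, low_count_threshold=100, eps=1e-6):
    """Drop low-count columns, then standardize the remaining ones.

    Hypothetical helper for illustration; not part of the kaggle project.
    """
    # Work in floating point so the division below does not truncate
    X = np.asarray(X, dtype=float)

    # Keep only columns whose total reaches the threshold
    significant = X.sum(axis=0) >= low_count_threshold
    keys = [k for k, keep in zip(keys, significant) if keep]
    X = X[:, significant]  # boolean indexing returns a new array

    # Center every column; scale only columns with a non-negligible std
    stds = X.std(axis=0)
    X = X - X.mean(axis=0)
    scalable = stds > eps
    X[:, scalable] /= stds[scalable]
    return X, keys


# Small demonstration with made-up data
X_demo = np.array([[3, 0, 5], [1, 0, 5], [2, 1, 5]])
X_out, keys_out = preprocess(X_demo, ['a', 'b', 'c'], low_count_threshold=5)
print(keys_out, X_out.shape)
```

Unlike the loop in Example #2, this sketch never writes into the caller's array, and constant columns are centered but left unscaled, matching the `1e-6` guard in the original.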