Python FeatureExtractor.create_dictionaries Examples

Programming language: Python

Namespace/package name: app.feature_extractor

Class/type: FeatureExtractor

Method/function: create_dictionaries

Examples on hotexamples.com: 2

Python FeatureExtractor.create_dictionaries: 2 code examples found. These are real-world Python examples of app.feature_extractor.FeatureExtractor.create_dictionaries, extracted from open source projects and selected as the top rated.

Example #1

File: learning_curves_NEW.py Project: douglascook/bio_relex

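import pickle

from sklearn.feature_extraction import DictVectorizer

from app.feature_extractor import FeatureExtractor


def pickle_similarities():
    """
    Pickle similarities based on all records
    """
    # TODO this is kind of wrong since the similarities will change as the word features are generated per split
    # load_records and get_similarities are helpers defined elsewhere in the project (not shown here)
    records = load_records()

    # set up extractor using desired features
    extractor = FeatureExtractor(word_gap=True, count_dict=True, phrase_count=True, word_features=5)
    extractor.create_dictionaries(records, how_many=5)

    data, _ = extractor.generate_features(records)
    # vec is not defined in this excerpt; a DictVectorizer is assumed here
    vec = DictVectorizer()
    data = vec.fit_transform(data).toarray()
    similarities = get_similarities(data)

    # use a context manager so the file is closed once the pickle is written
    with open('pickles/similarities_all.p', 'wb') as f:
        pickle.dump(similarities, f)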
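
The TODO in this example points at a subtlety: the dictionaries are built from all records, so anything derived from them also reflects data that would later be held out. The body of `create_dictionaries` is not shown on this page, so the following is only a rough sketch of the general idea. It assumes that `how_many` means "keep the N most frequent words" and that records are plain strings; `build_word_dictionary` and the toy records are hypothetical names invented for illustration.

```python
from collections import Counter


def build_word_dictionary(records, how_many=5):
    """Return the `how_many` most frequent lowercase tokens in `records`.

    Hypothetical stand-in for a dictionary-building step; this is not
    the bio_relex implementation.
    """
    counts = Counter()
    for text in records:
        counts.update(text.lower().split())
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:how_many]]


# Toy records, invented for this sketch.
train = ["protein binds receptor", "protein inhibits enzyme", "enzyme binds protein"]
test = ["receptor activates receptor kinase"]

# Built from the training split only: held-out vocabulary cannot leak in.
train_only = build_word_dictionary(train, how_many=3)
# Built from every record: words from the held-out split change the result.
all_records = build_word_dictionary(train + test, how_many=3)
```

In this toy case the two dictionaries differ, which is the effect the TODO warns about. Example #2 avoids it by passing only `train` to `create_dictionaries`.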

Example #2

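from sklearn import preprocessing
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

from app.feature_extractor import FeatureExtractor


def build_pipeline(which, train):
    """
    Set up classifier here to avoid repetition
    """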
    if which == 'bag_of_words':
        clf = Pipeline([('vectoriser', DictVectorizer()),
                        #('scaler', preprocessing.StandardScaler(with_mean=False)),
                        ('normaliser', preprocessing.Normalizer(norm='l2')),
                        ('svm', LinearSVC(dual=True, C=1))])
        # set up extractor using desired features
        extractor = FeatureExtractor(word_gap=False, count_dict=False, phrase_count=True, pos=False, combo=True,
                                     entity_type=True, word_features=False, bag_of_words=True, bigrams=True)

    elif which == 'word_features':
        clf = Pipeline([('vectoriser', DictVectorizer(sparse=False)),
                        #('scaler', preprocessing.StandardScaler(with_mean=False)),
                        ('normaliser', preprocessing.Normalizer()),
                        #('svm', SVC(kernel='poly', coef0=1, degree=2, gamma=10, C=1, cache_size=2000))])
                        #('svm', SVC(kernel='rbf', gamma=1, cache_size=1000, C=1))])
                        #('svm', SVC(kernel='linear', cache_size=1000, C=1))])
                        ('svm', LinearSVC(dual=True, C=1))])

        extractor = FeatureExtractor(word_gap=False, count_dict=False, phrase_count=True, word_features=True,
                                     combo=True, pos=True, entity_type=True, bag_of_words=False, bigrams=False)
        extractor.create_dictionaries(train, how_many=5)

    else:
        clf = Pipeline([('vectoriser', DictVectorizer(sparse=False)),
                        #('scaler', preprocessing.StandardScaler(with_mean=False)),
                        ('normaliser', preprocessing.Normalizer()),
                        #('svm', SVC(kernel='poly', coef0=1, degree=3, gamma=1, C=1, cache_size=2000))])
                        #('svm', SVC(kernel='rbf', gamma=100, cache_size=1000, C=10))])
                        #('svm', SVC(kernel='linear', cache_size=1000, C=1))])
                        ('svm', LinearSVC(dual=True, C=1))])

        extractor = FeatureExtractor(word_gap=False, count_dict=False, phrase_count=True, word_features=False,
                                     combo=True, pos=True, entity_type=True, bag_of_words=False, bigrams=False)

    return clf, extractor
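
Every branch of `build_pipeline` begins with a `DictVectorizer` followed by a `Normalizer`. As a rough illustration of what those two steps do to the feature dicts before they reach the SVM, here is a minimal pure-Python sketch. It is not scikit-learn's implementation, and `vectorise`, `l2_normalise` and the sample feature names are hypothetical, chosen only for this example.

```python
import math


def vectorise(feature_dicts):
    """Map dicts of numeric features to dense rows, one column per feature name."""
    # Collect every feature name seen in any sample and fix a column order.
    names = sorted({name for features in feature_dicts for name in features})
    # Features missing from a sample become 0.0 in its row.
    rows = [[float(features.get(name, 0.0)) for name in names]
            for features in feature_dicts]
    return names, rows


def l2_normalise(row):
    """Scale a row to unit Euclidean length; all-zero rows are returned unchanged."""
    norm = math.sqrt(sum(value * value for value in row))
    return row if norm == 0 else [value / norm for value in row]


# Invented feature dicts standing in for the extractor's output.
samples = [{"phrase_count": 3, "word_gap": 4}, {"phrase_count": 1}]
names, rows = vectorise(samples)
normalised = [l2_normalise(row) for row in rows]
```

After normalisation every non-zero row has length 1, so samples with many features do not dominate the linear SVM simply because their raw values are larger.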