Example #1
# Imports as in the FreeDiscovery test suite; exact module paths may differ
# between FreeDiscovery versions.
import pytest
from numpy.testing import assert_equal

from freediscovery.text import FeatureVectorizer
from freediscovery.exceptions import NotFound

from .run_suite import check_cache  # test helper returning the cache directory

# `data_dir` is a module-level path to the sample documents in the original test file.


@pytest.mark.parametrize('use_hashing', [False, True])
def test_search_filenames(use_hashing):
    cache_dir = check_cache()

    fe = FeatureVectorizer(cache_dir=cache_dir, mode='w')
    uuid = fe.setup(use_hashing=use_hashing)
    fe.ingest(data_dir, file_pattern=r'.*\d.txt')  # raw string: '\d' is an invalid escape otherwise

    assert fe.db_ is not None

    # Forward and backward slices through the ingested documents: searching
    # for a list of filenames must return the matching indices, in order.
    for low, high, step in [(0, 1, 1), (0, 4, 1), (3, 1, -1)]:
        idx_slice = list(range(low, high, step))
        filenames_slice = [fe.filenames_[idx] for idx in idx_slice]
        idx0 = fe.db_._search_filenames(filenames_slice)
        assert_equal(idx0, idx_slice)
        assert_equal(filenames_slice, fe[idx0])

    # A filename that was never ingested must raise NotFound
    with pytest.raises(NotFound):
        fe.db_._search_filenames(['DOES_NOT_EXIST.txt'])

    # query_features requires an explicit vocabulary, so it only applies
    # when hashing is disabled
    if not use_hashing:
        n_top_words = 5
        terms = fe.query_features([2, 3, 5], n_top_words=n_top_words)
        assert len(terms) == n_top_words

    fe.list_datasets()
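
The test exercises the full ingestion flow: create a vectorizer in write mode, set it up, ingest a directory, then query the filename database. A minimal sketch of the same flow outside pytest, assuming a folder of .txt documents; `cache_dir` and `doc_dir` are placeholder paths, and only calls demonstrated in the test above are used:

# Minimal usage sketch (not from the FreeDiscovery sources): cache_dir and
# doc_dir are hypothetical paths.
cache_dir = '/tmp/fd_cache'   # hypothetical cache location
doc_dir = '/tmp/documents'    # hypothetical folder of .txt files

fe = FeatureVectorizer(cache_dir=cache_dir, mode='w')
fe.setup(use_hashing=False)          # plain vocabulary, so query_features works
fe.ingest(doc_dir, file_pattern=r'.*\.txt')

# Map a couple of filenames back to their internal indices, then back to filenames
idx = fe.db_._search_filenames(fe.filenames_[:2])
print(fe[idx])

# Top 5 terms for the first two documents
print(fe.query_features([0, 1], n_top_words=5))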
Example #2
import os


def _list(args):
    """Print all processed datasets found in the cache directory."""
    # _parse_cache_dir is a helper defined alongside this function in the
    # original CLI module; FeatureVectorizer is imported at module level there.
    cache_dir = _parse_cache_dir(args.cache_dir)
    fe = FeatureVectorizer(cache_dir)
    res = fe.list_datasets()
    # Most recently created datasets first
    res = sorted(res, key=lambda row: row['creation_date'], reverse=True)
    for row in res:
        print(' * Processed dataset {}'.format(row['id']))
        print('    - data_dir: {}'.format(row['data_dir']))
        print('    - creation_date: {}'.format(row['creation_date']))
        # For each processing step, list the model ids stored under this dataset
        for method in ['lsi', 'categorizer', 'cluster', 'dupdet', 'threading']:
            dpath = os.path.join(fe.cache_dir, row['id'], method)
            if not os.path.exists(dpath):
                continue
            mid_list = os.listdir(dpath)
            if mid_list:
                print('     # {}'.format(method))
                for mid in mid_list:
                    print('       * {}'.format(mid))
        print(' ')
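
For reference, a hedged sketch of calling `_list` directly; it assumes only that `args` exposes a `cache_dir` attribute, as the function above requires. The path is a placeholder; in the real CLI, the argument parser builds `args` from the command line and dispatches here.

import argparse

# Hypothetical direct invocation with a placeholder cache path
args = argparse.Namespace(cache_dir='/tmp/fd_cache')
_list(args)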