Example #1
def test_hashingvectorizer_norm(norm):
    # `norm` is supplied via pytest parametrization (decorator not shown here).
    if norm not in ["l1", "l2", None]:
        # Unsupported norm values must raise a ValueError.
        with pytest.raises(ValueError):
            HashingVectorizer(norm=norm).fit_transform(DOCS_GPU)
    else:
        # Supported norms must produce the same matrix as scikit-learn.
        res = HashingVectorizer(norm=norm).fit_transform(DOCS_GPU)
        ref = SkHashVect(norm=norm).fit_transform(DOCS)
        assert_almost_equal_hash_matrices(res.todense().get(), ref.toarray())
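
These snippets are not self-contained: DOCS, DOCS_GPU, SkHashVect and assert_almost_equal_hash_matrices come from the surrounding test module, and the parametrized tests (Examples #1 and #2) would also carry @pytest.mark.parametrize decorators. A minimal sketch of that assumed setup is shown below; the import paths, the corpus contents (reused from Example #3) and the comparison helper are illustrative assumptions, not the module's actual definitions.

# Assumed shared setup (a sketch, not the actual test module):
import pytest
from cudf import Series
from cuml.feature_extraction.text import HashingVectorizer
from sklearn.feature_extraction.text import HashingVectorizer as SkHashVect
from numpy.testing import assert_array_almost_equal

# Reference corpus on the host, plus the same corpus as a cuDF Series for the GPU side.
DOCS = [
    "This is the first document.",
    "This document is the second document.",
    "And this is the third one.",
    "Is this the first document?",
]
DOCS_GPU = Series(DOCS)

def assert_almost_equal_hash_matrices(res, ref):
    # Stand-in for the real helper: a plain element-wise comparison of the
    # densified hashed matrices (the actual helper may also normalize column order).
    assert_array_almost_equal(res, ref)
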
Example #2
def test_hashingvectorizer_lowercase(lowercase):
    # `lowercase` is supplied via pytest parametrization (decorator not shown here).
    # Mixed-case corpus to exercise the lowercasing option.
    corpus = [
        "This Is DoC",
        "this DoC is the second DoC.",
        "And this document is the third one.",
        "and Is this the first document?",
    ]
    res = HashingVectorizer(lowercase=lowercase).fit_transform(Series(corpus))
    ref = SkHashVect(lowercase=lowercase).fit_transform(corpus)
    assert_almost_equal_hash_matrices(res.todense().get(), ref.toarray())
Example #3
def test_hashingvectorizer():
    # Default settings: output must match scikit-learn's HashingVectorizer.
    corpus = [
        "This is the first document.",
        "This document is the second document.",
        "And this is the third one.",
        "Is this the first document?",
    ]

    res = HashingVectorizer().fit_transform(Series(corpus))
    ref = SkHashVect().fit_transform(corpus)
    assert_almost_equal_hash_matrices(res.todense().get(), ref.toarray())
Example #4
def test_hashingvectorizer_delimiter():
    # Custom single-character delimiter ("0") for tokenization, no normalization,
    # and an identity preprocessor so the raw strings are split as-is.
    corpus = ["a0b0c", "a 0 b0e", "c0d0f"]
    res = HashingVectorizer(
        delimiter="0", norm=None, preprocessor=lambda s: s
    ).fit_transform(Series(corpus))
    # Equivalent scikit-learn setup: a tokenizer that splits on "0" replaces
    # cuML's delimiter argument.
    ref = SkHashVect(
        tokenizer=lambda s: s.split("0"),
        norm=None,
        token_pattern=None,
        preprocessor=lambda s: s,
    ).fit_transform(corpus)
    assert_almost_equal_hash_matrices(res.todense().get(), ref.toarray())
Example #5
def test_hashingvectorizer_stop_word():
    # The built-in English stop-word list must be applied the same way as scikit-learn's.
    ref = SkHashVect(stop_words="english").fit_transform(DOCS)
    res = HashingVectorizer(stop_words="english").fit_transform(DOCS_GPU)
    assert_almost_equal_hash_matrices(res.todense().get(), ref.toarray())
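
Taken together, the examples check that the GPU HashingVectorizer behaves as a drop-in counterpart to scikit-learn's, consuming a cuDF Series and returning a sparse matrix on the device. A minimal usage sketch follows, assuming the setup sketch above; the n_features value and the corpus are illustrative, and the return-type comments reflect how the tests themselves densify and copy the result.

# Minimal usage sketch (assumes the setup sketch above):
corpus = Series([
    "gpu accelerated text hashing",
    "hashing vectorizer demo on cudf",
])
X = HashingVectorizer(n_features=2 ** 10, norm="l2").fit_transform(corpus)
print(X.shape)                   # (2, 1024): one row per document, one column per hash bucket
dense_host = X.todense().get()   # densify on the GPU, then copy back to a host NumPy array
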