Python CountVectorizer Examples

Programming language: Python

Namespace/package name: cornac.data.text

Class/type: CountVectorizer

Examples on hotexamples.com: 4

Python CountVectorizer - 4 examples found. These are the top-rated real-world Python examples of cornac.data.text.CountVectorizer, extracted from open source projects. You can rate examples to help improve their quality.

Frequently used methods

CountVectorizer(4)

fit(3)

transform(2)

binary(1)

fit_transform(1)

vocab(1)

Example #1

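    def test_arguments(self):
        # Assumes this method lives in a unittest.TestCase subclass.
        # assertRaises fails the test if no ValueError is raised, whereas
        # try/except with `assert True` would pass silently in that case.
        with self.assertRaises(ValueError):
            CountVectorizer(max_doc_freq=-1)

        with self.assertRaises(ValueError):
            CountVectorizer(max_features=-1)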
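
A self-contained sketch of checking constructor validation with `unittest`'s `assertRaises`, so that the test fails when no exception is raised. `StubVectorizer` and its validation rules are hypothetical stand-ins so the snippet runs without cornac installed; they are not cornac's implementation.

```python
import unittest


class StubVectorizer:
    """Hypothetical stand-in for a vectorizer; not cornac's class."""

    def __init__(self, max_doc_freq=1.0, max_features=None):
        # Assumed validation rules, chosen only for illustration.
        if max_doc_freq < 0:
            raise ValueError("max_doc_freq must be non-negative")
        if max_features is not None and max_features <= 0:
            raise ValueError("max_features must be positive")
        self.max_doc_freq = max_doc_freq
        self.max_features = max_features


class TestArguments(unittest.TestCase):
    def test_arguments(self):
        # Each block fails the test unless a ValueError is raised inside it.
        with self.assertRaises(ValueError):
            StubVectorizer(max_doc_freq=-1)
        with self.assertRaises(ValueError):
            StubVectorizer(max_features=-1)


def run_tests():
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestArguments)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests()` returns a result whose `wasSuccessful()` method reports whether both invalid arguments were rejected.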

Example #2

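    def test_transform(self):
        # max_features=1 limits the vocabulary to one token, so X has one column.
        vectorizer = CountVectorizer(max_doc_freq=2,
                                     min_freq=1,
                                     max_features=1)
        vectorizer.fit(self.docs)
        sequences, X = vectorizer.transform(self.docs)
        # toarray() is used instead of the `.A` shorthand, which newer
        # SciPy releases no longer provide on sparse matrices.
        npt.assert_array_equal(X.toarray(), np.asarray([[0], [2], [0]]))

        # With binary=True, fit_transform and transform should agree.
        vectorizer.binary = True
        _, X1 = vectorizer.fit_transform(self.docs)
        _, X2 = vectorizer.transform(self.docs)
        npt.assert_array_equal(X1.toarray(), X2.toarray())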
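
To make the effect of `max_features` and `binary` on a count matrix concrete, here is a toy sketch in pure Python. The function `count_matrix`, its whitespace tokenization, and its tie-breaking rule are assumptions made for illustration; this is not how cornac implements `CountVectorizer`, and it does not model `max_doc_freq` or `min_freq`.

```python
from collections import Counter


def count_matrix(docs, max_features=None, binary=False):
    """Toy term-count matrix; a hypothetical sketch, not cornac's code."""
    tokenized = [doc.lower().split() for doc in docs]
    # Total frequency of each token across the whole corpus.
    totals = Counter(tok for toks in tokenized for tok in toks)
    # Rank tokens by frequency; ties are broken alphabetically for determinism.
    ranked = sorted(totals, key=lambda t: (-totals[t], t))
    vocab = ranked[:max_features] if max_features else ranked
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        row = [counts[t] for t in vocab]  # Counter returns 0 for absent tokens
        if binary:
            # Record presence or absence instead of the raw count.
            row = [1 if c > 0 else 0 for c in row]
        rows.append(row)
    return vocab, rows


docs = ["a b c", "b b", "c d"]
vocab, rows = count_matrix(docs, max_features=1)
_, binary_rows = count_matrix(docs, max_features=1, binary=True)
```

With these three toy documents, the single retained token is `b`, the count rows are `[[1], [2], [0]]`, and the binary rows are `[[1], [1], [0]]`.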

Example #3

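    def test_with_special_tokens(self):
        vectorizer = CountVectorizer(max_doc_freq=2,
                                     min_freq=1,
                                     max_features=1)
        vectorizer.fit(self.docs)

        # Rebuild the fitted vocabulary with special tokens enabled.
        new_vocab = Vocabulary(vectorizer.vocab.idx2tok,
                               use_special_tokens=True)
        vectorizer.vocab = new_vocab

        # The counts are expected to be unchanged by the special tokens.
        sequences, X = vectorizer.transform(self.docs)
        # toarray() is used instead of the `.A` shorthand, which newer
        # SciPy releases no longer provide on sparse matrices.
        npt.assert_array_equal(X.toarray(), np.asarray([[0], [2], [0]]))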

Example #4

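    def test_bad_freq_arguments(self):
        # Assumes this method lives in a unittest.TestCase subclass.
        # max_doc_freq is smaller than min_freq, so fit() is expected to raise.
        vectorizer = CountVectorizer(max_doc_freq=2, min_freq=3)
        with self.assertRaises(ValueError):
            vectorizer.fit(self.docs)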