Python CountVectorizer.get_feature_name Examples

Programming Language: Python

Namespace/Package Name: sklearn.feature_extraction.text

Class/Type: CountVectorizer

Method/Function: get_feature_name

Examples at hotexamples.com: 1

Python CountVectorizer.get_feature_name - 1 example found. This is the top-rated real-world Python example of sklearn.feature_extraction.text.CountVectorizer.get_feature_name extracted from open source projects. Note that scikit-learn's CountVectorizer does not define a get_feature_name method; the library method is get_feature_names_out() (get_feature_names() in older releases).

Example #1

from sklearn.feature_extraction.text import CountVectorizer
import pandas as pd

data = pd.read_csv("x.txt", sep='\t')
data.columns = ['label','body_text']


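# clearn_text is a user-defined function (not shown here); when passed as `analyzer`,
# it must take one raw document and return its list of tokens
count_vect = CountVectorizer(analyzer=clearn_text)
X_counts = count_vect.fit_transform(data['body_text'])
print(X_counts.shape)
# CountVectorizer has no get_feature_name(); use get_feature_names_out()
# (scikit-learn >= 1.0; older releases used get_feature_names())
print(count_vect.get_feature_names_out())

# Dense document-term table: each cell is how many times a term occurs in a document
X_counts_df = pd.DataFrame(X_counts.toarray())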


# With N-grams ---------------------------------------------------------------------------------
ngram_vect = CountVectorizer(ngram_range=(1, 3))  # unigrams, bigrams and trigrams
X_counts = ngram_vect.fit_transform(data['body_text'])
print(X_counts.shape)
print(ngram_vect.get_feature_names_out())

# Label each count column with the n-gram it represents
X_counts_df = pd.DataFrame(X_counts.toarray())
X_counts_df.columns = ngram_vect.get_feature_names_out()

'''
# TF-IDF ----------------------------------------------------------------------------------------
# TODO: study this further
1. Count how many times a word appears in a sentence.
2. Count how many sentences contain that word.
3. Show the result as a percentage.