Python TfidfVectorizer.mean Examples

Programming Language: Python

Namespace/Package Name: sklearn.feature_extraction.text

Class/Type: TfidfVectorizer

Method/Function: mean

Examples at hotexamples.com: 1

Python TfidfVectorizer.mean - 1 example found. This is a real-world Python example, extracted from an open source project, that uses mean together with sklearn.feature_extraction.text.TfidfVectorizer. Note that mean is not a method of TfidfVectorizer itself: in the example it is called on the sparse matrix returned by fit_transform.

Example #1

File: main.py Project: phottovy/song_lyric_classifier

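import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from tabulate import tabulate


# clean_lyrics and MXF are module-level globals defined elsewhere in the project
def tfidf_counts(lyrics=clean_lyrics, n_samples=20, max_feats=MXF):
    '''
    Returns the n_samples highest weighted words in the dataset
    '''
    # Fit once so the matrix columns and the feature names share one vocabulary
    vectorizer = TfidfVectorizer(max_features=max_feats)
    tfidf_weights = vectorizer.fit_transform(lyrics)
    # mean(axis=0) is called on the sparse matrix: average weight per word
    weights = np.asarray(tfidf_weights.mean(axis=0)).ravel().tolist()
    weights_df = pd.DataFrame({
        # get_feature_names() was removed in scikit-learn 1.2
        'word': vectorizer.get_feature_names_out(),
        'weight': weights
    })
    sort_df = weights_df.sort_values(by='weight',
                                     ascending=False).reset_index(drop=True)
    # Rank rows starting from 1 instead of 0
    sort_df.index = np.arange(1, len(weights_df) + 1)
    print(
        tabulate(sort_df.head(n_samples),
                 headers=weights_df.columns,
                 tablefmt='pipe'))
    return sort_df.head(n_samples)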
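
To make explicit what the mean(axis=0) step in Example #1 computes, here is a minimal pure-Python sketch of the same idea: build one TF-IDF row per document, then average each column over all documents. It is an illustration, not scikit-learn's implementation. The helper names mean_tfidf and top_terms and the toy corpus are hypothetical. The sketch assumes pre-tokenized input and uses the smoothed IDF ln((1 + n) / (1 + df)) + 1 with L2-normalized rows, which are scikit-learn's documented defaults; tokenization, lowercasing and max_features are left out.

```python
import math
from collections import Counter


def mean_tfidf(docs):
    """Return {term: mean TF-IDF weight across all documents}.

    docs is a list of token lists (tokenization is the caller's job).
    Assumption: smoothed IDF and L2-normalized rows, as in
    scikit-learn's default TfidfVectorizer settings.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1 for t, d in df.items()}

    totals = dict.fromkeys(df, 0.0)
    for doc in docs:
        counts = Counter(doc)
        row = {t: c * idf[t] for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in row.values()))
        if norm == 0:
            continue  # empty document: an all-zero row
        for t, w in row.items():
            totals[t] += w / norm
    # Column mean: a term absent from a document contributes 0 to the sum,
    # but that document still counts in the denominator
    return {t: total / n_docs for t, total in totals.items()}


def top_terms(docs, n=3):
    """Return the n (term, mean weight) pairs with the highest mean weight."""
    means = mean_tfidf(docs)
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Hypothetical toy corpus, already tokenized
docs = [
    ["love", "me", "do"],
    ["love", "love", "love"],
    ["let", "it", "be"],
]
means = mean_tfidf(docs)
ranking = top_terms(docs, n=3)
print(ranking)
```

Because the mean runs over every document, a word that is heavy in a few documents can still rank below a word that carries moderate weight in many. In the toy corpus, "love" ranks first because it appears in two of the three documents.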