Python examples for TfidfVectorizer._stop_words_id

Programming language: Python

Namespace/package name: sklearn.feature_extraction.text

Class/type: TfidfVectorizer

Method/function: _stop_words_id (in practice a private attribute, not a callable)

Examples on hotexamples.com: 1

TfidfVectorizer._stop_words_id in Python: 1 example found. This is the best real-world Python example of sklearn.feature_extraction.text.TfidfVectorizer._stop_words_id, extracted from open source projects.

Example #1

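import os
import pickle
from collections import OrderedDict
from pathlib import Path

import numpy as np
import yaml
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer


# `args` is defined outside this excerpt; presumably an argparse namespace with
# seed, model, sentences_file and output_dir attributes.
def main():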
    np.random.seed(args.seed)

    print("Reading params.yaml...")
    # Use a context manager so the file handle is closed after parsing.
    with open("params.yaml") as f:
        params = yaml.safe_load(f)["train"][args.model]

    print("Reading training set...")
    with open(args.sentences_file, "r") as f:
        corpus = f.readlines()

    out_dir = Path(args.output_dir)
    os.makedirs(out_dir, exist_ok=True)

    # Both vectorizers share the same fit-and-pickle workflow.
    vectorizers = {"tf_idf": TfidfVectorizer, "count": CountVectorizer}
    if args.model not in vectorizers:
        raise ValueError(f"Training not available for model {args.model}!")

    model = vectorizers[args.model](**params["init_kwargs"])
    print("Training model...")
    model.fit(corpus)
    # hack: https://github.com/scikit-learn/scikit-learn/issues/18669
    # Order the vocabulary by feature index and overwrite the private
    # _stop_words_id, which scikit-learn sets to id(self.stop_words) and which
    # can therefore differ between runs; presumably this keeps model.pkl
    # reproducible.
    model.vocabulary_ = OrderedDict(
        sorted(model.vocabulary_.items(), key=lambda kv: kv[1]))
    model._stop_words_id = 0
    print("Saving model to disk...")
    with (out_dir / "model.pkl").open("wb") as f:
        pickle.dump(model, f)

    print("Training completed!")
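
The vocabulary-sorting step in the example above can be illustrated without scikit-learn. The following is a minimal sketch using only the standard library, under the assumption that the goal of the hack is a reproducible pickle file. The helper sort_vocabulary and the sample vocabularies are hypothetical names introduced here for illustration; they are not part of scikit-learn or of the original project.

```python
import pickle
from collections import OrderedDict


def sort_vocabulary(vocabulary):
    """Return the term -> index mapping ordered by index (hypothetical helper)."""
    return OrderedDict(sorted(vocabulary.items(), key=lambda kv: kv[1]))


# Two made-up vocabularies with equal contents but different insertion order.
vocab_a = {"apple": 0, "banana": 1, "cherry": 2}
vocab_b = {"cherry": 2, "apple": 0, "banana": 1}

# The dicts compare equal, but pickle writes items in iteration order,
# so the serialized bytes can be compared separately.
same_contents = vocab_a == vocab_b
raw_pickles_match = pickle.dumps(vocab_a) == pickle.dumps(vocab_b)

# After ordering by index, both mappings iterate in the same order.
sorted_pickles_match = (
    pickle.dumps(sort_vocabulary(vocab_a)) == pickle.dumps(sort_vocabulary(vocab_b))
)

print(same_contents, raw_pickles_match, sorted_pickles_match)
```

Resetting _stop_words_id to a constant appears to serve the same purpose: a value derived from id() is tied to a memory address, so it can change from one process to the next even when the fitted model is otherwise identical.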