def build_vocab(cls, json, tokenized_captions, threshold):
    """Build a Vocabulary from pre-tokenized COCO captions.

    Args:
        json: Path to the COCO annotation JSON file.
        tokenized_captions: Mapping from annotation id to that caption's
            token list (tokenization was done up front by the caller).
        threshold: Minimum word frequency; rarer words are discarded.

    Returns:
        A Vocabulary containing every word whose count >= threshold.
    """
    print("Building vocabulary")
    coco = COCO(json)
    counter = Counter()
    # Count token frequencies across all annotations. Captions are looked
    # up by annotation id in the precomputed token mapping.
    for ann_id in coco.anns.keys():
        counter.update(tokenized_captions[ann_id])

    # If the word frequency is less than 'threshold', then the word is discarded.
    words = [word for word, cnt in counter.items() if cnt >= threshold]

    # Creates a vocab wrapper.
    # NOTE(review): unlike the sibling build_vocab below, no special tokens
    # (<pad>/<start>/<end>/<unk>) are added here — confirm this is intended,
    # since adding them now would shift all existing word indices.
    vocab = Vocabulary()

    # Adds the words to the vocabulary.
    for word in words:
        vocab.add_word(word)
    print("Total vocabulary size: %d" % len(vocab))
    return vocab
def build_vocab(json, threshold):
    """Build a simple vocabulary wrapper from COCO captions.

    Args:
        json: Path to the COCO annotation JSON file.
        threshold: Minimum word frequency; rarer words are discarded.

    Returns:
        A Vocabulary seeded with <pad>/<start>/<end>/<unk> followed by
        every caption word whose frequency is >= threshold.
    """
    coco = COCO(json)
    counter = Counter()
    ids = coco.anns.keys()
    total = len(ids)  # hoisted: invariant across the loop
    for i, ann_id in enumerate(ids):
        caption = str(coco.anns[ann_id]['caption'])
        # Lowercase before tokenizing so counts are case-insensitive.
        tokens = nltk.tokenize.word_tokenize(caption.lower())
        counter.update(tokens)
        if i % 1000 == 0:
            print("[%d/%d] Tokenized the captions." % (i, total))

    # If the word frequency is less than 'threshold', then the word is discarded.
    words = [word for word, cnt in counter.items() if cnt >= threshold]

    # Creates a vocab wrapper and add some special tokens. Order matters:
    # the special tokens take the first four indices.
    vocab = Vocabulary()
    vocab.add_word('<pad>')
    vocab.add_word('<start>')
    vocab.add_word('<end>')
    vocab.add_word('<unk>')

    # Adds the words to the vocabulary.
    for word in words:
        vocab.add_word(word)
    return vocab