Python Vocabulary.from_pretrained_transformerの例

プログラミング言語: Python

名前空間/パッケージ名: allennlp.data.vocabulary

クラス/型: Vocabulary

メソッド/関数: from_pretrained_transformer

hotexamples.comのコード掲載数: 2

Python Vocabulary.from_pretrained_transformer - 2件のコード例が見つかりました。すべてオープンソースプロジェクトから抽出されたPythonのallennlp.data.vocabulary.Vocabulary.from_pretrained_transformerの実例で、最も評価が高いものを厳選しています。コード例の評価を行っていただくことで、より質の高いコード例が表示されるようになります。

よく使われるメソッド

表示非表示

Vocabulary(30)

add_token_to_namespace(30)

get_vocab_size(30)

get_token_index(30)

from_files(30)

from_instances(30)

from_params(30)

get_index_to_token_vocabulary(24)

add_tokens_to_namespace(19)

get_token_from_index(13)

get_token_to_index_vocabulary(12)

save_to_files(10)

set_from_file(6)

from_dataset(5)

extend_from_instances(4)

from_pretrained_transformer_and_instances(3)

from_pretrained_transformer(2)

add_transformer_vocab(2)

_extend(1)

get_index_to_token(1)

get_namespaces(1)

extend_from_vocab(1)

get_token_to_index(1)

print_statistics(1)

_padding_token(1)

コード例 #1

ファイルを表示

ファイル: vocabulary_test.py プロジェクト: himkt/allennlp

 def _get_expected_vocab(dataset, namespace, model_name):
     vocab_from_instances = Vocabulary.from_instances(dataset)
     instance_tokens = set(
         vocab_from_instances._token_to_index[namespace].keys())
     transformer_tokens = set(
         Vocabulary.from_pretrained_transformer(
             model_name, namespace)._token_to_index[namespace].keys())
     return instance_tokens.union(transformer_tokens)

コード例 #2

ファイルを表示

    def test_from_pretrained_transformer(self, model_name):
        namespace = "tokens"
        from allennlp.common import cached_transformers

        tokenizer = cached_transformers.get_tokenizer(model_name)

        vocab = Vocabulary.from_pretrained_transformer(model_name,
                                                       namespace=namespace)
        assert vocab._token_to_index[namespace] == tokenizer.get_vocab()
        vocab.save_to_files(self.TEST_DIR / "vocab")

        vocab1 = Vocabulary.from_files(self.TEST_DIR / "vocab")
        assert vocab1._token_to_index[namespace] == tokenizer.get_vocab()