Example #1
    def test_contains(self):
        vocab = Vocabulary(need_default=True, max_size=None, min_freq=None)
        vocab.update(text)
        self.assertTrue(text[-1] in vocab)
        self.assertFalse("~!@#" in vocab)
        self.assertEqual(text[-1] in vocab, vocab.has_word(text[-1]))
        self.assertEqual("~!@#" in vocab, vocab.has_word("~!@#"))
Example #2
    def test_contains(self):
        vocab = Vocabulary(max_size=None,
                           min_freq=None,
                           unknown=None,
                           padding=None)
        vocab.update(text)
        self.assertTrue(text[-1] in vocab)
        self.assertFalse("~!@#" in vocab)
        self.assertEqual(text[-1] in vocab, vocab.has_word(text[-1]))
        self.assertEqual("~!@#" in vocab, vocab.has_word("~!@#"))
Example #3
    def load_embedding(emb_dim, emb_file, emb_type, vocab, emb_pkl):
        """Load the pre-trained embedding and combine with the given dictionary.

        :param emb_dim: int, the dimension of the embedding. Should be the same as pre-trained embedding.
        :param emb_file: str, the pre-trained embedding file path.
        :param emb_type: str, the pre-trained embedding format, support glove now
        :param vocab: Vocabulary, a mapping from word to index, can be provided by user or built from pre-trained embedding
        :param emb_pkl: str, the embedding pickle file.
        :return embedding_tensor: Tensor of shape (len(word_dict), emb_dim)
                vocab: input vocab or vocab built by pre-train
        TODO: fragile code
        """
        # If the embedding pickle exists, load it and return.
        # if os.path.exists(emb_pkl):
        #     with open(emb_pkl, "rb") as f:
        #         embedding_tensor, vocab = _pickle.load(f)
        #     return embedding_tensor, vocab
        # Otherwise, load the pre-trained embedding.
        pretrain = EmbedLoader._load_pretrain(emb_file, emb_type)
        if vocab is None:
            # build vocabulary from pre-trained embedding
            vocab = Vocabulary()
            for w in pretrain.keys():
                vocab.update(w)
        embedding_tensor = torch.randn(len(vocab), emb_dim)
        for w, v in pretrain.items():
            if len(v.shape) > 1 or emb_dim != v.shape[0]:
                raise ValueError('pretrained embedding dim is {}, which does not match the required dim {}'.format(v.shape, (emb_dim,)))
            if vocab.has_word(w):
                embedding_tensor[vocab[w]] = v

        # save and return the result
        # with open(emb_pkl, "wb") as f:
        #     _pickle.dump((embedding_tensor, vocab), f)
        return embedding_tensor, vocab
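A hedged usage sketch of this method, assuming it is a static method on `EmbedLoader` as the call to `EmbedLoader._load_pretrain` suggests; the GloVe and pickle paths are placeholders, and `vocab` may be `None` to have the vocabulary built from the embedding file:

# Placeholder paths; only the call pattern is taken from the example above.
emb_file = "glove.6B.50d.txt"   # GloVe text file (assumed path)
emb_pkl = "glove_cache.pkl"     # cache path; the pickling branch is commented out above

# Option 1: let the loader build the vocabulary from the pre-trained file.
embedding_tensor, vocab = EmbedLoader.load_embedding(
    emb_dim=50, emb_file=emb_file, emb_type="glove", vocab=None, emb_pkl=emb_pkl)

# Option 2: restrict the embedding to an existing vocabulary.
my_vocab = Vocabulary()
my_vocab.update(["hello", "world"])
embedding_tensor, my_vocab = EmbedLoader.load_embedding(
    emb_dim=50, emb_file=emb_file, emb_type="glove", vocab=my_vocab, emb_pkl=emb_pkl)

# Rows for words missing from the pre-trained file stay randomly initialised
# (torch.randn), as in the loop above.
print(embedding_tensor.shape)   # torch.Size([<vocab size>, 50])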
Example #4
    def load_embedding(emb_dim, emb_file, emb_type, vocab):
        """Load the pre-trained embedding and combine with the given dictionary.

        :param int emb_dim: the dimension of the embedding. Should be the same as pre-trained embedding.
        :param str emb_file: the pre-trained embedding file path.
        :param str emb_type: the pre-trained embedding format, support glove now
        :param Vocabulary vocab: a mapping from word to index, can be provided by user or built from pre-trained embedding
        :return embedding_tensor: Tensor of shape (len(word_dict), emb_dim)
                vocab: input vocab or vocab built by pre-train

        """
        pretrain = EmbedLoader._load_pretrain(emb_file, emb_type)
        if vocab is None:
            # build vocabulary from pre-trained embedding
            vocab = Vocabulary()
            for w in pretrain.keys():
                vocab.add(w)
        embedding_tensor = torch.randn(len(vocab), emb_dim)
        for w, v in pretrain.items():
            if len(v.shape) > 1 or emb_dim != v.shape[0]:
                raise ValueError(
                    "Pretrained embedding dim is {}. Dimension mismatch. Required {}".format(v.shape, (emb_dim,)))
            if vocab.has_word(w):
                embedding_tensor[vocab[w]] = v
        return embedding_tensor, vocab
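Usage is the same for this newer signature, which drops the `emb_pkl` caching argument. A minimal sketch, again with a placeholder GloVe path and assuming the method is called on `EmbedLoader`:

# Placeholder path; only the signature comes from the example above.
vocab = Vocabulary()
vocab.update(["the", "quick", "brown", "fox"])

embedding_tensor, vocab = EmbedLoader.load_embedding(
    emb_dim=50, emb_file="glove.6B.50d.txt", emb_type="glove", vocab=vocab)

# embedding_tensor has one row per vocabulary entry; entries found in the GloVe
# file are overwritten with the pre-trained vectors, the rest remain random.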