Exemplo n.º 1
0
def benchmark_experimental_vocab_construction(vocab_file_path,
                                              is_raw_text=True,
                                              is_legacy=True,
                                              num_iters=1):
    """Time vocab construction from a file and print the elapsed seconds.

    Args:
        vocab_file_path: path to the vocab/text file to load from.
        is_raw_text: if True, treat the file as raw text; otherwise load it
            with ``load_vocab_from_file``.
        is_legacy: when ``is_raw_text`` is True, selects the legacy pure-Python
            loader (True) or the experimental jit-scripted tokenizer path (False).
        num_iters: number of construction repetitions included in the timing.
    """
    # Use a context manager so the file handle is always closed
    # (the original opened it and never closed it).
    with open(vocab_file_path, 'r') as f:
        t0 = time.monotonic()
        if is_raw_text:
            if is_legacy:
                print("Loading from raw text file with legacy python function")
                for _ in range(num_iters):
                    legacy_vocab_from_file_object(f)

                print("Construction time:", time.monotonic() - t0)
            else:
                print(
                    "Loading from raw text file with basic_english_normalize tokenizer"
                )
                for _ in range(num_iters):
                    # Tokenizer is rebuilt and re-scripted each iteration so the
                    # timing covers the full experimental construction path.
                    tokenizer = basic_english_normalize()
                    jited_tokenizer = torch.jit.script(tokenizer)
                    build_vocab_from_text_file(f, jited_tokenizer, num_cpus=1)
                print("Construction time:", time.monotonic() - t0)
        else:
            for _ in range(num_iters):
                load_vocab_from_file(f)
            print("Construction time:", time.monotonic() - t0)
Exemplo n.º 2
0
 def test_vocab_from_raw_text_file(self):
     """Build a vocab from the raw-text asset via a jit-scripted tokenizer
     and verify both the itos list and the stoi mapping, including the
     custom unk token placed at index 0."""
     asset_path = get_asset_path('vocab_raw_text_test.txt')
     jit_tokenizer = torch.jit.script(basic_english_normalize())
     v = build_vocab_from_text_file(asset_path, jit_tokenizer, unk_token='<new_unk>')
     expected_itos = ['<new_unk>', "'", 'after', 'talks', '.', 'are', 'at', 'disappointed',
                      'fears', 'federal', 'firm', 'for', 'mogul', 'n', 'newall', 'parent',
                      'pension', 'representing', 'say', 'stricken', 't', 'they', 'turner',
                      'unions', 'with', 'workers']
     # stoi is simply each token mapped to its position in itos.
     expected_stoi = dict(zip(expected_itos, range(len(expected_itos))))
     self.assertEqual(v.get_itos(), expected_itos)
     self.assertEqual(dict(v.get_stoi()), expected_stoi)