import torch
from torch.utils.data import DataLoader

# Tokenizer, Config, Seq2Seq, MyDataSet and the `pad` collate function are
# project-local; these import paths are assumptions, adjust them to match
# the actual module layout.
from tokenizer import Tokenizer
from config import Config
from model import Seq2Seq
from dataset import MyDataSet, pad

source_path = '../data/test/source.txt'  # assumed path, mirroring target_path
target_path = '../data/test/target.txt'
vocab_path = '../data/vocab.txt'
model_path = '../model/model.pth'

tokenizer = Tokenizer(vocab_path)
config = Config()
fr = open('../result/test.txt', 'w', encoding='utf-8-sig')  # output file for the predictions

# shuffle=False keeps the predictions aligned with target.txt;
# drop_last=False keeps the final (possibly smaller) batch.
loader = DataLoader(dataset=MyDataSet(source_path, target_path, tokenizer),
                    batch_size=config.batch_size,
                    shuffle=False,
                    num_workers=2,
                    collate_fn=pad,
                    drop_last=False)

if not torch.cuda.is_available():
    print('No cuda is available!')
    exit()
device = torch.device('cuda:0')

model = Seq2Seq(config)
model.to(device)

# Load the trained weights and switch to inference mode.
checkpoint = torch.load(model_path, map_location=device)
model.load_state_dict(checkpoint['model'])
model.eval()

with torch.no_grad():
    for batch_x, batch_y, batch_source_lens, batch_target_lens in loader:
        batch_x = batch_x.to(device)
        batch_source_lens = torch.as_tensor(batch_source_lens)
        # Predicted token ids (and the attention weights at each decoding step)
        results = model.BatchSample(batch_x, batch_source_lens)
        for i, ids in enumerate(results):
            words = tokenizer.convert_ids_to_tokens(ids)
            if i % 100 == 0:  # print an occasional sample to monitor progress
                print(''.join(words))
            fr.write(''.join(words))
            fr.write('\n')
fr.close()
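
# For reference, a sketch of what the `pad` collate function used above might
# look like. Everything here is an assumption rather than the project's actual
# code: the name `pad_collate_example` and the premise that each MyDataSet item
# is a (source_ids, target_ids) pair of token-id lists are hypothetical.
def pad_collate_example(batch, pad_id=0):
    """Pad variable-length id sequences to the batch maximum and return the
    four-tuple that the inference loop above unpacks."""
    sources, targets = zip(*batch)
    source_lens = [len(s) for s in sources]
    target_lens = [len(t) for t in targets]
    max_src, max_tgt = max(source_lens), max(target_lens)
    batch_x = torch.tensor([list(s) + [pad_id] * (max_src - len(s)) for s in sources])
    batch_y = torch.tensor([list(t) + [pad_id] * (max_tgt - len(t)) for t in targets])
    return batch_x, batch_y, source_lens, target_lens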