def query_to_encoder_features(sentence, vocabs, FLAGS):
    """Convert a natural language query into feature vectors used by the encoder."""
    # Tokenize the query according to the input channel.
    if FLAGS.channel == 'char':
        tokens = data_utils.nl_to_characters(sentence)
        init_vocab = data_utils.CHAR_INIT_VOCAB
    elif FLAGS.channel == 'partial.token':
        tokens = data_utils.nl_to_partial_tokens(sentence, tokenizer.basic_tokenizer)
        init_vocab = data_utils.TOKEN_INIT_VOCAB
    else:
        if FLAGS.normalized:
            tokens = data_utils.nl_to_tokens(sentence, tokenizer.ner_tokenizer)
        else:
            tokens = data_utils.nl_to_tokens(sentence, tokenizer.basic_tokenizer)
        init_vocab = data_utils.TOKEN_INIT_VOCAB

    # Map the tokens to ids in the source vocabulary.
    sc_ids = data_utils.tokens_to_ids(tokens, vocabs.sc_vocab)
    encoder_features = [[sc_ids]]

    # For CopyNet, also compute ids in the extended (target + copy) vocabulary:
    # a source token that appears in the target vocabulary keeps its target id,
    # while any other token gets a position-based id past the end of the target
    # vocabulary so it can only be produced by copying.
    if FLAGS.use_copy and FLAGS.copy_fun == 'copynet':
        csc_ids = []
        for i, t in enumerate(tokens):
            if t not in init_vocab and t in vocabs.tg_vocab:
                csc_ids.append(vocabs.tg_vocab[t])
            else:
                csc_ids.append(len(vocabs.tg_vocab) + i)
        encoder_features.append([csc_ids])

    return encoder_features
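
# --- Illustration (hypothetical, not part of the module) ---------------------
# A minimal, self-contained sketch of the CopyNet indexing scheme used above,
# with a toy target vocabulary and token list standing in for `vocabs.tg_vocab`
# and the tokenizer output from `data_utils`.
def _copynet_source_ids_demo():
    init_vocab = {'_PAD', '_EOS', '_UNK'}          # placeholder special symbols
    tg_vocab = {'find': 0, 'file': 1, 'print': 2}  # toy target vocabulary
    tokens = ['find', 'every', 'file']             # toy source query tokens

    csc_ids = []
    for i, t in enumerate(tokens):
        if t not in init_vocab and t in tg_vocab:
            # Token is generatable: reuse its target-vocabulary id.
            csc_ids.append(tg_vocab[t])
        else:
            # Token can only be copied: assign a position-based id past the
            # end of the target vocabulary.
            csc_ids.append(len(tg_vocab) + i)
    return csc_ids

assert _copynet_source_ids_demo() == [0, 4, 1]
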
def query_to_copy_tokens(sentence, FLAGS):
    """Tokenize a query into the raw tokens that the decoder may copy.

    Case and lemmatization are preserved so that copied tokens reproduce the
    original surface forms in the query.
    """
    if FLAGS.channel == 'char':
        tokens = data_utils.nl_to_characters(sentence)
    elif FLAGS.channel == 'partial.token':
        tokens = data_utils.nl_to_partial_tokens(
            sentence, tokenizer.basic_tokenizer, to_lower_case=False,
            lemmatization=False)
    else:
        tokens = data_utils.nl_to_tokens(
            sentence, tokenizer.basic_tokenizer, to_lower_case=False,
            lemmatization=False)
    return tokens
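
# --- Illustration (hypothetical, not part of the module) ---------------------
# Sketch of why copy tokens keep their original surface form: when the decoder
# emits a copy decision, the raw token is substituted verbatim into the output.
# The `__COPY__` marker and all values below are made up for illustration only.
def _resolve_copies_demo():
    copy_tokens = ['Find', 'README.md', 'in', '/Home/User']  # raw surface forms
    decoded = ['find', '__COPY__1', '__COPY__3']             # toy decoder output
    resolved = []
    for tok in decoded:
        if tok.startswith('__COPY__'):
            # Replace the copy marker with the untouched query token.
            resolved.append(copy_tokens[int(tok[len('__COPY__'):])])
        else:
            resolved.append(tok)
    return resolved

assert _resolve_copies_demo() == ['find', 'README.md', '/Home/User']
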