def _dataset_fn(ctx=None):
  """Returns a tf.data.Dataset of BERT pretraining examples.

  Args:
    ctx: Optional `tf.distribute.InputContext` supplied by the distribution
      strategy. Previously this was discarded (`del ctx`), which meant every
      replica read the full dataset instead of its own shard; it is now
      forwarded to the input pipeline, matching the sibling `_dataset_fn`
      in this file.

  Returns:
    A tf.data.Dataset produced by `input_pipeline.create_pretrain_dataset`.
  """
  input_files = []
  # Expand each comma-separated glob pattern into concrete file paths.
  for input_pattern in input_file_pattern.split(','):
    input_files.extend(tf.io.gfile.glob(input_pattern))
  # Pass `ctx` through so each replica reads only its shard of the files.
  train_dataset = input_pipeline.create_pretrain_dataset(
      input_files, seq_length, max_predictions_per_seq, batch_size,
      is_training=True, input_pipeline_context=ctx)
  return train_dataset
def get_pretrain_input_data(input_file_pattern, seq_length,
                            max_predictions_per_seq, batch_size):
  """Builds the pretraining dataset from a comma-separated file pattern.

  Args:
    input_file_pattern: Comma-separated string of file glob patterns.
    seq_length: Maximum sequence length of each example.
    max_predictions_per_seq: Maximum number of masked-LM predictions.
    batch_size: Per-batch number of examples.

  Returns:
    A tf.data.Dataset produced by `input_pipeline.create_pretrain_dataset`.
  """
  # Resolve every glob pattern in the comma-separated list to real files.
  patterns = input_file_pattern.split(',')
  matched_files = [
      filename for pattern in patterns for filename in tf.io.gfile.glob(pattern)
  ]
  return input_pipeline.create_pretrain_dataset(
      matched_files, seq_length, max_predictions_per_seq, batch_size)
def _dataset_fn(ctx=None):
  """Returns tf.data.Dataset for distributed BERT pretraining."""
  # Expand each comma-separated glob pattern into its matching file paths.
  matched_files = [
      filename
      for pattern in input_file_pattern.split(',')
      for filename in tf.io.gfile.glob(pattern)
  ]
  # `ctx` (when provided by the distribution strategy) lets the pipeline
  # shard the input files per replica.
  return input_pipeline.create_pretrain_dataset(
      matched_files,
      seq_length,
      max_predictions_per_seq,
      batch_size,
      is_training=True,
      input_pipeline_context=ctx)