Example #1
    def output_layer(self, features, **kwargs):
        """Project features to the vocabulary size."""
        features = copy_to_model_parallel_region(features)

        # project back to size of vocabulary
        if self.share_input_output_embed:
            x = F.linear(features, self.embed_tokens.weight)
        else:
            x = F.linear(features, self.embed_out)

        # the vocab-parallel cross-entropy criterion consumes the sharded logits directly,
        # so gathering across model-parallel ranks is only needed for other criteria
        if getattr(self.args, 'criterion') != 'vocab_parallel_cross_entropy':
            x = gather_from_model_parallel_region(x).contiguous()
        return x
Example #2
    def output_layer(self, features, **kwargs):
        """Project features to the vocabulary size."""
        if not self.share_input_output_embed:
            raise NotImplementedError(
                "Model parallel training currently requires --share-decoder-input-output-embed"
            )

        features = copy_to_model_parallel_region(features)

        # project back to size of vocabulary
        x = self.output_projection(features)

        if getattr(self.args, "criterion") != "vocab_parallel_cross_entropy":
            x = gather_from_model_parallel_region(x).contiguous()
        return x
Example #3
    def forward(self, features, masked_tokens=None, **kwargs):
        # Only project the masked tokens while training,
        # which saves both memory and computation
        if masked_tokens is not None:
            features = features[masked_tokens, :]

        # LM head transform before the output projection: dense -> activation -> layer norm
        x = self.dense(features)
        x = self.activation_fn(x)
        x = self.layer_norm(x)

        x = copy_to_model_parallel_region(x)
        # project back to the vocabulary using the vocab-sharded weight;
        # each model-parallel rank produces only its slice of the logits
        x = F.linear(x, self.weight)
        # gather the logit slices from all ranks, then add the full-size bias
        x = gather_from_model_parallel_region(x).contiguous()
        x = x + self.bias
        return x
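
All three examples follow the same Megatron-style pattern: copy the features into the model-parallel region, project them with a vocabulary-sharded weight so each rank produces only a slice of the logits, and gather the slices unless the criterion can consume them sharded. The sketch below is a minimal single-process illustration of that pattern only; sharded_output_layer, weight_shards, and the torch.cat stand-in for gather_from_model_parallel_region are hypothetical and not part of fairseq or Megatron.

import torch
import torch.nn.functional as F

def sharded_output_layer(features, weight_shards, bias):
    # each shard holds one slice of the vocabulary: [vocab_size // num_shards, embed_dim]
    partial_logits = [F.linear(features, w) for w in weight_shards]
    # stand-in for gather_from_model_parallel_region: concatenate the slices
    # along the vocabulary dimension to recover full-size logits
    logits = torch.cat(partial_logits, dim=-1).contiguous()
    # as in Example #3, the bias is kept full-size and added after the gather
    return logits + bias

# toy usage: embedding dim 4, vocabulary of 8 split across 2 "ranks"
features = torch.randn(3, 4)
weight_shards = [torch.randn(4, 4), torch.randn(4, 4)]
bias = torch.zeros(8)
print(sharded_output_layer(features, weight_shards, bias).shape)  # torch.Size([3, 8])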