def forward(self, batch):  # pylint:disable=arguments-differ
    ''' A batch of inputs and targets '''
    encoded, encoder_attn_weights_tensor = self.encode(batch['inputs'])
    # Teacher forcing: shift the targets right by `span`, filling with SOS
    decoded = self.decode(
        encoded,
        right_shift(right_shift(batch['targets']), shift=self.span - 1, fill=self.sos_idx),
    )

    logits = decoded['logits']
    dims = list(range(1, logits.dim()))
    targets = left_shift(batch['targets'])
    nll = self.cross_entropy(logits, targets).sum(dims[:-1])
    smoothed_nll = self.label_smoothing(logits, targets).sum(dims)

    return {
        'smoothed_nll': smoothed_nll,
        'nll': nll,
        'encoder_attn_weights_tensor': encoder_attn_weights_tensor,
        'decoder_attn_weights_tensor': decoded['decoder_attn_weights_tensor'],
        'enc_dec_attn_weights_tensor': decoded['enc_dec_attn_weights_tensor']
    }
def forward(self, batch):  # pylint:disable=arguments-differ
    ''' A batch of inputs and targets '''
    decoded = self.decode(
        self.encode(batch['inputs']),
        right_shift(right_shift(batch['targets']), shift=self.span - 1, fill=self.sos_idx),
        input_lens=batch['input_lens']
    )

    logits = decoded['logits']
    dims = list(range(1, logits.dim()))
    targets = left_shift(batch['targets'])
    nll = self.cross_entropy(logits, targets).sum(dims[:-1])
    smoothed_nll = self.label_smoothing(logits, targets).sum(dims)

    return smoothed_nll, nll
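# Note: the two forwards above lean on left_shift/right_shift helpers that are
# not part of this excerpt. A minimal, self-consistent sketch of what they
# plausibly do for (bsz x len) token tensors follows; the signatures and the
# default `fill` behaviour are assumptions, not the repo's actual code.
import torch

def left_shift(sequence):
    ''' Targets: drop the first token, so position t holds token t + 1. '''
    return sequence[:, 1:]

def right_shift(sequence, shift=1, fill=None):
    ''' Decoder inputs: drop the last `shift` tokens and, when `fill` is given
    (e.g. an SOS index), prepend `shift` fill tokens to preserve length. '''
    shifted = sequence[:, :-shift]
    if fill is None:
        return shifted
    filler = sequence.new_full((sequence.shape[0], shift), fill)
    return torch.cat((filler, shifted), dim=1)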
def __init__(self, num_filters_in, num_filters_out, filter_size=(2, 2),
             stride=(1, 1), shift_output_right=False, norm='weight_norm'):
    super(down_right_shifted_conv2d, self).__init__()
    assert norm in [None, 'batch_norm', 'weight_norm']

    # Pad only on the top and left so the convolution is causal: each output
    # position sees inputs above and to the left of itself
    self.pad = nn.ZeroPad2d((filter_size[1] - 1, 0, filter_size[0] - 1, 0))
    self.conv = nn.Conv2d(num_filters_in, num_filters_out, filter_size, stride=stride)
    self.shift_output_right = shift_output_right
    self.norm = norm

    if norm == 'weight_norm':
        self.conv = wn(self.conv)  # wn is presumably torch.nn.utils.weight_norm
    elif norm == 'batch_norm':
        self.bn = nn.BatchNorm2d(num_filters_out)

    if shift_output_right:
        self.right_shift = lambda x: right_shift(x, pad=nn.ZeroPad2d((1, 0, 0, 0)))
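# down_right_shifted_conv2d's forward is not shown, and the right_shift it
# closes over works on image tensors rather than token ids. A plausible
# PixelCNN++-style completion (hypothetical; the actual code may differ):
import torch.nn as nn

def right_shift(x, pad=None):
    ''' Drop the last column of a (bsz, c, h, w) tensor and zero-pad a new
    first column, shifting the feature map one pixel to the right. '''
    pad = pad if pad is not None else nn.ZeroPad2d((1, 0, 0, 0))
    return pad(x[:, :, :, :-1])

def forward(self, x):
    ''' Sketch of the module's forward: causal pad, convolve, normalize,
    then optionally shift the output right. '''
    x = self.conv(self.pad(x))
    if self.norm == 'batch_norm':
        x = self.bn(x)
    return self.right_shift(x) if self.shift_output_right else x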
def forward(self, batch, global_mask=None):  # pylint:disable=arguments-differ
    ''' batch: length x bsz '''
    batch = batch.transpose(1, 0)
    targets = left_shift(batch)
    decoded = self.decode(right_shift(batch), global_mask=global_mask)
    state = decoded['state']

    if not self.adaptive:
        logits = self.embedding(state, reverse=True).transpose(2, 1)
        dims = list(range(1, logits.dim()))
        nll = self.cross_entropy(logits, targets).view(-1)
        smoothed_nll = self.label_smoothing(logits, targets).sum(dims)
        if not self.config.return_rank:
            return smoothed_nll, nll

        # Rank of each gold token: how many logits strictly exceed its logit
        logits = logits.transpose(2, 1)
        assert targets.shape[0] == 1
        targets = targets.squeeze(0)
        target_logits = logits[:, range(targets.shape[0]), targets]
        rank = (logits > target_logits.unsqueeze(-1)).sum(dim=-1)
        return rank, nll

    # Adaptive softmax: if the state is longer than `batch_length` (e.g. due
    # to cached context), only score the final `batch_length` positions
    if self.config.batch_length < state.size(1):
        state = state[:, -self.config.batch_length:].contiguous()
        targets = targets[:, -self.config.batch_length:].contiguous()
    state = state.view(-1, state.shape[-1])  # (bsz*L, embed_dim)
    targets = targets.contiguous().view(-1)  # (bsz*L,)
    if not self.config.return_rank:
        nll = self.adaptive_softmax(state, targets, keep_order=True)
        smoothed_nll = nll
        return smoothed_nll, nll

    nll, rank = self.adaptive_softmax(state, targets, keep_order=True, return_rank=True)
    return rank, nll
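# The return_rank branch above uses advanced indexing that is easy to misread;
# a standalone toy check of exactly that logic (made-up logits and targets):
import torch

logits = torch.tensor([[[0.1, 0.9, 0.4],    # step 0: gold id 1 has the top logit
                        [0.8, 0.2, 0.5]]])  # step 1: gold id 2 is beaten by id 0
targets = torch.tensor([1, 2])
target_logits = logits[:, range(targets.shape[0]), targets]  # logit of each gold token
rank = (logits > target_logits.unsqueeze(-1)).sum(dim=-1)    # strictly-larger count
print(rank)  # tensor([[0, 1]]) -- gold is top-1 at step 0, rank 1 at step 1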
def forward(self, batch):  # pylint:disable=arguments-differ
    ''' A batch of inputs and targets '''
    encoded = self.encode(batch['inputs'])
    decoded_annotation = self.decode_annotation(
        encoded, right_shift(batch['target_annotations']))

    logits = decoded_annotation['logits']
    loss = nll = self.annotation_cross_entropy(
        logits, left_shift(batch['target_annotations'])).sum(
            list(range(1, logits.dim() - 1)))

    if self.decoders is not None:
        decoded = self.decode(encoded, batch['masked_targets'])
        logits = decoded['logits']
        nll += self.cross_entropy(logits, batch['targets']).sum(
            list(range(1, logits.dim() - 1)))
        loss += self.label_smoothing(logits, batch['targets']).sum(
            list(range(1, logits.dim())))

    return loss, nll
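# Usage sketch: the forwards in this section consume dicts of (bsz x len)
# LongTensors. The key names are taken from the snippets above; the shapes,
# vocab size, and values here are made up for illustration.
import torch

bsz, src_len, tgt_len, vocab = 2, 7, 9, 100
batch = {
    'inputs': torch.randint(vocab, (bsz, src_len)),
    'targets': torch.randint(vocab, (bsz, tgt_len)),
    'input_lens': torch.full((bsz,), src_len),  # used by the second forward
    'target_annotations': torch.randint(vocab, (bsz, tgt_len)),
    'masked_targets': torch.randint(vocab, (bsz, tgt_len)),
}
# loss, nll = model(batch)  # with a model exposing one of the forwards above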