Example #1
    def evaluate_action(self, state):
        '''
        Evaluate an action for the given state inside the GPU computation
        graph, so that gradients can flow through it. Stochastic policies
        also return the log-probability and entropy of the sampled action.
        '''
        state = torch.FloatTensor(state).unsqueeze(0).to(device) # state dim: (N, dim of state)
        if DETERMINISTIC:
            # deterministic policy: the network output is the action itself
            action = self.forward(state)
            return action.detach().cpu().numpy()

        elif DISCRETE and not DETERMINISTIC:  # actor-critic (discrete)
            probs = self.forward(state)   # action probabilities, shape (N, n_actions)
            m = Categorical(probs)
            action = m.sample()           # already on the same device as probs
            log_prob = m.log_prob(action)

            return action.detach().cpu().numpy(), log_prob.squeeze(0), m.entropy().mean()

        elif not DISCRETE and not DETERMINISTIC:  # soft actor-critic (continuous)
            self.action_range = 30.   # scale applied to the squashed action
            self.epsilon = 1e-6       # numerical-stability constant for the log term

            mean, log_std = self.forward(state)
            std = log_std.exp()

            # reparameterization trick: sample z ~ N(0, 1), then squash with tanh
            normal = Normal(0, 1)
            z = normal.sample().to(device)
            action0 = torch.tanh(mean + std * z)  # TanhNormal-distributed action in (-1, 1)
            action = self.action_range * action0

            # log-prob of the pre-squash sample, corrected for the tanh change of
            # variables and for the action_range scaling
            log_prob = (Normal(mean, std).log_prob(mean + std * z)
                        - torch.log(1. - action0.pow(2) + self.epsilon)
                        - np.log(self.action_range))
            log_prob = log_prob.sum(dim=1, keepdim=True)

            print('mean: ', mean, 'log_std: ', log_std)  # debug output

            return action.detach().cpu().numpy().squeeze(0), log_prob.squeeze(0), Normal(mean, std).entropy().mean()
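
This snippet assumes module-level imports of torch, numpy as np, and Normal/Categorical from torch.distributions, plus the device, DETERMINISTIC, and DISCRETE globals. In the continuous branch, the log-probability is the Gaussian log-density of the pre-squash sample u = mean + std * z, corrected by the tanh change of variables, log pi(a) = log N(u; mean, std) - log(1 - tanh(u)^2) - log(action_range), summed over action dimensions. Below is a minimal, hypothetical usage sketch (not part of the original source) showing how the returned log-probability and entropy could drive a simple policy-gradient style update; policy, optimizer, state, and advantage are assumed to already exist, and the loss is a simplified REINFORCE-style surrogate rather than the full SAC objective.

# Hypothetical usage sketch: how the values returned by evaluate_action
# might feed a simple policy-gradient update. `policy` is assumed to be the
# module defining evaluate_action, `optimizer` an optimizer over its
# parameters; `state` and `advantage` stand in for environment / critic data.
action, log_prob, entropy = policy.evaluate_action(state)

# maximize log_prob * advantage plus a small entropy bonus for exploration
loss = -(log_prob * advantage).mean() - 1e-3 * entropy

optimizer.zero_grad()
loss.backward()
optimizer.step()

Because the action itself is returned detached as a numpy array, gradients reach the network only through log_prob and entropy, which is why both are returned still attached to the graph.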