import torch
from torch.distributions import Beta, Normal


def update(self):
    if not self.memory.full():
        return
    batch = self.memory.sample()
    Zs = self.discount(batch.r)  # discounted returns

    # Policy: log-probability of the taken actions under the current policy.
    mu_policy, sigma_policy = self.policy(batch.s)
    log_prob_policy = Normal(mu_policy, sigma_policy).log_prob(
        batch.a).mean(dim=1, keepdim=True)

    # Credit: log-probability of the same actions under the return-conditioned
    # (hindsight) distribution.
    mu_credit, sigma_credit = self.credit(batch.s, Zs)
    log_prob_credit = Normal(mu_credit, sigma_credit).log_prob(
        batch.a).mean(dim=1, keepdim=True)

    # Hindsight ratio and advantage; the credit network is held fixed here.
    ratio = torch.exp(log_prob_policy - log_prob_credit.detach())
    A = (1 - ratio) * Zs.unsqueeze(1)
    policy_loss = -(A.T @ log_prob_policy) / batch.r.size(0)
    self.policy_optim.zero_grad()
    policy_loss.backward()

    # Credit loss: fit the credit distribution to the (detached) policy log-probs.
    credit_loss = -torch.mean(log_prob_policy.detach() * log_prob_credit)
    self.credit_optim.zero_grad()
    credit_loss.backward()

    torch.nn.utils.clip_grad_norm_(self.policy.parameters(), 0.7)
    torch.nn.utils.clip_grad_norm_(self.credit.parameters(), 0.7)
    self.policy_optim.step()
    self.credit_optim.step()
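
# For intuition, a small self-contained illustration of the (1 - ratio) * Zs
# weighting used in update() above; the helper name and the toy tensors below
# are made up for demonstration and are not part of the original code.
def _hindsight_weighting_demo():
    Zs = torch.tensor([1.0, -0.5, 2.0])                        # discounted returns
    log_prob_policy = torch.tensor([[-1.2], [-0.8], [-1.0]])   # policy log-probs
    log_prob_credit = torch.tensor([[-0.9], [-0.8], [-1.5]])   # hindsight log-probs
    ratio = torch.exp(log_prob_policy - log_prob_credit)
    # A return is discounted when the hindsight distribution explains the
    # action better than the policy does (ratio > 1 makes the weight negative).
    A = (1 - ratio) * Zs.unsqueeze(1)
    print(A)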
def select_actions(pi, dist_type):
    if dist_type == 'gauss':
        mean, std = pi
        actions = Normal(mean, std).sample()
    else:
        raise NotImplementedError
    return actions.detach().cpu().numpy().squeeze()
def select_actions(pi, dist_type):
    if dist_type == 'gauss':
        mean, std = pi
        actions = Normal(mean, std).sample()
    elif dist_type == 'beta':
        alpha, beta = pi
        actions = Beta(alpha.detach().cpu(), beta.detach().cpu()).sample()
    else:
        raise NotImplementedError
    return actions.detach().cpu().numpy()[0]
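
# A quick usage sketch for select_actions, assuming a policy head that returns
# distribution parameters as tensors; the demo name and toy tensors here stand
# in for a network's output and are not part of the original code.
def _select_actions_demo():
    mean, std = torch.zeros(1, 2), torch.ones(1, 2)
    a_gauss = select_actions((mean, std), dist_type='gauss')
    alpha, beta = torch.full((1, 2), 2.0), torch.full((1, 2), 3.0)
    a_beta = select_actions((alpha, beta), dist_type='beta')  # Beta samples lie in (0, 1)
    print(a_gauss, a_beta)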
def _action_selection(self, action_mean, action_std, exploration=True):
    if exploration:
        # Stochastic action for exploration.
        action = Normal(action_mean, action_std).sample()
    else:
        # Greedy action: use the distribution mean.
        action = action_mean
    return action.detach().cpu().numpy()[0]