Python ClippedObjective Examples

Programming language: Python

Namespace/package name: trax.rl.rl_layers

Method/function: ClippedObjective

Examples at hotexamples.com: 2

Python ClippedObjective - 2 examples found. These are the top-rated real-world Python examples of trax.rl.rl_layers.ClippedObjective extracted from open source projects.

Example #1

File: actor_critic_joint.py Project: hugochan/trax

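 # Assumed imports; the original file may import jnp differently.
 import jax.numpy as jnp
 from trax.rl import rl_layers

 # Excerpt: defined inside a method, so `self` comes from the enclosing scope.
 def ClippedObjectiveMean(
     dist_inputs, values, returns, actions, old_log_probs):
   """Clipped objective from the PPO algorithm."""
   # Advantage estimate: returns minus the value baseline.
   advantages = returns - values
   # Ratio of new to old action probabilities.
   probs_ratio = rl_layers.ProbsRatio(
       dist_inputs, actions, old_log_probs,
       log_prob_fun=self._policy_dist.log_prob)
   clipped_objective = rl_layers.ClippedObjective(
       probs_ratio, advantages, epsilon=self._epsilon)
   return jnp.mean(clipped_objective)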
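
To make the computation above easier to follow, here is a minimal NumPy sketch of the standard PPO clipped surrogate. It is not the trax source: the helper names `probs_ratio` and `clipped_objective` and the sample values are hypothetical, and the formula min(r * A, clip(r, 1 - eps, 1 + eps) * A) is the textbook PPO form, assumed here to be what `rl_layers.ClippedObjective` computes.

```python
import numpy as np


def probs_ratio(new_log_probs, old_log_probs):
    """Ratio pi_new(a|s) / pi_old(a|s), computed from log-probabilities."""
    return np.exp(new_log_probs - old_log_probs)


def clipped_objective(ratio, advantages, epsilon):
    """Elementwise PPO clipped surrogate (hypothetical stand-in, not trax code)."""
    if ratio.shape != advantages.shape:
        raise ValueError(
            f"shape mismatch: {ratio.shape} vs {advantages.shape}")
    # Keep the ratio within [1 - epsilon, 1 + epsilon].
    clipped_ratio = np.clip(ratio, 1.0 - epsilon, 1.0 + epsilon)
    # Take the more pessimistic of the unclipped and clipped terms.
    return np.minimum(ratio * advantages, clipped_ratio * advantages)


# Illustrative values only: four actions with old probability 0.4 each.
old_log_probs = np.log(np.array([0.4, 0.4, 0.4, 0.4]))
new_log_probs = np.log(np.array([0.2, 0.6, 0.2, 0.6]))
advantages = np.array([1.0, 1.0, -1.0, -1.0])

ratio = probs_ratio(new_log_probs, old_log_probs)  # approx. [0.5, 1.5, 0.5, 1.5]
objective = clipped_objective(ratio, advantages, epsilon=0.2)
mean_objective = objective.mean()
```

With epsilon set to 0.2, the clip only changes the result where it lowers the objective: a ratio of 1.5 with a positive advantage is capped at 1.2, and a ratio of 0.5 with a negative advantage is raised to 0.8 before multiplying.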

Example #2

File: actor_critic_joint.py Project: srush/trax

 # Assumed imports; the original file may import jnp differently.
 import jax.numpy as jnp
 from trax.rl import rl_layers

 # Excerpt: defined inside a method, so `self` comes from the enclosing scope.
 def f(dist_inputs, values, returns, actions, old_log_probs):
   """Clipped objective from the PPO algorithm."""
   advantages = returns - values
   probs_ratio = rl_layers.ProbsRatio(
       dist_inputs, actions, old_log_probs,
       log_prob_fun=self._policy_dist.log_prob)
   # Here advantages has shape [128, 1, 1] and probs_ratio has shape
   # [128, 1]; drop the trailing axis so the two shapes match.
   advantages = advantages.squeeze(axis=2)
   clipped_objective = rl_layers.ClippedObjective(
       probs_ratio, advantages, epsilon=self._epsilon)
   return jnp.mean(clipped_objective)
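
The shape comment in this example can be checked with plain NumPy. The sketch below uses a batch of 4 instead of 128 and made-up values, and it shows only how NumPy broadcasting treats the two shapes. How the trax layer itself handles mismatched shapes is not shown in the source, so nothing here should be read as a description of its behavior.

```python
import numpy as np

batch = 4  # stands in for the 128 mentioned in the comment

# Illustrative arrays with the shapes described in the example.
probs_ratio = np.ones((batch, 1))
advantages = np.arange(batch, dtype=np.float64).reshape(batch, 1, 1)

# Without the squeeze, broadcasting silently pairs every ratio with
# every advantage instead of multiplying element by element.
broadcast_product = probs_ratio * advantages

# Dropping the trailing axis restores the intended elementwise product.
squeezed = advantages.squeeze(axis=2)
aligned_product = probs_ratio * squeezed

print(broadcast_product.shape)
print(aligned_product.shape)
```

The unsqueezed product has shape (4, 4, 1), so taking its mean would average over all ratio and advantage pairs. After the squeeze the product has shape (4, 1), one value per batch element.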