Code example #1
File: centralized_ma_ppo.py  Project: parachutel/DICG
    def _compute_kl_constraint(self, obs, avail_actions, actions=None):
        """Compute KL divergence.

        Compute the KL divergence between the old policy distribution and
        current policy distribution.

        Args:
            obs (torch.Tensor): Observation from the environment.
            avail_actions (torch.Tensor): Mask of the actions available to
                each agent.
            actions (torch.Tensor): Actions taken; only passed through to
                recurrent policies. Defaults to None.

        Returns:
            torch.Tensor: Calculated mean KL divergence.

        """
        if self.policy.recurrent:
            # Evaluate the frozen old policy without tracking gradients.
            # A DICG policy's forward() returns an extra value, hence the
            # tuple unpacking in each branch.
            with torch.no_grad():
                if hasattr(self.policy, 'dicg'):
                    old_dist, _ = self._old_policy.forward(
                        obs, avail_actions, actions)
                else:
                    old_dist = self._old_policy.forward(
                        obs, avail_actions, actions)

            # Evaluate the current policy with gradients enabled.
            if hasattr(self.policy, 'dicg'):
                new_dist, _ = self.policy.forward(obs, avail_actions, actions)
            else:
                new_dist = self.policy.forward(obs, avail_actions, actions)

        else:
            # Feed-forward policies operate on flattened (time * batch)
            # inputs, so collapse the leading dimensions first.
            flat_obs = flatten_batch(obs)
            flat_avail_actions = flatten_batch(avail_actions)
            with torch.no_grad():
                if hasattr(self.policy, 'dicg'):
                    old_dist, _ = self._old_policy.forward(
                        flat_obs, flat_avail_actions)
                else:
                    old_dist = self._old_policy.forward(
                        flat_obs, flat_avail_actions)

            if hasattr(self.policy, 'dicg'):
                new_dist, _ = self.policy.forward(flat_obs, flat_avail_actions)
            else:
                new_dist = self.policy.forward(flat_obs, flat_avail_actions)

        # Per-sample KL(old || new) between the two distributions.
        kl_constraint = torch.distributions.kl.kl_divergence(
            old_dist, new_dist)

        return kl_constraint.mean()
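
For reference, torch.distributions.kl.kl_divergence takes two distribution
objects and returns one KL value per batch element; the .mean() call above
reduces that to the scalar the method returns. A minimal standalone sketch
(the Categorical probabilities below are illustrative values, not taken
from the project):

import torch
from torch.distributions import Categorical, kl_divergence

# Hypothetical old/new action distributions over 4 actions for 3 states.
old_probs = torch.tensor([[0.25, 0.25, 0.25, 0.25],
                          [0.70, 0.10, 0.10, 0.10],
                          [0.40, 0.30, 0.20, 0.10]])
new_probs = torch.tensor([[0.30, 0.20, 0.25, 0.25],
                          [0.60, 0.20, 0.10, 0.10],
                          [0.40, 0.30, 0.20, 0.10]])

old_dist = Categorical(probs=old_probs)
new_dist = Categorical(probs=new_probs)

# One KL value per state, then the scalar mean used as the constraint.
kl_per_state = kl_divergence(old_dist, new_dist)
print(kl_per_state.shape)   # torch.Size([3])
print(kl_per_state.mean())  # scalar tensor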
Code example #2
    def _compute_kl_constraint(self, obs):
        """Compute KL divergence.

        Compute the KL divergence between the old policy distribution and
        current policy distribution.

        Args:
            obs (torch.Tensor): Observation from the environment.

        Returns:
            torch.Tensor: Calculated mean KL divergence.

        """
        flat_obs = flatten_batch(obs)
        # Old policy distribution, detached from the computation graph.
        with torch.no_grad():
            old_dist = self._old_policy.forward(flat_obs)

        # Current policy distribution, with gradients for the update.
        new_dist = self.policy.forward(flat_obs)

        kl_constraint = torch.distributions.kl.kl_divergence(
            old_dist, new_dist)

        return kl_constraint.mean()
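
Both examples import a flatten_batch helper from the surrounding codebase.
A plausible minimal sketch, assuming the helper merges the leading time and
batch dimensions and leaves the rest of the shape intact (an assumption,
not necessarily the project's exact code):

import torch

def flatten_batch(tensor):
    # Sketch of the helper the examples import: assumes input shaped
    # (T, B, *) and collapses the leading two dims into one.
    return tensor.reshape((-1,) + tensor.shape[2:])

# Usage: a (time=5, batch=8, obs_dim=3) tensor becomes (40, 3).
obs = torch.zeros(5, 8, 3)
print(flatten_batch(obs).shape)  # torch.Size([40, 3])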