Python FeatureData.dim Examples

Programming language: Python

Namespace/package name: reagent.core.types

Class/type: FeatureData

Method/function: dim

Examples from hotexamples.com: 2

Python FeatureData.dim - 2 examples found. These are the top-rated real-world Python examples of reagent.core.types.FeatureData.dim, extracted from open source projects.

Frequently used methods

FeatureData(8)

dim(2)

repeat_interleave(2)

concat_user_doc(1)

float(1)

get_ranking_state(1)

get_tiled_batch(1)

size(1)

Example #1

    def act(self,
            obs: rlt.FeatureData,
            possible_actions_mask: Optional[np.ndarray] = None
            ) -> rlt.ActorOutput:
        """Act randomly regardless of the observation."""
        # pyre-fixme[35]: Target cannot be annotated.
        obs: torch.Tensor = obs.float_features
        assert obs.dim() >= 2, f"obs has shape {obs.shape} (dim < 2)"
        batch_size = obs.size(0)
        # pyre-fixme[6]: Expected `Union[torch.Size, torch.Tensor]` for 1st param
        #  but got `Tuple[int]`.
        action = self.dist.sample((batch_size, ))
        # sum over action_dim (since assuming i.i.d. per coordinate)
        log_prob = self.dist.log_prob(action).sum(1)
        return rlt.ActorOutput(action=action, log_prob=log_prob)
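
The excerpt above depends on a surrounding class and on ReAgent types, so it cannot run on its own. Note that dim() is called on obs.float_features, which is a torch.Tensor. The following is a minimal standalone sketch of the same pattern using only PyTorch. The helper name sample_uniform_actions, the action_dim parameter, and the choice of a Uniform distribution on [-1, 1) are assumptions made for illustration; the source does not show how self.dist is constructed.

```python
import torch
from torch.distributions import Uniform


def sample_uniform_actions(float_features: torch.Tensor, action_dim: int = 3):
    """Sample one random action per batch row (hypothetical helper)."""
    # dim() returns the number of tensor dimensions; the input is expected
    # to be at least (batch, features).
    assert float_features.dim() >= 2, (
        f"obs has shape {tuple(float_features.shape)} (dim < 2)"
    )
    batch_size = float_features.size(0)
    # Assumption: each action coordinate is uniform on [-1, 1).
    dist = Uniform(-torch.ones(action_dim), torch.ones(action_dim))
    action = dist.sample((batch_size,))  # shape: (batch_size, action_dim)
    # Coordinates are independent, so their log-probabilities add up.
    log_prob = dist.log_prob(action).sum(1)  # shape: (batch_size,)
    return action, log_prob


obs = torch.zeros(4, 10)  # batch of 4 observations with 10 features each
action, log_prob = sample_uniform_actions(obs)
```

Summing over dimension 1 gives one log-probability per batch row, which matches the sum(1) call in the excerpt.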

Example #2

    def act(self,
            obs: rlt.FeatureData,
            possible_actions_mask: Optional[np.ndarray] = None
            ) -> rlt.ActorOutput:
        """Act randomly regardless of the observation."""
        # pyre-fixme[35]: Target cannot be annotated.
        obs: torch.Tensor = obs.float_features
        assert obs.dim() >= 2, f"obs has shape {obs.shape} (dim < 2)"
        assert obs.shape[0] == 1, f"obs has shape {obs.shape} (0th dim != 1)"
        batch_size = obs.shape[0]
        scores = torch.ones((batch_size, self.num_actions))
        scores = apply_possible_actions_mask(scores,
                                             possible_actions_mask,
                                             invalid_score=0.0)

        # sample a random action
        m = torch.distributions.Categorical(scores)
        raw_action = m.sample()
        action = F.one_hot(raw_action, self.num_actions)
        log_prob = m.log_prob(raw_action).float()
        return rlt.ActorOutput(action=action, log_prob=log_prob)
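
As a rough standalone illustration of the discrete case, the sketch below samples uniformly among the allowed actions using only PyTorch. The helper name sample_masked_uniform is hypothetical, and the torch.where call is a stand-in for apply_possible_actions_mask, whose implementation is not shown in the excerpt; it assumes that masked-out actions receive a score of 0.0, as the invalid_score=0.0 argument suggests.

```python
from typing import Optional, Sequence

import torch
import torch.nn.functional as F


def sample_masked_uniform(num_actions: int,
                          possible_actions_mask: Optional[Sequence[int]] = None):
    """Sample one allowed action uniformly at random (hypothetical helper)."""
    # Equal score for every action, batch size fixed to 1 as in the excerpt.
    scores = torch.ones((1, num_actions))
    if possible_actions_mask is not None:
        mask = torch.as_tensor(possible_actions_mask, dtype=torch.bool)
        mask = mask.reshape(1, num_actions)
        # Stand-in for apply_possible_actions_mask: zero out disallowed actions.
        scores = torch.where(mask, scores, torch.zeros_like(scores))
    # Categorical normalizes the scores into probabilities.
    m = torch.distributions.Categorical(probs=scores)
    raw_action = m.sample()                      # shape: (1,), an action index
    action = F.one_hot(raw_action, num_actions)  # shape: (1, num_actions)
    log_prob = m.log_prob(raw_action).float()    # shape: (1,)
    return action, log_prob


action, log_prob = sample_masked_uniform(4, possible_actions_mask=[1, 0, 1, 0])
```

With two of the four actions allowed, each allowed action has probability 0.5 and a zero-score action is never drawn.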