Python postprocess_nstep_and_prio Examples

Programming language: Python

Namespace/Package: ray.rllib.algorithms.dqn.dqn_tf_policy

Method/Function: postprocess_nstep_and_prio

Examples at hotexamples.com: 2

Python postprocess_nstep_and_prio - 2 examples found. These are the top-rated real-world examples of ray.rllib.algorithms.dqn.dqn_tf_policy.postprocess_nstep_and_prio extracted from open source projects. You can rate the examples to help us improve their quality.

Example #1

File: sac_tf_policy.py Project: smorad/ray

def postprocess_trajectory(
    policy: Policy,
    sample_batch: SampleBatch,
    other_agent_batches: Optional[Dict[AgentID, SampleBatch]] = None,
    episode: Optional[Episode] = None,
) -> SampleBatch:
    """Postprocesses a trajectory and returns the processed trajectory.

    The trajectory contains only data from one episode and from one agent.
    - If `config.batch_mode=truncate_episodes` (default), sample_batch may
      contain a truncated (at-the-end) episode, in case the
      `config.rollout_fragment_length` was reached by the sampler.
    - If `config.batch_mode=complete_episodes`, sample_batch will contain
      exactly one episode (no matter how long).
    New columns can be added to sample_batch and existing ones may be altered.

    Args:
        policy (Policy): The Policy used to generate the trajectory
            (`sample_batch`)
        sample_batch (SampleBatch): The SampleBatch to postprocess.
        other_agent_batches (Optional[Dict[AgentID, SampleBatch]]): Optional
            dict of AgentIDs mapping to other agents' trajectory data (from
            the same episode). NOTE: The other agents use the same policy.
        episode (Optional[Episode]): Optional multi-agent episode object in
            which the agents operated.

    Returns:
        SampleBatch: The postprocessed, modified SampleBatch (or a new one).
    """
    return postprocess_nstep_and_prio(policy, sample_batch)
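
The body of postprocess_nstep_and_prio is not shown on this page. As a rough illustration of what the "nstep" part of the name usually refers to, the sketch below folds the next n rewards of a single-episode trajectory into each step using a discount factor. It is a minimal, library-independent sketch: the function name fold_nstep_rewards and its parameters are hypothetical and are not part of RLlib, and the real function may handle episode boundaries and other batch columns differently.

```python
from typing import List, Tuple


def fold_nstep_rewards(
    rewards: List[float],
    dones: List[bool],
    n_step: int,
    gamma: float,
) -> Tuple[List[float], List[int]]:
    """Compute n-step discounted rewards for one trajectory (hypothetical helper).

    For each start index t, sums gamma**k * rewards[t + k] for k < n_step,
    stopping early at the end of the trajectory or after a terminal step.
    Also returns the index of the last step that contributed to each sum,
    which is the step whose next observation an n-step target would use.
    """
    length = len(rewards)
    folded: List[float] = []
    last_index: List[int] = []
    for t in range(length):
        total = 0.0
        end = t
        for k in range(n_step):
            i = t + k
            if i >= length:
                break  # trajectory was truncated before n steps were available
            total += (gamma ** k) * rewards[i]
            end = i
            if dones[i]:
                break  # do not look past a terminal step
        folded.append(total)
        last_index.append(end)
    return folded, last_index


# Usage: four steps with reward 1.0 each, the episode ends on the last step.
folded, last_index = fold_nstep_rewards(
    rewards=[1.0, 1.0, 1.0, 1.0],
    dones=[False, False, False, True],
    n_step=3,
    gamma=0.5,
)
print(folded)      # [1.75, 1.75, 1.5, 1.0]
print(last_index)  # [2, 3, 3, 3]
```

Returning the index of the last contributing step, rather than only the summed reward, makes it straightforward to pick the matching next observation and done flag when building n-step targets.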

Example #2

File: ddpg_tf_policy.py Project: ray-project/ray

def postprocess_trajectory(
    self,
    sample_batch: SampleBatch,
    other_agent_batches: Optional[Dict[Any, SampleBatch]] = None,
    episode: Optional[Episode] = None,
) -> SampleBatch:
    return postprocess_nstep_and_prio(self, sample_batch, other_agent_batches, episode)
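
Neither example shows how the "prio" part of the name is computed. Assuming it refers to replay priorities in the sense of proportional prioritized experience replay, where a transition's priority is the magnitude of its TD error plus a small constant, the idea can be sketched as follows. The helper td_error_priorities and its arguments are hypothetical names chosen for illustration, not RLlib API, and the actual function may compute or store priorities differently.

```python
from typing import List


def td_error_priorities(
    q_values: List[float],
    targets: List[float],
    eps: float = 1e-6,
) -> List[float]:
    """Return |target - q| + eps for each transition (hypothetical helper).

    The eps term keeps every priority strictly positive, so transitions
    with zero TD error can still be sampled from a prioritized buffer.
    """
    if len(q_values) != len(targets):
        raise ValueError("q_values and targets must have the same length")
    return [abs(t - q) + eps for q, t in zip(q_values, targets)]


# Usage: two transitions with TD errors of 0.5 and -2.0.
priorities = td_error_priorities(
    q_values=[1.0, 2.0],
    targets=[1.5, 0.0],
    eps=0.5,
)
print(priorities)  # [1.0, 2.5]
```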