Python PPO.evaluate Examples

Programming language: Python

Namespace/Package: ray.rllib.algorithms.ppo

Class/Type: PPO

Method/Function: evaluate

Examples at hotexamples.com: 2

Python PPO.evaluate - 2 examples found. These are the top-rated real-world examples of ray.rllib.algorithms.ppo.PPO.evaluate extracted from open source projects. You can rate the examples to help us improve their quality.

Frequently used methods

PPO(12)

train(9)

restore(6)

compute_single_action(4)

get_weights(4)

save(3)

set_weights(3)

stop(3)

evaluate(2)

get_policy(2)

default_resource_request(1)

Example #1

    "framework": "tf",
    # Tweak the default model provided automatically by RLlib,
    # given the environment's observation- and action spaces.
    "model": {
        "fcnet_hiddens": [64, 64],
        "fcnet_activation": "relu",
    },
    # Set up a separate evaluation worker set for the
    # `algo.evaluate()` call after training (see below).
    "evaluation_num_workers": 1,
    # Only for evaluation runs, render the env.
    "evaluation_config": {
        "render_env": True,
    },
}

# Create our RLlib Trainer.
algo = PPO(config=config)

# Run it for n training iterations. A training iteration includes
# parallel sample collection by the environment workers as well as
# loss calculation on the collected batch and a model update.
for _ in range(3):
    print(algo.train())

# Evaluate the trained Trainer (and render each timestep to the shell's
# output).
algo.evaluate()
# __rllib-in-60s-end__

Example #2

    "framework": "tf",
    # Tweak the default model provided automatically by RLlib,
    # given the environment's observation- and action spaces.
    "model": {
        "fcnet_hiddens": [64, 64],
        "fcnet_activation": "relu",
    },
    # Set up a separate evaluation worker set for the
    # `trainer.evaluate()` call after training (see below).
    "evaluation_num_workers": 1,
    # Only for evaluation runs, render the env.
    "evaluation_config": {
        "render_env": True,
    },
}

# Create our RLlib Trainer.
trainer = PPO(config=config)

# Run it for n training iterations. A training iteration includes
# parallel sample collection by the environment workers as well as
# loss calculation on the collected batch and a model update.
for _ in range(3):
    print(trainer.train())

# Evaluate the trained Trainer (and render each timestep to the shell's
# output).
trainer.evaluate()
# __rllib-in-60s-end__
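
Both excerpts start partway through the config dict, so neither runs on its own. The sketch below shows one way the pieces might fit together. It assumes a Ray release in which `PPO` still accepts a legacy dict-style config. The helper names `build_eval_config` and `train_and_evaluate` are hypothetical, and the `"CartPole-v1"` environment is an assumption, since the excerpts do not show which environment they use.

```python
def build_eval_config(env_name="CartPole-v1", framework="tf"):
    """Return a dict-style config with a separate evaluation worker.

    `env_name` is an assumption: the excerpts above do not show
    the environment they were written for.
    """
    return {
        "env": env_name,
        "framework": framework,
        # Same model and evaluation settings as in the excerpts.
        "model": {
            "fcnet_hiddens": [64, 64],
            "fcnet_activation": "relu",
        },
        "evaluation_num_workers": 1,
        "evaluation_config": {
            "render_env": True,
        },
    }


def train_and_evaluate(config, iterations=3):
    """Train PPO for a few iterations, then run one evaluation pass."""
    # Imported lazily so the config helper can be used without Ray installed.
    from ray.rllib.algorithms.ppo import PPO

    algo = PPO(config=config)
    try:
        for _ in range(iterations):
            algo.train()
        # Uses the evaluation workers configured above.
        return algo.evaluate()
    finally:
        # Release the workers even if training fails.
        algo.stop()
```

Calling `train_and_evaluate(build_eval_config())` would then mirror the train-then-evaluate flow of the two examples, with `stop()` added to shut down the workers afterwards.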