"""Test for ExperienceBatcher.experiences_to_batches."""

import numpy as np
import pytest
from unittest.mock import Mock, patch

# Import paths are assumptions about the project layout; adjust as needed.
from experience_collector import Experience
from experience_batcher import ExperienceBatcher


def test_experiences_to_batches(target_computer_class_mock):
  # The mocked target computer returns fixed target values for both experiences.
  compute = target_computer_class_mock.return_value.compute
  compute.return_value = np.array([42, 43])

  # Three distinct 4x4 game states.
  state1 = np.arange(16).reshape((4, 4)) + 1
  state2 = np.arange(16).reshape((4, 4)) + 2
  state3 = np.arange(16).reshape((4, 4)) + 3
  # Field order inferred from the assertions below: state, action, reward,
  # next_state, game_over, not_available, next_state_available_actions.
  experiences = [Experience(state1, 1, 2, state2, False, False, [3]),
                 Experience(state2, 3, 4, state3, True, False, [])]

  # Stub network inference: a single call returning per-action values for each
  # of the two next states (shape assumed to be [batch, 4 actions]).
  run_inference = Mock(side_effect=[np.array([[0, 0, 0, -0.5],
                                              [0, 0, 0, 0]])])

  # 1/15 is the state normalization factor passed to the batcher.
  batcher = ExperienceBatcher(None, run_inference, None, 1.0 / 15.0)

  state_batch, targets, actions = batcher.experiences_to_batches(experiences)

  # Expected positional arguments forwarded to the target computer's compute().
  reward_batch = np.array([2, 4])
  bad_action_batch = np.array([False, True])
  next_state_batch = np.array([state2.flatten(), state3.flatten()]) / 15.0
  available_actions_batch = np.array([[False, False, False, True],
                                      [False, False, False, False]])

  # compute() is called exactly once; verify its four positional arguments.
  assert (compute.call_args_list[0][0][0] == reward_batch).all()
  assert (compute.call_args_list[0][0][1] == bad_action_batch).all()
  assert (compute.call_args_list[0][0][2] == next_state_batch).all()
  assert (compute.call_args_list[0][0][3] == available_actions_batch).all()

  # States are flattened and scaled by the 1/15 normalization factor.
  expected_state_batch = np.array([state1.flatten(), state2.flatten()]) / 15.0

  assert (state_batch == expected_state_batch).all()
  # Targets come straight from the mocked compute(); actions from the experiences.
  assert (targets == np.array([42, 43])).all()
  assert (actions == np.array([1, 3])).all()
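

# A minimal sketch of the fixture this test expects, assuming ExperienceBatcher
# instantiates a target-computer class that can be patched at this path. The
# module path and class name ('experience_batcher.TargetComputer') are
# hypothetical; point the patch at the real location in your project.
@pytest.fixture
def target_computer_class_mock():
  with patch('experience_batcher.TargetComputer') as class_mock:
    yield class_mock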