Python RandomAlfEnvironment.reset 예제들

프로그래밍 언어: Python

네임스페이스/패키지 이름: alf.environments.random_alf_environment

클래스/타입: RandomAlfEnvironment

메소드/함수: reset

hotexamples.com에서의 예제들: 5

Python RandomAlfEnvironment.reset - 5개의 예제가 발견되었습니다. 이것들은 오픈소스 프로젝트에서 추출된 Python의 alf.environments.random_alf_environment.RandomAlfEnvironment.reset에 대한 실세계 최고 등급의 예제들입니다. 예제들을 평가하여 예제의 품질 향상에 도움을 줄 수 있습니다.

자주 사용되는 메소드들

보기 숨기기

RandomAlfEnvironment(9)

step(8)

reset(5)

_done(3)

current_time_step(1)

render(1)

예제 #1

파일 보기

파일: alf_environment_test.py 프로젝트: zhuboli/alf

    def testStepSavesCurrentTimeStep(self):
        obs_spec = BoundedTensorSpec((1, ), torch.int32)
        action_spec = BoundedTensorSpec((1, ), torch.int64)

        random_env = RandomAlfEnvironment(observation_spec=obs_spec,
                                          action_spec=action_spec)

        random_env.reset()
        time_step = random_env.step(action=torch.ones((1, )))
        current_time_step = random_env.current_time_step()
        nest.map_structure(self.assertEqual, time_step, current_time_step)

예제 #2

파일 보기

 def testRewardCheckerBatchSizeOne(self):
     # Ensure batch size 1 with scalar reward works
     obs_spec = BoundedTensorSpec((2, 3), torch.int32, -10, 10)
     action_spec = BoundedTensorSpec((1, ), torch.int64)
     env = RandomAlfEnvironment(obs_spec,
                                action_spec,
                                reward_fn=lambda *_: np.array([1.0]),
                                batch_size=1)
     env._done = False
     env.reset()
     action = torch.tensor([0], dtype=torch.int64)
     time_step = env.step(action)
     self.assertEqual(time_step.reward, 1.0)

예제 #3

파일 보기

 def testCustomRewardFn(self):
     obs_spec = BoundedTensorSpec((2, 3), torch.int32, -10, 10)
     action_spec = BoundedTensorSpec((1, ), torch.int64)
     batch_size = 3
     env = RandomAlfEnvironment(obs_spec,
                                action_spec,
                                reward_fn=lambda *_: np.ones(batch_size),
                                batch_size=batch_size)
     env._done = False
     env.reset()
     action = torch.ones(batch_size)
     time_step = env.step(action)
     self.assertSequenceAlmostEqual([1.0] * 3, time_step.reward)

예제 #4

파일 보기

 def testRewardCheckerSizeMismatch(self):
     # Ensure custom scalar reward with batch_size greater than 1 raises
     # ValueError
     obs_spec = BoundedTensorSpec((2, 3), torch.int32, -10, 10)
     action_spec = BoundedTensorSpec((1, ), torch.int64)
     env = RandomAlfEnvironment(obs_spec,
                                action_spec,
                                reward_fn=lambda *_: np.array([1.0]),
                                batch_size=5)
     env.reset()
     env._done = False
     action = torch.tensor(0, dtype=torch.int64)
     with self.assertRaises(ValueError):
         env.step(action)

예제 #5

파일 보기

    def testRendersImage(self):
        action_spec = BoundedTensorSpec((1, ), torch.int64, -10, 10)
        observation_spec = BoundedTensorSpec((1, ), torch.int32, -10, 10)
        env = RandomAlfEnvironment(observation_spec,
                                   action_spec,
                                   render_size=(4, 4, 3))

        env.reset()
        img = env.render()

        self.assertTrue(np.all(img < 256))
        self.assertTrue(np.all(img >= 0))
        self.assertEqual((4, 4, 3), img.shape)
        self.assertEqual(np.uint8, img.dtype)