Python ConstantReward.get_reward Examples

Programming language: Python

Namespace/package name: edge.reward

Class/type: ConstantReward

Method/function: get_reward

Examples at hotexamples.com: 2

Python ConstantReward.get_reward: 2 examples found. These are top-rated real-world Python examples of edge.reward.ConstantReward.get_reward, extracted from open source projects.

Example #1

File: reward_test.py Project: sheim/edge

    def test_constant(self):
        space = StateActionSpace(*Box(0, 1, (10, 10)).sets)

        reward = ConstantReward(space, 10)

        # Ten sampled transitions, each expected to earn the constant 10.
        total = 0
        for _ in range(10):
            s, a = space.get_tuple(space.sample())
            total += reward.get_reward(s, a, space.state_space.sample(), False)
        self.assertEqual(total, 100)
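
The arithmetic behind this assertion can be reproduced without the edge package. The sketch below is a hypothetical stand-in (`ToyConstantReward` is not part of edge.reward), and it assumes only what the test shows: `get_reward` is called with a state, an action, a new state, and a boolean flag, and returns the same value each time.

```python
class ToyConstantReward:
    """Hypothetical stand-in; NOT the edge.reward.ConstantReward implementation."""

    def __init__(self, constant):
        self.constant = constant

    def get_reward(self, state, action, new_state, failed):
        # All four arguments are ignored: every transition earns the same value.
        return self.constant


reward = ToyConstantReward(10)

# Ten transitions at 10 each; the inputs do not matter for a constant reward.
total = sum(reward.get_reward(None, None, None, False) for _ in range(10))
print(total)
```

Because the reward does not depend on its inputs, the random sampling in the test cannot change the expected total.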

Example #2

File: reward_test.py Project: sheim/edge

    def test_unrewarded(self):
        space = StateActionSpace(*Box(0, 1, (10, 10)).sets)
        # `rewarded` should be a Subspace, but this is not implemented yet
        rewarded = StateActionSpace(*Box([0, 0], [0.5, 0.5], (10, 10)).sets)
        unrewarded = StateActionSpace(*Box([0.5, 0.5], [1, 1], (10, 10)).sets)

        reward = ConstantReward(space, 10, unrewarded_set=unrewarded)

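        total = 0
        for t in range(10):
            s, a = space.get_tuple(space.sample())
            # Even steps draw the new state from the rewarded region,
            # odd steps from the unrewarded one.
            sampled_space = (rewarded.state_space if t % 2 == 0
                             else unrewarded.state_space)

            total += reward.get_reward(s, a, sampled_space.sample(), False)
        # Five rewarded steps at 10 each.
        self.assertEqual(total, 50)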
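
The expected total of 50 follows from half of the ten steps landing in the unrewarded region. A minimal, self-contained sketch of that logic is shown below. `ToyRegionReward` and its `is_unrewarded` predicate are hypothetical names introduced for illustration, and the one-dimensional states are an assumption made to keep the example deterministic; the real `unrewarded_set` argument in edge takes a space object, not a function.

```python
class ToyRegionReward:
    """Hypothetical stand-in; NOT the edge.reward.ConstantReward implementation."""

    def __init__(self, constant, is_unrewarded=None):
        self.constant = constant
        # Assumed interface: a callable mapping a new state to True/False.
        self.is_unrewarded = is_unrewarded

    def get_reward(self, state, action, new_state, failed):
        # Transitions ending in the unrewarded region earn nothing.
        if self.is_unrewarded is not None and self.is_unrewarded(new_state):
            return 0
        return self.constant


# Assumed 1-D state in [0, 1]: the upper half is the unrewarded region.
reward = ToyRegionReward(10, is_unrewarded=lambda s: s > 0.5)

total = 0
for t in range(10):
    # Alternate between a state in the rewarded half and one in the unrewarded half.
    new_state = 0.25 if t % 2 == 0 else 0.75
    total += reward.get_reward(None, None, new_state, False)
print(total)
```

Fixed states replace the random sampling used in the test, so the alternation between regions is explicit.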