Python Feedback Examples

Programming language: Python

Namespace/package name: banditpylib.data_pb2

Class/type: Feedback

Examples at hotexamples.com: 11

Python Feedback - 11 examples found. These are the top-rated real-world Python examples of banditpylib.data_pb2.Feedback, extracted from open source projects.

Example #1

File: ucb_test.py Project: sheelfshah/banditpylib

    def test_simple_run(self):
        arm_num = 5
        horizon = 10
        learner = UCB(arm_num=arm_num)
        learner.reset()
        mock_ucb = np.array([1.2, 1, 1, 1, 1])
        # pylint: disable=protected-access
        learner._UCB__UCB = MagicMock(return_value=mock_ucb)

        # During the initial time steps, each arm is pulled once
        for time in range(1, arm_num + 1):
            assert learner.actions(
                Context()).SerializeToString() == text_format.Parse(
                    """
        arm_pulls <
          arm <
            id: {arm_id}
          >
          times: 1
        >
        """.format(arm_id=time - 1), Actions()).SerializeToString()
            learner.update(
                text_format.Parse(
                    """
        arm_feedbacks <
          arm <
            id: {arm_id}
          >
          rewards: 0
        >
        """.format(arm_id=time - 1), Feedback()))
        # For the remaining time steps, arm 0 is always chosen
        for _ in range(arm_num + 1, horizon + 1):
            assert learner.actions(
                Context()).SerializeToString() == text_format.Parse(
                    """
        arm_pulls <
          arm <
            id: 0
          >
          times: 1
        >
        """, Actions()).SerializeToString()
            learner.update(
                text_format.Parse(
                    """
        arm_feedbacks <
          arm <
            id: 0
          >
          rewards: 0
        >
        """, Feedback()))

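The two phases checked in the test above (one pull per arm, then the arm with the largest index) can be sketched without banditpylib or protobuf. This is only an illustration: `ucb_indices`, `choose_arm`, and `run` are hypothetical names, plain lists stand in for the `Actions` and `Feedback` messages, and the index uses the textbook UCB1 bonus, which may differ from what banditpylib's `UCB` computes.

```python
import math


def ucb_indices(means, pulls, t):
    """Textbook UCB1 index per arm (an assumption, not banditpylib's code)."""
    return [m + math.sqrt(2.0 * math.log(t) / n) for m, n in zip(means, pulls)]


def choose_arm(means, pulls, t, index_fn=ucb_indices):
    """Pick the first unpulled arm; otherwise the arm with the largest index."""
    for arm_id, n in enumerate(pulls):
        if n == 0:
            return arm_id
    indices = index_fn(means, pulls, t)
    # Ties are broken in favor of the smallest arm id.
    return max(range(len(indices)), key=indices.__getitem__)


def run(arm_num, horizon, index_fn=ucb_indices):
    """Play `horizon` rounds with every reward fixed at 0, as in the test."""
    sums = [0.0] * arm_num
    pulls = [0] * arm_num
    chosen = []
    for t in range(1, horizon + 1):
        means = [s / n if n else 0.0 for s, n in zip(sums, pulls)]
        arm_id = choose_arm(means, pulls, t, index_fn)
        chosen.append(arm_id)
        pulls[arm_id] += 1
        sums[arm_id] += 0.0  # the reward is always 0 in this sketch
    return chosen
```

Passing a fixed index function plays the role of the `MagicMock` in the test: `run(5, 10, lambda means, pulls, t: [1.2, 1, 1, 1, 1])` pulls arms 0 through 4 once and then stays on arm 0.
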
Example #2

    def test_simple_run(self):
        arm_num = 5
        horizon = 10
        learner = Uniform(arm_num=arm_num)
        learner.reset()

        for time in range(1, horizon + 1):
            assert learner.actions(
                Context()).SerializeToString() == text_format.Parse(
                    """
        arm_pulls <
          arm <
            id: {arm_id}
          >
          times: 1
        >
        """.format(arm_id=(time - 1) % arm_num),
                    Actions()).SerializeToString()
            learner.update(
                text_format.Parse(
                    """
        arm_feedbacks <
          arm <
            id: 0
          >
          rewards: 0
        >
        """, Feedback()))

Example #3

    def feed(self, actions: Actions) -> Feedback:
        feedback = Feedback()
        for arm_pull in actions.arm_pulls:
            arm_feedback = self._take_action(arm_pull=arm_pull)
            # Skip arms that produced no rewards
            if arm_feedback.rewards:
                feedback.arm_feedbacks.append(arm_feedback)
        return feedback

Example #4

    def feed(self, actions: Actions) -> Feedback:
        feedback = Feedback()
        for arm_pull in actions.arm_pulls:
            # Ignore pulls whose count is not positive
            if arm_pull.times > 0:
                arm_feedback = self._take_action(arm_pull=arm_pull)
                feedback.arm_feedbacks.append(arm_feedback)
        return feedback

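The `feed` methods above depend on a `_take_action` helper that the snippets do not show. A minimal sketch of how such an environment might fit together, assuming Bernoulli arms: the dataclasses `ArmPull`, `ArmFeedback`, and `Feedback` here are plain-Python stand-ins for the protobuf messages in `banditpylib.data_pb2`, and `BernoulliEnv` is a hypothetical class, not part of the library.

```python
import random
from dataclasses import dataclass, field
from typing import List


@dataclass
class ArmPull:
    arm_id: int
    times: int


@dataclass
class ArmFeedback:
    arm_id: int
    rewards: List[float] = field(default_factory=list)


@dataclass
class Feedback:
    arm_feedbacks: List[ArmFeedback] = field(default_factory=list)


class BernoulliEnv:
    """Toy environment: arm i pays 1 with probability means[i], else 0."""

    def __init__(self, means, seed=0):
        self.means = means
        self.rng = random.Random(seed)

    def _take_action(self, arm_pull):
        # Draw one 0/1 reward per requested pull of this arm.
        mean = self.means[arm_pull.arm_id]
        rewards = [float(self.rng.random() < mean)
                   for _ in range(arm_pull.times)]
        return ArmFeedback(arm_id=arm_pull.arm_id, rewards=rewards)

    def feed(self, arm_pulls):
        feedback = Feedback()
        for arm_pull in arm_pulls:
            if arm_pull.times > 0:  # skip pulls whose count is not positive
                feedback.arm_feedbacks.append(self._take_action(arm_pull))
        return feedback
```

A pull with `times=0` contributes nothing to the result, so `BernoulliEnv([0.0, 1.0]).feed([ArmPull(0, 3), ArmPull(1, 2), ArmPull(0, 0)])` returns feedback for two arms only.
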
Example #5

File: eps_greedy_test.py Project: sheelfshah/banditpylib

    def test_simple_run(self):
        means = [0, 0.5, 0.7, 1]
        arms = [BernoulliArm(mean) for mean in means]
        learner = EpsGreedy(arm_num=len(arms))
        learner.reset()

        # Pull each arm once during the initial steps
        for time in range(1, len(arms) + 1):
            assert learner.actions(
                Context()).SerializeToString() == text_format.Parse(
                    """
        arm_pulls <
          arm <
            id: {arm_id}
          >
          times: 1
        >
        """.format(arm_id=time - 1), Actions()).SerializeToString()
            learner.update(
                text_format.Parse(
                    """
        arm_feedbacks <
          arm <
            id: {arm_id}
          >
          rewards: 0
        >
        """.format(arm_id=time - 1), Feedback()))

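The test above only exercises the initial phase in which each arm is pulled once. What happens afterwards is not shown in the source; a common epsilon-greedy rule, given here purely as an assumption, explores a uniformly random arm with probability `eps` and otherwise exploits the best empirical mean. The function name `eps_greedy_choice` and the fixed `eps` are hypothetical, and banditpylib's `EpsGreedy` may use a different schedule.

```python
import random


def eps_greedy_choice(means, pulls, eps, rng):
    """Return the next arm under a simple epsilon-greedy rule (a sketch)."""
    # Initial phase: make sure every arm has been pulled once.
    for arm_id, n in enumerate(pulls):
        if n == 0:
            return arm_id
    # Explore with probability eps.
    if rng.random() < eps:
        return rng.randrange(len(means))
    # Otherwise exploit the arm with the highest empirical mean.
    return max(range(len(means)), key=means.__getitem__)
```

With `eps=0.0` the rule is purely greedy once every arm has been tried, which makes it easy to check by hand.
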
Example #6

File: sh_test.py Project: sheelfshah/banditpylib

    def test_simple_run(self):
        arm_num = 5
        budget = 20
        learner = SH(arm_num=arm_num, budget=budget)
        learner.reset()

        while True:
            actions = learner.actions(Context())
            if not actions.arm_pulls:
                break

            feedback = Feedback()
            for arm_pull in actions.arm_pulls:
                arm_feedback = feedback.arm_feedbacks.add()
                arm_feedback.arm.id = arm_pull.arm.id
                arm_feedback.rewards.extend(list(np.zeros(arm_pull.times)))
            learner.update(feedback)
        assert learner.best_arm in list(range(arm_num))
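
The loop above keeps asking the learner for actions until it returns none, then reads `best_arm`. Assuming `SH` stands for sequential halving (the source does not say), the elimination schedule behind such a learner might look like the following sketch. `sequential_halving` and `pull` are hypothetical names, and the per-round budget split is one common choice, not necessarily banditpylib's.

```python
import math


def sequential_halving(arm_num, budget, pull):
    """Return the arm that survives repeated halving of the candidate set.

    `pull(arm_id, times)` must return a list of `times` rewards.
    """
    active = list(range(arm_num))
    rounds = max(1, math.ceil(math.log2(arm_num)))
    while len(active) > 1:
        # Split the budget evenly over rounds and surviving arms.
        # At least one pull per arm, so tiny budgets can be exceeded.
        times = max(1, budget // (len(active) * rounds))
        means = {a: sum(pull(a, times)) / times for a in active}
        # Keep the better half (rounded up); the sort is stable on ties.
        active.sort(key=lambda a: means[a], reverse=True)
        active = active[:math.ceil(len(active) / 2)]
    return active[0]
```

With `arm_num=5` and `budget=20` as in the test, this schedule runs three rounds of 1, 2, and 3 pulls per surviving arm, for 17 pulls in total.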

Example #7

File: exp_gap_test.py Project: sheelfshah/banditpylib

    def test_simple_run(self):
        arm_num = 3
        confidence = 0.95
        learner = ExpGap(arm_num=arm_num, confidence=confidence)
        learner.reset()

        while True:
            actions = learner.actions(Context())
            if not actions.arm_pulls:
                break

            feedback = Feedback()
            for arm_pull in actions.arm_pulls:
                arm_feedback = feedback.arm_feedbacks.add()
                arm_feedback.arm.id = arm_pull.arm.id
                arm_feedback.rewards.extend(
                    list(
                        np.random.normal(arm_pull.arm.id / arm_num, 1,
                                         arm_pull.times)))
            learner.update(feedback)
        assert learner.best_arm in list(range(arm_num))

Example #8

File: explore_then_commit_test.py Project: sheelfshah/banditpylib

    def test_simple_run(self):
        arm_num = 5
        horizon = 10
        learner = ExploreThenCommit(arm_num=arm_num, T_prime=6)
        learner.reset()

        for _ in range(1, horizon + 1):
            actions = learner.actions(Context())
            assert len(actions.arm_pulls) == 1
            arm_pull = actions.arm_pulls[0]
            arm_id = arm_pull.arm.id
            assert arm_pull.times == 1
            learner.update(
                text_format.Parse(
                    """
        arm_feedbacks <
          arm <
            id: {arm_id}
          >
          rewards: 0
        >
        """.format(arm_id=arm_id), Feedback()))

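The test above checks only that one arm is pulled once per step. As a sketch of what an explore-then-commit learner typically does, assuming `T_prime` is the length of the exploration phase (the source does not define it): arms are tried in round-robin order for the first `t_prime` steps, then the arm with the best empirical mean is played for the rest of the horizon. `explore_then_commit` and `reward_fn` are hypothetical names.

```python
def explore_then_commit(arm_num, t_prime, horizon, reward_fn):
    """Return the sequence of arms played; `reward_fn(arm_id)` gives a reward."""
    sums = [0.0] * arm_num
    pulls = [0] * arm_num
    chosen = []
    best = None
    for t in range(1, horizon + 1):
        if t <= t_prime:
            # Exploration phase: round-robin over the arms.
            arm_id = (t - 1) % arm_num
        else:
            if best is None:
                # Commit once; arms never pulled rank last.
                means = [s / n if n else float("-inf")
                         for s, n in zip(sums, pulls)]
                best = max(range(arm_num), key=means.__getitem__)
            arm_id = best
        sums[arm_id] += reward_fn(arm_id)
        pulls[arm_id] += 1
        chosen.append(arm_id)
    return chosen
```

For example, `explore_then_commit(5, 6, 10, lambda a: a / 5)` explores arms 0, 1, 2, 3, 4, 0 and then commits to arm 4 for the remaining four steps.
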
Example #9

    def test_simple_run(self):
        horizon = 10
        features = [
            np.array([1, 0]),
            np.array([1, 0]),
            np.array([1, 0]),
            np.array([1, 0]),
            np.array([0, 1])
        ]
        learner = LinUCB(features, 0.1, 1e-3)
        learner.reset()
        mock_ucb = np.array([1.2, 1, 1, 1, 1])
        # pylint: disable=protected-access
        learner._LinUCB__LinUCB = MagicMock(return_value=mock_ucb)

        # Arm 0 is always picked because the mocked index is largest for arm 0
        # (not the most efficient test)
        for _ in range(1, horizon + 1):
            assert learner.actions(
                Context()).SerializeToString() == text_format.Parse(
                    """
            arm_pulls <
              arm <
                id: 0
              >
              times: 1
            >
            """, Actions()).SerializeToString()
            learner.update(
                text_format.Parse(
                    """
            arm_feedbacks <
              arm <
                id: 0
              >
              rewards: 0
            >
            """, Feedback()))

Example #10

File: apt_test.py Project: sheelfshah/banditpylib

  def test_actions(self):
    # Test actions are in the right range
    arm_num = 10
    budget = 15
    apt = APT(arm_num=arm_num, theta=0.5, eps=0)
    apt.reset()
    for _ in range(budget):
      actions = apt.actions(Context())
      assert len(actions.arm_pulls) == 1

      arm_id = actions.arm_pulls[0].arm.id
      assert 0 <= arm_id < arm_num

      apt.update(
          text_format.Parse(
              """
        arm_feedbacks <
          arm <
            id: {arm_id}
          >
          rewards: 0
        >
        """.format(arm_id=arm_id), Feedback()))

Example #11

File: lilucb_heur_collaborative_test.py Project: sheelfshah/banditpylib

    def test_simple_run(self):
        arm_num = 3
        confidence = 0.95
        learner = CentralizedLilUCBHeuristic(arm_num=arm_num,
                                             confidence=confidence,
                                             assigned_arms=np.arange(arm_num))
        learner.reset()

        while True:
            actions = learner.actions()
            if not actions.arm_pulls:
                break

            feedback = Feedback()
            for arm_pull in actions.arm_pulls:
                arm_feedback = feedback.arm_feedbacks.add()
                arm_feedback.arm.id = arm_pull.arm.id
                arm_feedback.rewards.extend(
                    list(
                        np.random.normal(arm_pull.arm.id / arm_num, 1,
                                         arm_pull.times)))
            learner.update(feedback)

        assert learner.best_arm in list(range(arm_num))