Python choose_action_descrete Examples

Programming language: Python

Namespace/package name: relaax.common.algorithms.lib.utils

Method/function: choose_action_descrete

Examples at hotexamples.com: 3

Python choose_action_descrete - 3 examples found. These are top-rated real-world Python examples of relaax.common.algorithms.lib.utils.choose_action_descrete, extracted from open source projects.

Example #1

File: agent.py Project: zhly0/relaax

    def get_action_and_value_from_network(self):
        if da3c_config.config.use_lstm:
            action, value, lstm_state = self.session.op_get_action_value_and_lstm_state(
                state=[self.observation.queue],
                lstm_state=self.lstm_state,
                lstm_step=[1])
            condition = self.experience is not None and (
                len(self.experience) == da3c_config.config.batch_size
                or self.terminal)
            if not condition:
                self.lstm_state = lstm_state
        else:
            action, value = self.session.op_get_action_and_value(
                state=[self.observation.queue])

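        value, = value
        if len(action) == 1:
            # A single output is unpacked as the discrete action probabilities.
            if M:
                self.metrics.histogram('action', action)
            self.last_probs, = action
            return utils.choose_action_descrete(self.last_probs), value
        # Otherwise the output is unpacked as (mu, sigma2) for a continuous action.
        mu, sigma2 = action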
        self.last_probs = mu
        if M:
            self.metrics.histogram('mu', mu)
            self.metrics.histogram('sigma2', sigma2)
        return utils.choose_action_continuous(
            mu, sigma2, da3c_config.config.output.action_low,
            da3c_config.config.output.action_high), value
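
Example #1 falls through to utils.choose_action_continuous when the network returns a (mu, sigma2) pair. The listing does not show that function, so the following is only a minimal sketch of one common way to draw such an action. It assumes that sigma2 is a variance, that the action is a scalar, and that out-of-range samples are clipped to the bounds. The name sample_continuous_action and its rng parameter are hypothetical and not part of relaax; the real implementation may differ. It uses only the standard library so that it runs without NumPy.

```python
import math
import random


def sample_continuous_action(mu, sigma2, action_low, action_high, rng=random):
    """Hypothetical stand-in, NOT relaax's choose_action_continuous."""
    # Assumption: sigma2 is a variance, so the standard deviation is its square root.
    sigma = math.sqrt(max(sigma2, 0.0))
    # Draw one sample from a normal distribution centred on mu.
    sample = rng.gauss(mu, sigma)
    # Assumption: samples outside [action_low, action_high] are clipped to the bounds.
    return min(max(sample, action_low), action_high)


# Usage with a seeded generator so runs are repeatable.
rng = random.Random(0)
action = sample_continuous_action(0.2, 0.04, -1.0, 1.0, rng=rng)
print(action)
```

With sigma2 set to 0 the sketch returns mu itself (after clipping), which makes the clipping step easy to check in isolation.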

Example #2

    def action_from_policy(self, state):
        assert state is not None
        state = np.asarray(state)
        state = np.reshape(state, (1, ) + state.shape)
        probabilities, = self.session.op_get_action(state=state)
        return utils.choose_action_descrete(probabilities, self.exploit)

Example #3

File: model_api.py Project: deeplearninc/relaax

    def action_from_policy(self, state):
        probabilities, = self.session.op_get_action(state=[state])
        return utils.choose_action_descrete(probabilities, self.exploit)
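
The examples call choose_action_descrete with a probability vector and, in Examples #2 and #3, a second argument named exploit. The function body is not part of this listing, so the sketch below only illustrates what a function with that call shape might do. It assumes that exploit selects the most probable action and that the action is otherwise sampled in proportion to the probabilities. The name sample_discrete_action and its rng parameter are hypothetical, and relaax's actual implementation may differ. Only the standard library is used.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def sample_discrete_action(probabilities, exploit=False, rng=random):
    """Hypothetical stand-in, NOT relaax's choose_action_descrete."""
    probabilities = list(probabilities)
    if not probabilities:
        raise ValueError("probabilities must not be empty")
    if exploit:
        # Assumption: exploit means greedy selection (first index on ties).
        return max(range(len(probabilities)), key=probabilities.__getitem__)
    # Inverse-CDF sampling: draw a point in [0, total) and find its bucket.
    cumulative = list(accumulate(probabilities))
    point = rng.random() * cumulative[-1]
    # The clamp guards against floating-point edge cases at the upper end.
    return min(bisect_right(cumulative, point), len(probabilities) - 1)


# Usage with a seeded generator so runs are repeatable.
rng = random.Random(0)
probs = [0.1, 0.7, 0.2]
print(sample_discrete_action(probs, exploit=True))  # 1 (index of the largest probability)
print(sample_discrete_action(probs, rng=rng))       # a stochastic pick from 0, 1, 2
```

Scaling the random point by the final cumulative sum means the sketch still works when the probabilities do not sum exactly to 1, which is common with floating-point network outputs.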