def update(self, transitionsBatch):
    """Update the Q-values from the given batch of transitions.

    :param transitionsBatch: iterable of tuples
        ``(qState, action, reward, nextQState, isTerminal, nextStateLegalActions)``
        — note the order matches the loop unpacking below.
    :return: the loss value reported by ``train_on_batch``.
    """
    trainingBatchQStates = []
    trainingBatchTargetQValues = []

    for aQState, anAction, aReward, aNextQState, isTerminal, nextStateLegalActions in transitionsBatch:
        # Current Q-value estimates for this state; copy so only the taken
        # action's entry is overwritten with the TD target.
        actionsQValues = self.model.model.predict(np.array([aQState]))[0]
        targetQValues = actionsQValues.copy()

        if isTerminal:
            # Terminal state: no future value to bootstrap from.
            updatedQValueForAction = aReward
        else:
            nextActionsQValues = self.model.model.predict(np.array([aNextQState]))[0]
            nextStateLegalActionsIndices = [
                Directions.getIndex(action) for action in nextStateLegalActions
            ]
            # Exclude action index 4 (presumably STOP — confirm against
            # Directions.getIndex); ignore if it is not among the legal actions.
            try:
                nextStateLegalActionsIndices.remove(4)
            except ValueError:
                pass
            nextStateLegalActionsQValues = np.array(
                nextActionsQValues)[nextStateLegalActionsIndices]
            maxNextActionQValue = max(nextStateLegalActionsQValues)
            # Standard Q-learning target: r + gamma * max_a' Q(s', a').
            updatedQValueForAction = (
                aReward + self.trainingRoom.discount * maxNextActionQValue)

        targetQValues[Directions.getIndex(anAction)] = updatedQValueForAction
        trainingBatchQStates.append(aQState)
        trainingBatchTargetQValues.append(targetQValues)

    return self.model.model.train_on_batch(
        x=np.array(trainingBatchQStates),
        y=np.array(trainingBatchTargetQValues))
def getAction(self, rawState, epsilon):
    """Epsilon-greedy action selection.

    With probability ``epsilon`` return a uniformly random legal action,
    otherwise the legal action with the highest Q-value. STOP is never
    considered.

    :param rawState: game state providing ``getLegalActions()``.
    :param epsilon: exploration probability in [0, 1].
    :return: a legal action, or ``None`` if no legal action exists on the
        greedy branch (mirrors the original fall-through behavior).
    """
    legalActions = rawState.getLegalActions()
    # Guarded removal: the original unconditional remove() raised
    # ValueError whenever STOP was not among the legal actions.
    if Directions.STOP in legalActions:
        legalActions.remove(Directions.STOP)

    if util.flipCoin(epsilon):
        # Explore.
        return random.choice(legalActions)

    if not legalActions:
        return None

    # Exploit: every candidate is already legal, so the sort-and-scan of the
    # original collapses to a single max() over the Q-values. Ties resolve to
    # the first maximal action, exactly as the original stable sort did.
    return max(legalActions,
               key=lambda action: self.getQValue(rawState, action))
def getGhostDirections(state):
    """Encode each ghost's heading as its Directions index scaled by 1/4.

    :param state: game state providing ``getGhostStates()``.
    :return: 1-D numpy array of direction indices divided by 4.0.
    """
    indices = []
    for ghostState in state.getGhostStates():
        indices.append(Directions.getIndex(ghostState.getDirection()))
    return np.array(indices) / 4.0
def remember(self, state, action, reward, nextState):
    """Store one transition in replay memory.

    The entry is keyed by the state's hash concatenated with the action's
    direction index, so repeating the same (state, action) pair overwrites
    the previous transition.
    """
    from game import Directions
    memoryKey = f"{state.__hash__()}{Directions.getIndex(action)}"
    self.replayMemory[memoryKey] = (state, action, reward, nextState)