Python Dqn.getBatch Examples

Programming language: Python

Namespace/package name: dqn

Class/type: Dqn

Method/function: getBatch

Examples at hotexamples.com: 2

Python Dqn.getBatch - 2 examples found. These are the top-rated real-world Python examples of dqn.Dqn.getBatch extracted from open source projects. You can rate the examples to help improve their quality.

Frequently used methods

Dqn(12)

learn(3)

execute_action(2)

getBatch(2)

get_state(2)

to(2)

__init__(1)

addToMemory(1)

done_update(1)

nn(1)

open_orders(1)

remember(1)

Example #1

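        if np.random.rand() <= epsilon:  # assumed: the excerpt starts mid-block, so this check is not in the source
            action = np.random.randint(0, 4)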
        else:
            qvalues = model.predict(currentState)[0]
            action = np.argmax(qvalues)

        # Updating the environment
        frame, reward, gameOver = env.step(action)

        # Reshape the 2D frame to 4D so it can be appended to nextState
        frame = np.reshape(frame, (1, env.nColumns, env.nRows, 1))
        nextState = np.append(nextState, frame, axis=3)
        nextState = np.delete(nextState, 0, axis=3)

        # Remembering new experience and training the AI
        DQN.remember([currentState, action, reward, nextState], gameOver)
        inputs, targets = DQN.getBatch(model, batchSize)
        model.train_on_batch(inputs, targets)

        # Updating the score and current state
        if env.collected:
            nCollected += 1

        currentState = nextState

    # Updating epsilon and saving the model
    epsilon -= epsilonDecayRate
    epsilon = max(epsilon, minLastEpsilon)

    if nCollected > maxNCollected and nCollected > 2:
        model.save(filePathToSave)
        maxNCollected = nCollected
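
The excerpt above calls remember and getBatch without showing their bodies. The following is a minimal sketch of how an experience-replay memory with that interface might work, not the actual dqn.Dqn implementation. The names ReplayMemory, LinearModel, maxMemory and discount are hypothetical, and the toy model only stands in for a network that exposes predict. Targets are assumed to follow the usual Q-learning rule: the reward alone for a terminal transition, otherwise the reward plus the discounted maximum Q-value of the next state.

```python
import numpy as np


class ReplayMemory:
    """Hypothetical stand-in for dqn.Dqn; method names mirror the excerpt."""

    def __init__(self, maxMemory=100, discount=0.9):
        self.memory = []            # list of [transition, gameOver] pairs
        self.maxMemory = maxMemory  # assumed capacity parameter
        self.discount = discount    # assumed discount factor (gamma)

    def remember(self, transition, gameOver):
        # transition = [currentState, action, reward, nextState]
        self.memory.append([transition, gameOver])
        if len(self.memory) > self.maxMemory:
            del self.memory[0]      # drop the oldest experience

    def getBatch(self, model, batchSize=10):
        firstState = self.memory[0][0][0]
        inputShape = firstState.shape[1:]
        nOutputs = model.predict(firstState).shape[-1]
        size = min(len(self.memory), batchSize)
        inputs = np.zeros((size,) + inputShape)
        targets = np.zeros((size, nOutputs))
        # Sample stored transitions uniformly at random (with replacement).
        picks = np.random.randint(0, len(self.memory), size=size)
        for i, idx in enumerate(picks):
            (currentState, action, reward, nextState), gameOver = self.memory[idx]
            inputs[i] = currentState[0]
            # Start from the model's own predictions so only the taken
            # action contributes to the training error.
            targets[i] = model.predict(currentState)[0]
            if gameOver:
                targets[i, action] = reward
            else:
                nextQ = np.max(model.predict(nextState)[0])
                targets[i, action] = reward + self.discount * nextQ
        return inputs, targets


class LinearModel:
    """Toy model exposing predict(); a placeholder for a real network."""

    def __init__(self, nInputs, nOutputs, seed=0):
        rng = np.random.default_rng(seed)
        self.weights = rng.normal(size=(nInputs, nOutputs))

    def predict(self, states):
        return states @ self.weights


model = LinearModel(nInputs=4, nOutputs=3)
memory = ReplayMemory(maxMemory=50, discount=0.9)
rng = np.random.default_rng(1)
for step in range(20):
    currentState = rng.normal(size=(1, 4))
    nextState = rng.normal(size=(1, 4))
    action = int(rng.integers(0, 3))
    memory.remember([currentState, action, 1.0, nextState], gameOver=(step == 19))

inputs, targets = memory.getBatch(model, batchSize=8)
print(inputs.shape, targets.shape)
```

The returned pair has one row per sampled transition, which is the shape a call such as train_on_batch(inputs, targets) would expect.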

Example #2

File: train.py Project: florianrougier/DQN_simple_games

        # Taking an action
        if np.random.rand() <= epsilon:
            action = np.random.randint(0, 3)
        else:
            qvalues = model.predict(currentState)[0]
            action = np.argmax(qvalues)

        # Updating the environment
        nextState[0], reward, gameOver, _ = env.step(action)
        env.render()

        totReward += reward

        # Remembering new experience, training the AI and updating current state
        dqn.remember([currentState, action, reward, nextState], gameOver)
        inputs, targets = dqn.getBatch(model, batchSize)
        model.train_on_batch(inputs, targets)

        currentState = nextState

    # Lowering epsilon and displaying the results
    epsilon *= epsilonDecayRate

    print('Epoch: ' + str(epoch) + ' Epsilon: {:.5f}'.format(epsilon) +
          ' Total Reward: {:.2f}'.format(totReward))

    rewards.append(totReward)
    totReward = 0

    plt.plot(rewards)
    plt.xlabel('Epoch')
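
The two excerpts lower epsilon differently: the first subtracts a fixed step and clamps at a floor, while the second multiplies by a decay factor (no floor is visible in the excerpt). A small sketch comparing the two schedules follows. The function names and the numeric values are illustrative assumptions, not taken from either project.

```python
def linearDecay(epsilon, decayRate, minEpsilon):
    """Subtract a fixed step per epoch and clamp at a floor (first excerpt's style)."""
    return max(epsilon - decayRate, minEpsilon)


def multiplicativeDecay(epsilon, decayRate):
    """Scale by a factor below 1 per epoch (second excerpt's style)."""
    return epsilon * decayRate


# Illustrative starting values and rates; real projects typically decay far more slowly.
linear, multiplicative = 1.0, 1.0
for epoch in range(1, 6):
    linear = linearDecay(linear, 0.25, 0.05)
    multiplicative = multiplicativeDecay(multiplicative, 0.5)
    print(epoch, linear, multiplicative)
```

With a linear schedule the agent reaches its floor after a fixed number of epochs and stays there, whereas a multiplicative schedule keeps shrinking toward zero unless a floor is added separately.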