def main():
    """Train a TPG agent population and visualize its learning progress.

    Actions are the agent's predictions about the right box:
        1 -> right box will fall off the edge
        0 -> right box will not fall off

    Evolves the population for GENERATIONS generations (scoring each agent
    over EVALUATIONS episodes per generation), replays the best agent
    FINAL_RUNS times with graphics enabled, then plots the per-generation
    min/max/avg scores with matplotlib.
    """
    # initiates trainer; two discrete actions (see docstring)
    trainer = TpgTrainer(actions=[0, 1], teamPopSize=50)

    _min, _max, _avg = [], [], []  # hold values for every generation

    for gen in range(GENERATIONS):  # generation loop
        print("Generation: ", gen + 1, "/", GENERATIONS)
        curScores = []  # new list per gen

        while True:  # loop to go through agents
            agent = trainer.getNextAgent()
            if agent is None:
                break  # no more agents, so proceed to next gen

            # evaluating the agent: total score across EVALUATIONS episodes
            score = 0
            for i in range(EVALUATIONS):
                score += evaluateAgent(agent)
            agent.reward(score)
            curScores.append(score)

        # compute each aggregate once (originally recomputed for the
        # print AND for the history lists)
        gen_min = min(curScores)
        gen_max = max(curScores)
        gen_avg = sum(curScores) / len(curScores)
        print("Min:", gen_min, " Max:", gen_max, " Avg:", gen_avg,
              "(out of " + str(EVALUATIONS) + ")\n")
        _min.append(gen_min)
        _max.append(gen_max)
        _avg.append(gen_avg)

        trainer.evolve()

    # getting best agent after all the generations
    best_agent, best_score = getBestAgent(trainer)
    print("Best agent's score:", best_score, "/", EVALUATIONS)
    for run in range(FINAL_RUNS):
        print("Final run: ", run + 1, "/", FINAL_RUNS, end='\r')
        evaluateAgent(best_agent, graphics=True)

    # plotting progress over the generations
    generations = range(1, GENERATIONS + 1)
    axes = plt.gca()
    axes.set_ylim([0, EVALUATIONS])  # scores are bounded by EVALUATIONS
    plt.plot(generations, _min, label="min")
    plt.plot(generations, _max, label="max")
    plt.plot(generations, _avg, label="avg")
    plt.xlabel("generation")
    plt.ylabel("score")
    plt.legend()
    plt.show()
for gen in range(100): # generation loop curScores = [] # new list per gen # get right env in envQueue game = gameQueue.pop() # take out last game print('playing on', game) env = envs[game] # re-get games list if len(gameQueue) == 0: gameQueue = list(allGames) random.shuffle(gameQueue) while True: # loop to go through agents teamNum = trainer.remainingAgents() agent = trainer.getNextAgent() if agent is None: break # no more agents, so proceed to next gen # check if agent already has score if agent.taskDone(): score = agent.getOutcome() else: state = env.reset() # get initial state and prep environment score = 0 valActs = range(env.action_space.n) for i in range(1000): act = agent.act(getState(state), valActs=valActs) # get action from agent