Python Agent.learn Examples

Programming language: Python

Namespace/package name: src.agent

Class/type: Agent

Method/function: learn

Examples at hotexamples.com: 2

Python Agent.learn - 2 examples found. These are the top-rated real-world Python examples of src.agent.Agent.learn, extracted from open source projects.

Example #1

File: train.py Project: yrpang/mindspore

    agent = Agent(**cfg)
    agent.load_dict()

    for episode in range(300):
        s0 = env.reset()
        total_reward = 1
        while True:
            a0 = agent.act(s0)
            s1, r1, done, _ = env.step(a0)

            # Penalize the transition that ends the episode
            if done:
                r1 = -1

            # Hand the transition (state, action, reward, next state) to the agent
            agent.put(s0, a0, r1, s1)

            if done:
                break

            total_reward += r1
            s0 = s1
            agent.learn()
        agent.load_dict()
        print("episode", episode, "total_reward", total_reward)

    # Create the checkpoint directory if needed, then save the policy network
    path = os.path.realpath(args.ckpt_path)
    os.makedirs(path, exist_ok=True)

    ckpt_name = os.path.join(path, "dqn.ckpt")
    save_checkpoint(agent.policy_net, ckpt_name)
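
The loop in Example #1 relies on an agent that exposes act, put, and learn, but the page does not show that class. The following is a minimal sketch of the same call order, not the project's implementation: ToyEnv, ReplayAgent, and run_episode are hypothetical names, the agent picks random actions, and learn only samples a batch from a replay buffer where a real agent would take a gradient step. Unlike the excerpt, the sketch starts total_reward at 0.

```python
import random
from collections import deque


class ToyEnv:
    """Hypothetical stand-in environment: every episode ends after 5 steps."""

    def reset(self):
        self.t = 0
        return 0  # initial state

    def step(self, action):
        self.t += 1
        done = self.t >= 5
        # Same 4-tuple shape the training loop unpacks: state, reward, done, info
        return self.t, 1.0, done, {}


class ReplayAgent:
    """Hypothetical agent exposing the act/put/learn calls used in the loop."""

    def __init__(self, n_actions=2, capacity=100, batch_size=4, seed=0):
        self.n_actions = n_actions
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop out first
        self.batch_size = batch_size
        self.rng = random.Random(seed)
        self.updates = 0

    def act(self, state):
        # Random policy; a real agent would query its policy network here
        return self.rng.randrange(self.n_actions)

    def put(self, s0, a0, r1, s1):
        self.buffer.append((s0, a0, r1, s1))

    def learn(self):
        # Do nothing until the buffer holds at least one full batch
        if len(self.buffer) < self.batch_size:
            return None
        batch = self.rng.sample(list(self.buffer), self.batch_size)
        self.updates += 1  # placeholder for a gradient step on the batch
        return batch


def run_episode(env, agent):
    """Run one episode with the same control flow as the excerpt above."""
    s0 = env.reset()
    total_reward = 0.0
    while True:
        a0 = agent.act(s0)
        s1, r1, done, _ = env.step(a0)
        if done:
            r1 = -1  # penalize the terminal transition, as in the excerpt
        agent.put(s0, a0, r1, s1)
        if done:
            break
        total_reward += r1
        s0 = s1
        agent.learn()
    return total_reward


agent = ReplayAgent()
total = run_episode(ToyEnv(), agent)
print("total_reward", total, "updates", agent.updates)
```

Note that the terminal transition is stored with its penalty before the loop breaks, so it is available for sampling in later calls to learn even though it is not added to total_reward.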

Example #2

    position_bounds,  # Position bounds
    velocity_bounds  # Velocity bounds
)

# Instantiate the agent
agent = Agent(
    policy,  # NeuralNetwork class
    model,
    actions,  # Actions array (after discretization)
    episodes,  # Max number of episodes
    epoches,  # Max number of epochs per episode
    greed_factor  # Greed factor
)

# Run training and collect the per-episode results
results = agent.learn()

# Successful episodes: final position reached the upper position bound
success_results = [x for x in results if x["state"][0] >= position_bounds[1]]

# Write the log file; the with block closes it even if a write fails
with open("model.log", "w") as log_file:
    log_file.write("Episodes: {0}\n".format(episodes))
    log_file.write("Epoches: {0}\n".format(epoches))
    log_file.write("Epsilon-Greedy: {0}\n".format(greed_factor))
    log_file.write("\n------- {0} successful episodes -------\n\n".format(
        len(success_results)))

    for r in success_results:
        log_file.write(str(r) + "\n")
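
The success filter in Example #2 depends on the shape of what agent.learn() returns, which the excerpt does not show. The sketch below assumes each result is a dict whose "state" entry is a (position, velocity) pair and that position_bounds is a (min, max) tuple. The bound values, the sample results, and the helper names successful and write_log are hypothetical, chosen only to make the filtering and logging steps runnable on their own.

```python
import io

# Assumed (min, max) position range; the excerpt does not show the real values.
position_bounds = (-1.2, 0.5)

# Hypothetical results: one dict per episode with the final (position, velocity).
results = [
    {"episode": 0, "state": (-0.4, 0.01)},
    {"episode": 1, "state": (0.5, 0.03)},
    {"episode": 2, "state": (0.52, 0.02)},
]


def successful(results, position_bounds):
    """Keep episodes whose final position reached the upper position bound."""
    return [r for r in results if r["state"][0] >= position_bounds[1]]


def write_log(stream, episodes, success_results):
    """Write the episode count, the success count, and each successful result."""
    stream.write("Episodes: {0}\n".format(episodes))
    stream.write("\n------- {0} successful episodes -------\n\n".format(
        len(success_results)))
    for r in success_results:
        stream.write(str(r) + "\n")


success_results = successful(results, position_bounds)

# An in-memory stream stands in for the log file so the sketch leaves no file behind
buffer = io.StringIO()
write_log(buffer, len(results), success_results)
print(buffer.getvalue())
```

Passing a stream rather than a path keeps the logging step easy to check; an open file object from open("model.log", "w") could be passed in the same way.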