Python NatureQN Examples

Programming language: Python

Namespace/package name: q3_nature

Class/type: NatureQN

Examples on hotexamples.com: 5

Python NatureQN - 5 examples found. These are the top-rated real-world Python examples of q3_nature.NatureQN, extracted from open source projects.

Frequently used methods

NatureQN(3)

evaluate(2)

initialize(2)

initialize_eval(1)

original_schedule(1)

record(1)

run(1)

Example #1

    config.lwf = args.lwf
    config.lwf_loss = args.lwf_loss
    config.lwf_weight = args.lwf_weight
    config.num_old_actions = int(args.num_old_actions)
    return config

def print_config(config):
    # Print every config attribute as "name = value", sorted by name.
    print('Current config:\n')
    for var, val in sorted(vars(config).items()):
        print(var + ' = ' + str(val))


if __name__ == '__main__':
    args = parse_args()
    my_config = modify_config(args)
    print_config(my_config)
    with tf.device('/gpu:' + str(args.gpu)):
        # make env
        env = gym.make(my_config.env_name)
        env = wrap_dqn(env)
        model = NatureQN(env, my_config)
        model.initialize_eval()
        model.evaluate()
        if my_config.record:
            model.record()

Example #2

File: q5_train_atari_nature.py Project: jaentrouble/rein_assignment

of the training is to use Tensorboard. The starter code writes summaries of different
variables.

To launch tensorboard, open a Terminal window and run 
tensorboard --logdir=results/
Then, connect remotely to 
address-ip-of-the-server:6006 
6006 is the default port used by tensorboard.
"""
if __name__ == '__main__':
    # make env
    starttime = time.time()
    env = gym.make(config.env_name)
    env = MaxAndSkipEnv(env, skip=config.skip_frame)
    env = PreproWrapper(env,
                        prepro=greyscale,
                        shape=(80, 80, 1),
                        overwrite_render=config.overwrite_render)

    # exploration strategy
    exp_schedule = LinearExploration(env, config.eps_begin, config.eps_end,
                                     config.eps_nsteps)

    # learning rate schedule
    lr_schedule = LinearSchedule(config.lr_begin, config.lr_end,
                                 config.lr_nsteps)

    # train model
    model = NatureQN(env, config)
    model.run(exp_schedule, lr_schedule)
    print('Total render time : {:.2f}'.format(time.time() - starttime))
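
The `LinearSchedule` and `LinearExploration` classes used above are not shown in this snippet. As a rough illustration of what a linear schedule typically does (anneal a value from a start to an end over a fixed number of steps, then hold it), here is a minimal sketch. The class name `LinearScheduleSketch` and its `update` method are assumptions made for illustration and may not match the project's actual code.

```python
class LinearScheduleSketch:
    """Hypothetical stand-in: linearly anneal a value over nsteps, then hold it."""

    def __init__(self, begin, end, nsteps):
        self.begin = begin
        self.end = end
        self.nsteps = nsteps
        self.value = begin

    def update(self, t):
        # Fraction of the annealing period that has elapsed, clamped to [0, 1].
        frac = min(max(t / float(self.nsteps), 0.0), 1.0)
        self.value = self.begin + frac * (self.end - self.begin)
        return self.value


if __name__ == '__main__':
    # Illustrative numbers only; the real values come from the config object.
    schedule = LinearScheduleSketch(begin=1.0, end=0.1, nsteps=100)
    for step in (0, 50, 100, 500):
        print(step, schedule.update(step))
```

With a schedule of this kind, both the exploration rate and the learning rate start high and decay at a constant rate until they reach their final values.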

Example #3

File: q5_train_atari_nature.py Project: lpierezan/rl_cs234

of the training is to use Tensorboard. The starter code writes summaries of different
variables.

To launch tensorboard, open a Terminal window and run 
tensorboard --logdir=results/
Then, connect remotely to 
address-ip-of-the-server:6006 
6006 is the default port used by tensorboard.
"""
if __name__ == '__main__':
    # make env
    env = gym.make(config.env_name)
    env = MaxAndSkipEnv(env, skip=config.skip_frame)
    env = PreproWrapper(env, prepro=greyscale, shape=(80, 80, 1), 
                        overwrite_render=config.overwrite_render)

    # exploration strategy
    exp_schedule = LinearExploration(env, config.eps_begin, 
            config.eps_end, config.eps_nsteps)

    # learning rate schedule
    lr_schedule  = LinearSchedule(config.lr_begin, config.lr_end,
            config.lr_nsteps)

    # train model
    model = NatureQN(env, config)
    model.original_schedule = config.original_schedule
    model.logger.info('original schedule: {}'.format(model.original_schedule))
    
    model.run(exp_schedule, lr_schedule)
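
The exploration strategy passed to `model.run` is not shown here. A common way to use an exploration rate in Q-learning is epsilon-greedy action selection: take a random action with probability epsilon, otherwise take the action with the highest estimated Q-value. The sketch below illustrates that idea only; the function name `epsilon_greedy` and its signature are hypothetical and are not taken from the project.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Hypothetical helper: pick an action index from a list of Q-values."""
    # Explore: with probability epsilon, choose uniformly at random.
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Exploit: otherwise choose the action with the largest Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == '__main__':
    q = [0.1, 0.9, 0.3]
    print(epsilon_greedy(q, epsilon=0.0))  # always greedy
    print(epsilon_greedy(q, epsilon=1.0))  # always random
```

Annealing epsilon over training shifts the agent gradually from exploring to exploiting its learned Q-values.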

Example #4

File: create_demos.py Project: ajoshi80/imitationlearning

env = PreproWrapper(env, prepro=greyscale, shape=(80, 80, 1),
                    overwrite_render=config.overwrite_render)

rewards = []

experts_meta_lis = [
    './core/checkpoints/q_learning/skip_connection/q5_train_atari_nature/deepdqn_weights/.meta',
    './core/checkpoints/q_learning/skip_connection/q5_train_atari_nature/resnet_weights/.meta',
    './core/checkpoints/policy_gradients/policy_network.ckpt.meta',
]
experts_chkpt_lis = [
    './core/checkpoints/q_learning/skip_connection/q5_train_atari_nature/deepdqn_weights/',
    './core/checkpoints/q_learning/skip_connection/q5_train_atari_nature/resnet_weights/',
    './core/checkpoints/policy_gradients/policy_network.ckpt',
]
experts = []

#temp_sess = None
for meta_path, chkpt_path in zip(experts_meta_lis, experts_chkpt_lis):
    print([n.name for n in tf.get_default_graph().as_graph_def().node])
    if "deepdqn" in meta_path:
        model = NatureQN(env, config)
    if "resnet" in meta_path:
        model = ResnetQN(env, config)
    if "policy" in meta_path:
        continue
    # if temp_sess == None:
    #temp_sess = model.sess
    model.initialize(meta_path, chkpt_path)
    experts.append(model)
    # with model.graph.as_default():

print("LOADED ALL MODELS")

for i in range(len(experts)):
    guide = experts[i]
    guide_experience = [[]]

Example #5

If so, please report your hyperparameters.

You'll find the results, log and video recordings of your agent every 250k under
the corresponding file in the results folder. A good way to monitor the progress
of the training is to use Tensorboard. The starter code writes summaries of different
variables.

To launch tensorboard, open a Terminal window and run 
tensorboard --logdir=results/
Then, connect remotely to 
address-ip-of-the-server:6006 
6006 is the default port used by tensorboard.
"""
if __name__ == '__main__':
    # make env
    env = gym.make(config.env_name)
    env = MaxAndSkipEnv(env, skip=config.skip_frame)
    env = PreproWrapper(env,
                        prepro=greyscale,
                        shape=(80, 80, 1),
                        overwrite_render=config.overwrite_render)

    # load model
    model = NatureQN(env, config)
    model.initialize()
    loaded = load_model(model)
    assert loaded != False, "Loading failed"

    # evaluate one episode of data
    model.evaluate(env, 1)