# batch train

```python
# NOTE: this block is assumed to run inside an episode loop that defines `ep`,
# with `env`, `actor`, `critic`, and `memory` (already holding transitions from
# earlier episodes) created beforehand; `import logging` and
# `import numpy as np` are assumed at the top of the script.
total_reward = 0
env.reset()

# take one random action to obtain an initial state
action = env.action_space.sample()
state, reward, done, _ = env.step(action)

for _ in range(1000):
    # training: update the critic and actor from a replayed mini-batch
    states, actions, rewards, next_states = memory.sample(20)
    next_actions = actor.get_actions(next_states)
    next_qs = critic.get_qs(next_states, next_actions)
    loss, q = critic.train(states, actions, rewards, next_qs)
    action_gradients = critic.get_action_gradients(states, actions)
    actor.train(states, action_gradients[0])

    # interaction: act in the environment and store the transition
    env.render()
    action = actor.get_action_for_train(state, ep)
    next_state, reward, done, _ = env.step(action)
    memory.add((state, action, reward, next_state))
    # print(state, action, reward, next_state)
    total_reward += reward
    # print(action, reward, total_reward)
    state = next_state
    if done:
        break

# if ep % 10 == 0:
#     critic.update_network_params()
logging.info('Episode: {} Total Reward: {:.4f} Q: {:.4f} loss: {:.4f}'.format(
    ep, total_reward, np.max(q), loss))
```
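The loop relies on a `memory` object exposing `add` and `sample`, whose implementation does not appear in this section. Below is a minimal sketch of a compatible uniform replay buffer; the `ReplayMemory` name, the `capacity` parameter, and its default value are illustrative assumptions, not the author's actual class.

```python
import random
from collections import deque

import numpy as np

class ReplayMemory:
    """Fixed-capacity buffer of (state, action, reward, next_state) tuples.

    Hypothetical stand-in for the `memory` object used in the training loop.
    """

    def __init__(self, capacity=10000):
        # a deque with maxlen silently evicts the oldest transition when full
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        # transition: a (state, action, reward, next_state) tuple
        self.buffer.append(transition)

    def sample(self, batch_size):
        # uniform random mini-batch; each field is stacked into a NumPy array,
        # matching the four-way unpacking of memory.sample(20) in the loop above
        batch = random.sample(list(self.buffer), batch_size)
        states, actions, rewards, next_states = map(np.array, zip(*batch))
        return states, actions, rewards, next_states
```

With a buffer like this, `memory.sample(20)` is only valid once at least 20 transitions have been stored, which is why a guard on the buffer size before the training step is common in practice.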