Python DQNAgent.Q_values Examples

Programming Language: Python

Namespace/Package Name: dqn

Class/Type: DQNAgent

Method/Function: Q_values

Examples at hotexamples.com: 1

Python DQNAgent.Q_values - 1 example found. This is the top-rated real-world Python example of dqn.DQNAgent.Q_values extracted from open source projects.

Frequently Used Methods

DQNAgent(30)

act(13)

load(11)

compile(8)

fit(5)

save(5)

train(5)

replay(5)

test(4)

save_weights(4)

remember(4)

get_action(4)

load_model(4)

actDeterministically(4)

epsilon(3)

save_model(3)

load_weights(3)

target_model(2)

observe(2)

start(2)

get_last_observations(2)

end(2)

train_one_episode(1)

train_model(1)

trainAgent(1)

train_only(1)

update_epoch(1)

update_replay_memory(1)

test_one_episode(1)

test_model(1)

update_target(1)

store_transition(1)

train_rnn(1)

testAgent(1)

update_target_model(1)

train_vae(1)

training(1)

restart_epoch(1)

store_experience(1)

load_state_dict(1)

__init__(1)

act_2(1)

append_sample(1)

backword(1)

fill_memory(1)

get_test_loss(1)

learn(1)

loss(1)

step(1)

parameters(1)

Example #1

                state_t_1, reward_t, terminal = env.step(action_t)
                total_reward += reward_t

                # store experience
                agent.store_experience(state_t, action_t, reward_t, state_t_1,
                                       terminal)
                print(agent.tmp_q_values, np.argmax(agent.tmp_q_values),
                      agent.enable_actions.index(action_t))

                # for log
                frame += 1
                steps += 1

                if steps > warmup:
                    loss += agent.current_loss
                    Q_max += np.max(agent.Q_values([state_t]))

            # experience replay
            # do not train during warmup
            if steps > warmup:
                agent.backword()

                # sync the target network once every n_update_target_network steps
                if steps % n_update_target_network == 0:
                    agent.update_target()

            print(
                "epoch: {:03d}/{:03d} |  loss: {:.4f} | Q_max: {:.4f} | total reward: {} | steps: {}"
                .format(e, n_epochs - 1, loss / frame, Q_max / frame,
                        total_reward, steps))

    except KeyboardInterrupt:
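
The fragment above calls agent.Q_values([state_t]) and reduces the result with np.max to log the largest Q-value. The original DQNAgent class is not shown on this page, so the following is only a minimal sketch of how such a call might behave. TinyAgent, its linear weights, and the assumed contract (a batch of states in, one Q-value per action out for the first state) are hypothetical stand-ins, not the project's implementation.

```python
import numpy as np


class TinyAgent:
    """Hypothetical stand-in for dqn.DQNAgent; not the original class."""

    def __init__(self, n_state_features, enable_actions, seed=0):
        rng = np.random.default_rng(seed)
        self.enable_actions = enable_actions
        # A linear "network": one row of weights per action (assumption).
        self.weights = rng.normal(size=(len(enable_actions), n_state_features))

    def Q_values(self, states):
        # Assumed contract: accept a batch of states and return the
        # Q-vector (one entry per action) for the first state.
        states = np.asarray(states, dtype=float)
        return self.weights @ states[0]


agent = TinyAgent(n_state_features=4, enable_actions=(0, 1, 2))
state_t = np.array([1.0, 0.0, 0.5, -0.5])

q = agent.Q_values([state_t])

# Greedy action and the value that the training loop accumulates as Q_max.
best_action = agent.enable_actions[int(np.argmax(q))]
q_max = float(np.max(q))

# Steps on which a "sync every N steps" modulo check fires, here with N = 5.
n_update_target_network = 5
update_steps = [s for s in range(1, 11) if s % n_update_target_network == 0]
```

Under these assumptions, q has one entry per enabled action, q_max is the entry selected by np.argmax, and update_steps contains only the multiples of n_update_target_network.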