Example #1
            # Start by extracting the necessary parameters (we use a vectorized implementation).
            state0_seq = []
            # state1_seq = []
            reward_seq = []
            action_seq = []
            terminal1_seq = []

            for e in experiences:
                state0_seq.append(e.state0)
                # state1_seq.append(e.state1)
                reward_seq.append(e.reward)
                action_seq.append(e.action)
                terminal1_seq.append(e.terminal1)

            state0_seq = dqn.process_state_batch(state0_seq)
            # state1_seq = dqn.process_state_batch(state1_seq)
            reward_seq = np.array(reward_seq)
            action_seq = np.array(action_seq, dtype=np.float32)
            terminal1_seq = np.array(terminal1_seq)

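            # Run the processed states through model_truncated to obtain the
            # hidden-state sequence for this experience window.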
            hidden_states_seq = model_truncated.predict_on_batch(state0_seq)

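            # All but the last timestep become inputs; the last timestep supplies
            # the targets (next hidden state, reward, terminal flag).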
            hstates[jj, ...] = hidden_states_seq[np.newaxis, :-1, :]
            actions[jj, ...] = action_seq[np.newaxis, :-1, np.newaxis]
            next_hstate[jj, ...] = hidden_states_seq[np.newaxis, -1, :]
            rewards[jj, ...] = reward_seq[np.newaxis, -1]
            terminals[jj, ...] = terminal1_seq[np.newaxis, -1]

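        # Fit ml_model to predict the next hidden state, reward, and terminal flag
        # from the hidden-state and action sequences.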
        ml_model.fit([hstates, actions], [next_hstate, rewards, terminals],
                     verbose=1)
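
The snippet above is a fragment: it assumes an enclosing loop over sampled experience windows (index jj), preallocated batch arrays, and numpy imported as np. Below is a minimal sketch of that assumed context; the sizes and the sample_experience_window helper are illustrative placeholders, not names taken from the original source.

import numpy as np

n_samples = 128      # assumed number of experience windows per training batch
seq_len = 8          # assumed window length (experiences per window)
hidden_dim = 256     # assumed width of model_truncated's output layer

hstates = np.zeros((n_samples, seq_len - 1, hidden_dim), dtype=np.float32)
actions = np.zeros((n_samples, seq_len - 1, 1), dtype=np.float32)
next_hstate = np.zeros((n_samples, hidden_dim), dtype=np.float32)
rewards = np.zeros((n_samples,), dtype=np.float32)
terminals = np.zeros((n_samples,), dtype=np.float32)   # 0/1 terminal flags

for jj in range(n_samples):
    # Hypothetical helper: returns seq_len consecutive Experience tuples
    # (state0, action, reward, state1, terminal1) from the replay memory.
    experiences = sample_experience_window(seq_len)
    # ... the body shown in the example above fills slice jj of each array ...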