Examples of Algorithm.choose_action in Python

Programming language: Python

Namespace / package name: algorithm

Class / type: Algorithm

Method / function: choose_action

Examples at hotexamples.com: 1

Algorithm.choose_action in Python: 1 example found. This is the top-rated real-world example of algorithm.Algorithm.choose_action in Python, extracted from open source projects.

Frequent Methods

Algorithm(30)

__init__(15)

distribute(14)

calculateShortestPath(2)

calculate_turnover2(2)

best_move(2)

apply_algorithm(2)

constructSpanningTree(2)

determine_language(2)

distance_to_time(2)

evolve(2)

parse_game_state(2)

polar_angle_sort(2)

next_step(1)

create_permutations(1)

segments(1)

choose_action(1)

client_token(1)

close(1)

save(1)

compute_center_pin(1)

compute_center_slit(1)

compute_completion_time(1)

rmse(1)

printFittest(1)

create_flow(1)

deleteLater(1)

custom_normalize(1)

min_point(1)

predictor(1)

population(1)

numberOfIterations(1)

generate_actions(1)

get_item_matrix(1)

get_user_matrix(1)

init(1)

initiate(1)

lastmodel(1)

load(1)

check_satisfiability(1)

AddEquity(1)

calculate_shortest_path(1)

Gaussian_blur(1)

Scaling(1)

SA(1)

Rotation(1)

MAXGEN(1)

KL_UCB(1)

Horizontal_wave(1)

FileCrypt(1)

Sharpening (1)

Double_wave (1)

Convolution2D (1)

Concave_effect (1)

Color_space_convert (1)

Bluring (1)

Bilateral_blur (1)

SetWarmUp (1)

StringCrypt (1)

calc_scroll (1)

add_trait (1)

bubble_sort (1)

booksData (1)

bestMoveTroops (1)

bestAttack (1)

bestAddingTroops (1)

advance (1)

add_country (1)

StringDecrypt (1)

addResultToLayer (1)

Example #1

File: agent.py Project: mtj11167/RL

    import matplotlib.pyplot as plt

    from algorithm import Algorithm


    class agent():
        def __init__(self, env, act_dim, state_dim, memory_capacity, epsilon, update_target):
            self.env = env
            self.algo = Algorithm(0.0001, 0.99, act_dim, state_dim, memory_capacity, epsilon, 64)
            self.memory_capacity = memory_capacity
            self.update_target = update_target

        def learn(self, epoch):
            reward_list = []
            # Interactive plotting so the reward curve refreshes after each episode.
            plt.ion()
            fig, ax = plt.subplots()
            for i in range(epoch):
                state = self.env.reset()
                ep_reward = 0
                while True:
                    self.env.render()
                    # Ask the algorithm for an action in the current state.
                    action = self.algo.choose_action(state)
                    next_state, reward, done, _ = self.env.step(action)
                    ep_reward += reward
                    self.algo.store_transition(state, action, reward, next_state, done)
                    # Start learning only once the replay memory is full.
                    if self.algo.memory_counter >= self.memory_capacity:
                        self.algo.learn()
                        # As written, this tests the total epoch count, not the episode index i.
                        if epoch % self.update_target == 0:
                            self.algo.sync_target()
                        if done:
                            print("episode: {} , the episode reward is {}".format(
                                i, round(ep_reward, 3)))
                    if done:
                        break
                    state = next_state
                reward_list.append(ep_reward)
                ax.set_xlim(0, epoch)
                ax.plot(reward_list, 'g-', label='total_loss')
                plt.pause(0.001)
            self.env.close()
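
The example only calls choose_action; the Algorithm class itself is not shown on this page. For readers unfamiliar with the pattern, here is a minimal sketch of what an epsilon-greedy choose_action could look like. Everything in it is an assumption made for illustration: the class name EpsilonGreedySketch, the q_function callable, the rng parameter, and the reading of epsilon as the probability of exploring. It is not the implementation used in mtj11167/RL.

```python
import random


class EpsilonGreedySketch:
    """Hypothetical stand-in for Algorithm; not the project's implementation."""

    def __init__(self, act_dim, epsilon, q_function, rng=None):
        self.act_dim = act_dim        # number of discrete actions
        self.epsilon = epsilon        # assumed: probability of exploring
        self.q_function = q_function  # assumed: callable, state -> act_dim values
        self.rng = rng or random.Random()

    def choose_action(self, state):
        # Explore: with probability epsilon, pick a uniformly random action.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.act_dim)
        # Exploit: otherwise pick the action with the highest estimated value.
        q_values = self.q_function(state)
        return max(range(self.act_dim), key=lambda a: q_values[a])
```

With epsilon set to 0 the sketch always returns the highest-valued action, and with epsilon set to 1 it always samples at random, which makes the two branches easy to check in isolation.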