Python GridWorld.get_reward usage examples

Programming language: Python

Namespace/Package: gridworld

Class/Type: GridWorld

Method/Function: get_reward

Examples on hotexamples.com: 1

Python GridWorld.get_reward: 1 example found. This is the top-rated Python code example for gridworld.GridWorld.get_reward, taken from open source projects.

Example #1

File: qlearn.py Project: rahular/rl


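import random

# GridWorld comes from the gridworld package named in the page header.
from gridworld import GridWorld

# qAgent and show_qtable are not shown in this excerpt; they are assumed to be
# defined earlier in qlearn.py.

if __name__ == '__main__':
    max_steps = 100   # step limit per episode
    max_iters = 1000  # number of training episodes
    seed = random.randint(0, 100)
    agent = qAgent()
    grid = GridWorld(size=8, force_fast=True, seed=seed)
    grid.show()
    print()
    for episode in range(max_iters):  # renamed from `iter`, which shadows a builtin
        agent.set_grid(grid)
        i, j = 0, 0  # initial state
        cum_reward = 0
        for step in range(max_steps):
            action = agent.get_action(i, j)
            new_i, new_j = grid.move(i, j, action)
            # get_reward returns the reward and a flag marking a terminal cell
            reward, is_final = grid.get_reward(i, j)
            cum_reward += reward
            agent.update_q(i, j, new_i, new_j, action, reward)
            if is_final:
                break
            i, j = new_i, new_j
        if episode % 100 == 0:
            # step is zero-based, so step + 1 is the number of steps taken
            print(
                'Episode {} finished after {} steps with cumulative reward of {}'
                .format(episode, step + 1, cum_reward))
        # Rebuild the same grid (same seed) for the next episode
        grid = GridWorld(size=8, force_fast=True, seed=seed)
    print()
    show_qtable(agent, grid.size)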
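
The excerpt above does not include the GridWorld or qAgent classes, so it cannot be run on its own. The following is a minimal sketch of what a compatible pair might look like. It is not the rahular/rl implementation. The names TinyGridWorld, TinyQAgent, ACTIONS and run_episode are hypothetical, as are the reward values (10 at the goal, -1 elsewhere), the hyperparameters, and the extra is_final argument to update_q. It keeps only the call shapes seen above: move(i, j, action) returns a new cell, and get_reward(i, j) returns a (reward, is_final) pair. One deliberate difference: this sketch queries the reward for the cell the agent lands in, which is one common convention, whereas the example above queries (i, j).

```python
import random

# Hypothetical action encoding: up, down, left, right as (row, column) offsets.
ACTIONS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}


class TinyGridWorld:
    """Stand-in grid: goal in the bottom-right corner, -1 reward per step."""

    def __init__(self, size=4):
        self.size = size
        self.goal = (size - 1, size - 1)

    def move(self, i, j, action):
        di, dj = ACTIONS[action]
        # Clamp to the grid so a move into the border leaves the agent in place.
        new_i = min(max(i + di, 0), self.size - 1)
        new_j = min(max(j + dj, 0), self.size - 1)
        return new_i, new_j

    def get_reward(self, i, j):
        """Return (reward, is_final) for cell (i, j)."""
        if (i, j) == self.goal:
            return 10, True
        return -1, False


class TinyQAgent:
    """Stand-in tabular Q-learning agent with epsilon-greedy action choice."""

    def __init__(self, size, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.q = {(i, j): [0.0] * 4 for i in range(size) for j in range(size)}
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def get_action(self, i, j):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(4)  # explore
        values = self.q[(i, j)]
        return values.index(max(values))  # exploit (first best action on ties)

    def update_q(self, i, j, new_i, new_j, action, reward, is_final=False):
        # Standard Q-learning target; no bootstrapping from a terminal cell.
        future = 0.0 if is_final else self.gamma * max(self.q[(new_i, new_j)])
        old = self.q[(i, j)][action]
        self.q[(i, j)][action] = old + self.alpha * (reward + future - old)


def run_episode(grid, agent, max_steps=100):
    """Run one episode from (0, 0); return (cumulative reward, steps taken)."""
    i, j = 0, 0
    cum_reward = 0
    for step in range(max_steps):
        action = agent.get_action(i, j)
        new_i, new_j = grid.move(i, j, action)
        reward, is_final = grid.get_reward(new_i, new_j)
        cum_reward += reward
        agent.update_q(i, j, new_i, new_j, action, reward, is_final)
        if is_final:
            break
        i, j = new_i, new_j
    return cum_reward, step + 1


grid = TinyGridWorld(size=4)
agent = TinyQAgent(size=4)
for _ in range(500):
    run_episode(grid, agent)

agent.epsilon = 0.0  # act greedily for one evaluation episode
cum_reward, steps = run_episode(grid, agent)
print('Greedy episode: {} steps, cumulative reward {}'.format(steps, cum_reward))
```

Returning the terminal flag together with the reward lets the training loop stop an episode without a separate is_terminal call, which matches how the example above unpacks get_reward.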