def __init__(self):
    """Wrap a CartPoleEnv; the action space bundles the real action
    together with NUM_PREDICTED_ACTIONS predicted future actions,
    all drawn from the underlying CartPole action space."""
    super(PredictActionsCartpoleEnv, self).__init__()
    self.cartpole = CartPoleEnv()
    self.observation_space = self.cartpole.observation_space
    # One slot for the actual action plus one per predicted action.
    action_slots = (self.cartpole.action_space,) * (NUM_PREDICTED_ACTIONS + 1)
    self.action_space = spaces.Tuple(action_slots)
def __init__(self):
    """Wrap a CartPoleEnv; an action here is the real CartPole action
    followed by NUM_PREDICTED_OBSERVATIONS predicted future
    observations (each shaped like the observation space)."""
    super(PredictObsCartpoleEnv, self).__init__()
    self.cartpole = CartPoleEnv()
    self.observation_space = self.cartpole.observation_space
    # First element: the real action; remaining elements: predictions.
    predicted_slots = (self.cartpole.observation_space,) * NUM_PREDICTED_OBSERVATIONS
    self.action_space = spaces.Tuple((self.cartpole.action_space,) + predicted_slots)
def cartpole_env(env_id=1, **kwargs):
    """Build a GymEnvWrapper around a fresh CartPoleEnv.

    ``env_id`` is accepted for interface compatibility with sibling
    factories but is not used here. All remaining keyword arguments are
    forwarded to the ``CartPoleEnv`` constructor.
    """
    base_env = CartPoleEnv(**kwargs)
    return GymEnvWrapper(base_env, act_null_value=0)
import gym
import numpy as np
import math
from gym.envs.classic_control.cartpole import CartPoleEnv
from envs.task import Task

# Probe a throwaway CartPoleEnv once at import time to copy its termination
# thresholds into module constants, then delete the instance so no env object
# lingers at module scope.
_env = CartPoleEnv()
_X_THRESHOLD = _env.x_threshold
_THETA_THRESHOLD = _env.theta_threshold_radians
del _env


class CartPoleBalanceTask(Task):
    """Reward function for plain pole balancing."""

    def __call__(self, states, actions, next_states):
        # NOTE(review): assumes states/next_states are batched torch tensors
        # (is_done appears to return a boolean tensor with .float()) — confirm
        # against GYMMB_ContinuousCartPole.is_done and callers.
        next_dones = GYMMB_ContinuousCartPole.is_done(next_states)
        dones = GYMMB_ContinuousCartPole.is_done(states)
        # Reward drops to 0.0 only when BOTH the current and the next state
        # exceed the termination criteria; otherwise it stays at 1.0.
        rewards = 1.0 - next_dones.float() * dones.float()  # basically, you get always 1.0 unless you exceed the is_done termination criteria
        return rewards


class CartPoleSpeedyBalanceTask(Task):
    """Reward function that pays for cart speed rather than mere survival."""

    def __call__(self, states, actions, next_states):
        # Column 1 is the cart velocity (x_dot) in the classic CartPole state.
        rewards = states[:, 1].abs()  # more rewards for abs velocity. No free points
        return rewards


class GYMMB_ContinuousCartPole(CartPoleEnv):
    """ A continuous version. Observation space:
# NOTE(review): the debug lines below are the tail of an update/training-step
# function whose `def` header lies outside this fragment; their true
# indentation level cannot be recovered from here.
logging.debug("actions: %s\n%s", action_tensor.shape, action_tensor)
logging.debug("log_prob: %s\n%s", log_prob_tensor.shape, log_prob_tensor)
logging.debug("log_prob test: %s\n%s", prob_tensor.shape, torch.log(prob_tensor))
logging.debug("rewards: %s\n%s", rewards_tensor.shape, rewards_tensor)
logging.debug("loss: %s\n%s", loss.shape, loss)


if __name__ == "__main__":
    # Create environment. Note that make actually wraps the actual environment.
    # The wrapper will end an episode based on a time limit
    # env = gym.make("CartPole-v0")
    # state = [x, x_dot, theta, theta_dot], actions = [left_or_right] 0 = left 1 = right
    env = CartPoleEnv()  # instanciate the env directly, without the wrapper
    # get the state and action size from the environement
    state_size = env.observation_space.shape[0]
    action_size = env.action_space.n
    # instanciate agent
    agent = PolicyGradientAgent(state_size, action_size)
    # `done = True` forces a reset on the first pass of the loop below.
    done = True
    last_render_time = time.time()
    episode_count = -1
    # forever — training loop body continues past the end of this fragment
    while True:
        # if episode is done