import numpy as np
import torch

# Launch the env with our helper function
env = launch_env()

# Wrappers
env = ResizeWrapper(env)
env = NormalizeWrapper(env)
env = ImgWrapper(env)  # to make the images from 160x120x3 into 3x160x120
env = ActionWrapper(env)
# env = DtRewardWrapper(env)  # not during testing

state_dim = env.observation_space.shape
action_dim = env.action_space.shape[0]
max_action = float(env.action_space.high[0])

# Initialize policy
policy = DDPG(state_dim, action_dim, max_action, net_type="cnn")
policy.load(file_name, directory="./pytorch_models")

with torch.no_grad():
    while True:
        obs = env.reset()
        env.render()
        rewards = []
        while True:
            action = policy.predict(np.array(obs))
            obs, rew, done, misc = env.step(action)
            rewards.append(rew)
            env.render()
            if done:
                break
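# --- Hedged sketch: the observation wrappers are not shown in this listing.
# The classes below are illustrative assumptions only; they follow the usage
# above (pixel observations scaled to [0, 1] and transposed from HxWxC to
# CxHxW for PyTorch). ResizeWrapper/ActionWrapper would be analogous
# gym.ObservationWrapper / gym.ActionWrapper subclasses.
import gym
import numpy as np


class NormalizeWrapper(gym.ObservationWrapper):
    """Scale uint8 pixel observations into the [0, 1] range (assumed behaviour)."""

    def observation(self, observation):
        return observation.astype(np.float32) / 255.0


class ImgWrapper(gym.ObservationWrapper):
    """Transpose observations from HxWxC to CxHxW so they fit a PyTorch CNN."""

    def __init__(self, env):
        super().__init__(env)
        h, w, c = self.observation_space.shape
        self.observation_space = gym.spaces.Box(
            low=0.0, high=1.0, shape=(c, h, w), dtype=np.float32)

    def observation(self, observation):
        return np.transpose(observation, (2, 0, 1))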
noise_sigma = 0.2
noise = OUNoise(mu=np.zeros(action_dim), sigma=noise_sigma)

# Load VAE
image_dimensions = 3 * 160 * 120
feature_dimensions = 1000
encoding_dimensions = 40
vae = VAE(image_dimensions, feature_dimensions, encoding_dimensions, 'selu')

if use_pr:
    replay_buffer = utils.PrioritizedReplayBuffer(args.replay_buffer_max_size)
else:
    replay_buffer = utils.ReplayBuffer(args.replay_buffer_max_size)

# Initialize policy
policy = DDPG(state_dim, action_dim, max_action, replay_buffer, net_type="vae", vae=vae)

# Evaluate untrained policy
evaluations = [evaluate_policy(env, policy)]
exp.metric("rewards", evaluations[0])

total_timesteps = 0
timesteps_since_eval = 0
episode_num = 0
done = True
episode_reward = None
env_counter = 0

while total_timesteps < args.max_timesteps:
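# --- Hedged sketch: the OUNoise class used above is not shown here. The
# version below is a conventional Ornstein-Uhlenbeck process, as commonly used
# for DDPG exploration; the defaults (theta=0.15, dt=1e-2) are assumptions, not
# the actual values of the project's implementation.
import numpy as np


class OUNoise:
    def __init__(self, mu, sigma=0.2, theta=0.15, dt=1e-2):
        self.mu = mu
        self.sigma = sigma
        self.theta = theta
        self.dt = dt
        self.reset()

    def reset(self):
        # Restart the process from the mean.
        self.x_prev = np.copy(self.mu)

    def __call__(self):
        # dx = theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
        x = (self.x_prev
             + self.theta * (self.mu - self.x_prev) * self.dt
             + self.sigma * np.sqrt(self.dt) * np.random.normal(size=self.mu.shape))
        self.x_prev = x
        return x


# Typical use during training: perturb the deterministic action, then clip it.
# action = np.clip(policy.predict(obs) + noise(), -max_action, max_action)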
# Wrappers
env = ResizeWrapper(env)
env = NormalizeWrapper(env)
env = ImgWrapper(env)  # to make the images from 160x120x3 into 3x160x120
env = ActionWrapper(env)
env = DtRewardWrapper(env)

# Set seeds
seed(args.seed)

state_dim = env.observation_space.shape
action_dim = env.action_space.shape[0]
max_action = float(env.action_space.high[0])

# Initialize policy
policy = DDPG(state_dim, action_dim, max_action, net_type="cnn")

replay_buffer = utils.ReplayBuffer(args.replay_buffer_max_size)

# Evaluate untrained policy
evaluations = [evaluate_policy(env, policy)]
exp.metric("rewards", evaluations[0])

total_timesteps = 0
timesteps_since_eval = 0
episode_num = 0
done = True
episode_reward = None
env_counter = 0

while total_timesteps < args.max_timesteps:
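# --- Hedged sketch: a typical evaluate_policy helper matching the call above.
# The real helper is not shown in this listing; this assumed version runs the
# deterministic policy (no exploration noise) for a fixed number of episodes
# and returns the mean episode reward. eval_episodes and max_timesteps are
# illustrative parameters.
import numpy as np


def evaluate_policy(env, policy, eval_episodes=10, max_timesteps=500):
    avg_reward = 0.0
    for _ in range(eval_episodes):
        obs = env.reset()
        done = False
        step = 0
        while not done and step < max_timesteps:
            action = policy.predict(np.array(obs))
            obs, reward, done, _ = env.step(action)
            avg_reward += reward
            step += 1
    return avg_reward / eval_episodes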
    camera_height=480,
    accept_start_angle_deg=4,
    full_transparency=True,
    distortion=True,
    randomize_maps_on_reset=True,
    draw_curve=False,
    draw_bbox=False,
    frame_skip=4,
    draw_DDPG_features=False)

state_dim = env.get_features().shape[0]
action_dim = env.action_space.shape[0]
max_action = float(env.action_space.high[0])

# Initialize policy
expert = DDPG(state_dim, action_dim, max_action, net_type="dense")
expert.load("model-here", directory="../duckietown_rl/pytorch_models", for_inference=True)

# Initialize the environment
env.reset()
# Get features (state representation) for RL agent
obs = env.get_features()

EPISODES, STEPS = 20, 1000
DEBUG = False  # please notice

logger = Logger(env, log_file=f'train-{int(EPISODES * STEPS / 1000)}k.log')
start_time = time.time()
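# --- Hedged sketch: the expert rollout / logging loop this setup leads into.
# The Logger interface is an assumption (the per-step log() call and the final
# close() are illustrative, not the actual API); the loop structure simply
# follows EPISODES, STEPS and the expert/feature usage above.
for episode in range(EPISODES):
    for step in range(STEPS):
        # Query the expert policy on the dense feature representation.
        action = expert.predict(np.array(obs))
        # Step the simulator; the returned observation is the camera frame.
        observation, reward, done, info = env.step(action)
        obs = env.get_features()

        if DEBUG:
            env.render()

        # Assumed logging call: store the frame together with the expert action.
        logger.log(observation, action, reward, done, info)

        if done:
            break

    # Start the next episode from a fresh reset and refresh the features.
    env.reset()
    obs = env.get_features()

logger.close()
print(f"Finished collecting data in {time.time() - start_time:.1f}s")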