# # Example #1
# +
# For the purpose of generating the notebook in a reproducible way
# logs download has been commented out.
logs = [('logs/deepracer-eval-sim-sample.log', 'sim-sample')]

# logs = cw.download_all_logs(
#     'logs/deepracer-eval-',
#     '/aws/deepracer/leaderboard/SimulationJobs',
#     not_older_than="2019-07-01 07:00",
#     older_than="2019-07-01 12:00"
# )
# -

# Loads all the logs from the above time range
bulk = slio.load_a_list_of_logs(logs)

# ## Parse logs and visualize
#
# You will notice that reward graphs are missing here, as are many other graphs from the training notebook. These have been trimmed down for clarity.
#
# Do not be misled though - this notebook provides features that the training one doesn't have, such as batch visualisation of race submission laps.
#
# Side note: Evaluation/race logs contain a reward field, but it is not connected to your reward function. It is most likely there to keep the log structure consistent and easier to parse. The value appears to depend on the distance of the car from the centre of the track. As such it provides no value and is not visualised in this notebook.

# +
simulation_agg = au.simulation_agg(bulk,
                                   'stream',
                                   add_timestamp=True,
                                   is_eval=True)
complete_ones = simulation_agg[simulation_agg['progress'] == 100]
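# A quick, hypothetical sanity check (the 'time' column name is an assumption
# about the aggregation above, not confirmed in this notebook): list the five
# fastest complete laps across all loaded submissions.
complete_ones.nsmallest(5, 'time')
# -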
# # Example #2
# * decision about throttle (speed value from your action space)
# * decision index (value from your action space)
# * reward value
# * is the lap complete
# * are all wheels on track?
# * progress in the lap
# * closest waypoint
# * track length
# * timestamp
#
# `slio.load_data` and then `slio.convert_to_pandas` read the log and prepare it for your usage. Sorting the values may not be needed, but under some circumstances I have seen log lines that were not ordered properly.

# +
EPISODES_PER_ITERATION = 20  # Set to the value of this hyperparameter used in your training

data = slio.load_data(fname)
df = slio.convert_to_pandas(data,
                            episodes_per_iteration=EPISODES_PER_ITERATION)

df = df.sort_values(['episode', 'steps'])
# Personally I think normalizing can mask excessively high rewards, so I am leaving it
# commented out, but you might want it.
# slio.normalize_rewards(df)

# Uncomment the line of code below to evaluate a different reward function
# nr.new_reward(df, track.center_line, 'reward.reward_sample') #, verbose=True)
# -

# ## New reward
#
# Note the commented-out line above: it takes a reward class from log-analysis/rewards, imports it, instantiates it, and recalculates the reward values based on the data from the log. This lets you do some testing before you start training and rule out some obvious problems.
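#
# Below is a minimal, hypothetical sketch of what such a module (for example the
# `reward.reward_sample` referenced above) could contain. It assumes the loader
# instantiates a `Reward` class and calls its `reward_function(params)` method for
# each log entry, with a params dictionary mirroring the one DeepRacer passes during
# training - check the rewards folder for the actual convention before relying on this.

# +
# Hypothetical sketch of a rewards module - not the actual file from this repository.
class Reward:
    def __init__(self, verbose=False):
        self.verbose = verbose

    def reward_function(self, params):
        # Reward staying close to the centre line; 'track_width' and
        # 'distance_from_center' are standard DeepRacer input parameters.
        track_width = params['track_width']
        distance_from_center = params['distance_from_center']

        reward = 1.0 - (distance_from_center / (track_width / 2.0))
        if self.verbose:
            print('distance: {}, reward: {}'.format(distance_from_center, reward))
        return float(max(reward, 1e-3))
# -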

# + jupyter={"source_hidden": true}
simulation_agg = au.simulation_agg(df)
au.analyze_training_progress(simulation_agg, title='Training progress')
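# A hypothetical follow-up (the 'iteration' and 'reward' column names are assumptions
# about the aggregation, not confirmed here): plot the mean episode reward per
# iteration to eyeball whether training is still improving.
simulation_agg.groupby('iteration')['reward'].mean().plot(
    title='Mean episode reward per iteration')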