Example #1
# ### Basic Agent Simulation Results
# To obtain results from the initial simulation, you will need to adjust the following flags:
# - `'enforce_deadline'` - Set this flag to `True` to force the driving agent to capture whether it reaches the destination within a reasonable time.
# - `'update_delay'` - Set this flag to a small value (such as `0.01`) to reduce the time between steps in each trial.
# - `'log_metrics'` - Set this flag to `True` to log the simulation results as a `.csv` file in the `/logs/` directory.
# - `'n_test'` - Set this flag to `'10'` to perform 10 testing trials.
# 
# Optionally, you may also disable the visual simulation (which can make the trials run faster) by setting the `'display'` flag to `False`. Flags that were set while debugging will revert to their default settings. It is important to understand each flag and how it affects the simulation.
# 
# Once you have successfully completed the initial simulation (with 20 training trials and 10 testing trials), run the code cell below to visualize the results. Note that log files are overwritten when identical simulations are run, so pay attention to which log file is being loaded! Run the agent.py file under projects/smartcab.

# In[5]:


# Load the 'sim_no-learning' log file from the initial simulation results
vs.plot_trials('sim_no-learning.csv')
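
# For reference, the flag changes described above are made inside `run()` in `agent.py`.
# The sketch below shows roughly what that configuration looks like for the basic
# (non-learning) agent; the import paths and the `verbose=False` setting are assumptions
# for illustration, so adapt them to your copy of the project rather than treating this
# as the exact project code. LearningAgent is the agent class defined in agent.py itself.

# --- sketch of the body of run() in smartcab/agent.py for the basic agent ---
from environment import Environment
from simulator import Simulator

env = Environment(verbose=False)
agent = env.create_agent(LearningAgent)               # learning stays off for the basic agent
env.set_primary_agent(agent, enforce_deadline=True)   # capture whether the deadline is met
sim = Simulator(env,
                update_delay=0.01,                    # small delay between steps
                log_metrics=True,                     # write results to /logs/ as a .csv
                display=False)                        # optionally disable the GUI
sim.run(n_test=10)                                    # perform 10 testing trials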


# ### Question 3
# Using the visualization above from your initial simulation, provide an analysis and make several observations about the driving agent. Be sure that for each panel of the visualization you make at least one observation. Some situations you could consider:
# - *How frequently is the driving agent making bad decisions? How many of those bad decisions cause accidents?*
# - *Given that the agent is driving randomly, does the rate of reliability make sense?*
# - *What kind of rewards is the agent receiving for its actions? Do the rewards suggest it has been penalized heavily?*
# - *As the number of trials increases, does the outcome of the results change significantly?*
# - *Would this smartcab be considered safe and/or reliable by its passengers? Why or why not?*

# -----
# ## Inform the Driving Agent
# The second step to creating an optimized Q-Learning driving agent is defining a set of states the agent can occupy in the environment. Depending on the inputs, sensory data, and variables available to the driving agent, a set of states can be defined so that the agent can eventually *learn* which action it should take in a given state. The condition of `'if this state then that action'` for each state is called a **policy**, and it is ultimately what the driving agent must learn. Without defined states, the driving agent would never understand which action is optimal, or even which environmental variables and conditions it should pay attention to!
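
# As an illustration of the idea above, the sketch below shows a hypothetical `build_state()` helper and one entry of a learned policy mapping a state to an action. The input keys (`'light'`, `'oncoming'`, `'left'`), the waypoint value, and the toy Q-table are assumptions for illustration only; the actual inputs come from the project's environment.

def build_state(waypoint, inputs):
    """Collapse the relevant sensory inputs into a hashable state tuple."""
    return (waypoint, inputs['light'], inputs['oncoming'], inputs['left'])

# One entry of a policy: 'if this state then that action'.
state = build_state('forward', {'light': 'green', 'oncoming': None, 'left': None})
Q = {state: {None: 0.0, 'forward': 2.1, 'left': -4.3, 'right': 0.4}}  # toy Q-table entry
best_action = max(Q[state], key=Q[state].get)  # -> 'forward'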

# ### 识别状态
Example #2
# -*- coding: utf-8 -*-
"""
Created on Sun May 21 19:40:11 2017

@author: ECOWIZARD
"""

# Import the visualization code
import visuals as vs

vs.plot_trials('sim_no-learning.csv')
Example #3
# Assumed imports: this run() is meant to live in the smartcab project's
# agent.py (where the custom LearningAgent is defined); adjust the module
# paths below to match your project layout.
from environment import Environment
from simulator import Simulator
from visuals import plot_trials


def run():
    """ Driving function for running the simulation. 
        Press ESC to close the simulation, or [SPACE] to pause the simulation. """
    # constant = 0.9957
    # alpha = 0.2
    tolerance = 0.01

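    # Grid-search candidate epsilon-decay constants (forwarded to the custom
    # LearningAgent via the `constant` keyword) and learning rates alpha.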
    for constant in [
            0.0078, 0.0052, 0.0039, 0.0031, 0.0026, 0.0022, 0.0019, 0.0017
    ]:
        for alpha in [0.2, 0.5, 0.8]:
            good_counter = 0
            for n in range(20):
                ##############
                # Create the environment
                # Flags:
                #   verbose     - set to True to display additional output from the simulation
                #   num_dummies - discrete number of dummy agents in the environment, default is 100
                #   grid_size   - discrete number of intersections (columns, rows), default is (8, 6)
                env = Environment(verbose=True)
                ##############
                # Create the driving agent
                # Flags:
                #   learning   - set to True to force the driving agent to use Q-learning
                #    * epsilon - continuous value for the exploration factor, default is 1
                #    * alpha   - continuous value for the learning rate, default is 0.5
                agent = env.create_agent(
                    LearningAgent,
                    learning=True,
                    alpha=alpha,
                    constant=constant)
                ##############
                # Follow the driving agent
                # Flags:
                #   enforce_deadline - set to True to enforce a deadline metric
                env.set_primary_agent(agent, enforce_deadline=True)
                ##############
                # Create the simulation
                # Flags:
                #   update_delay - continuous time (in seconds) between actions, default is 2.0 seconds
                #   display      - set to False to disable the GUI if PyGame is enabled
                #   log_metrics  - set to True to log trial and simulation results to /logs
                #   optimized    - set to True to change the default log file name
                sim = Simulator(
                    env,
                    update_delay=0,
                    log_metrics=True,
                    display=False,
                    optimized=True)
                ##############
                # Run the simulator
                # Flags:
                #   tolerance  - epsilon tolerance before beginning testing, default is 0.05
                #   n_test     - discrete number of testing trials to perform, default is 0
                sim.run(n_test=100, tolerance=tolerance)

                safety_rating, reliability_rating = plot_trials(
                    'sim_improved-learning.csv')

                if safety_rating in ['A+', 'A'
                                     ] and reliability_rating in ['A', 'A+']:
                    good_counter += 1
                else:
                    break

            # Record the search result: decay constant, learning rate, the
            # agent's `counter` attribute, and how many consecutive runs (up
            # to 20) achieved A/A+ safety and reliability ratings.
            with open('result.txt', 'a+') as f:
                f.write('{}, {}, {}, {}\n'.format(constant, alpha, agent.counter,
                                                  good_counter))
Example #4
    ##############
    # Create the simulation
    # Flags:
    #   update_delay - continuous time (in seconds) between actions, default is 2.0 seconds
    #   display      - set to False to disable the GUI if PyGame is enabled
    #   log_metrics  - set to True to log trial and simulation results to /logs
    #   optimized    - set to True to change the default log file name

    # FAI5100 - Updated the time delay between each time step.
    update_delay = 0.01

    # FAI5100 - Set this to True to log the simulation results as a .csv file in /logs/.
    log_metrics = True

    sim = Simulator(env,
                    update_delay=update_delay,
                    display=True,
                    log_metrics=log_metrics,
                    optimized=True)

    ##############
    # Run the simulator
    # Flags:
    #   tolerance  - epsilon tolerance before beginning testing, default is 0.05

    # FAI5100 - Set the n_test flag to run 10 testing trials.

    sim.run(n_test=10, tolerance=0.001)


if __name__ == '__main__':
    run()
    vs.plot_trials('sim_improved-learning.csv')