    evaluateOn='action',
    periodicity=1,
    resetEvery='none'))

# We wish to discover, among all versions of our neural network (i.e., after every training epoch), which one
# seems to generalize best, i.e., which one has the highest validation score. Here, we do not care about the
# "true generalization score", or "test score".
# To achieve this goal, one can use the FindBestController along with an InterleavedTestEpochController. It is
# important that validationID is the same as the id argument of the InterleavedTestEpochController.
# The FindBestController will dump on disk the validation scores for each and every network, as well as the
# structure of the neural network having the best validation score. These dumps can then be used to plot the
# evolution of the validation and test scores (see below), or simply to recover the resulting neural network
# for your application.
agent.attach(bc.FindBestController(
    validationID=ALE_env.VALIDATION_MODE,
    testID=None,
    unique_fname=fname))

# All previous controllers control the agent during the epochs it goes through. However, we want to interleave
# a "validation epoch" between each training epoch ("one out of two epochs", hence the periodicity=2). We do
# not want these validation epochs to interfere with the training of the agent, which is handled by the
# TrainerController, the EpsilonController and the like. Therefore, we disable these controllers for the whole
# duration of the validation epochs interleaved this way, using the controllersToDisable argument of the
# InterleavedTestEpochController. For each validation epoch, we also want to display the sum of all rewards
# obtained, hence showScore=True. Finally, we want to call the summarizePerformance method of ALE_env every
# [parameters.period_btw_summary_perfs] *validation* epochs.
agent.attach(bc.InterleavedTestEpochController(
    id=ALE_env.VALIDATION_MODE,
    epochLength=parameters.steps_per_test,
    controllersToDisable=[0, 1, 2, 3, 4],
    periodicity=2,
    showScore=True,
    summarizeEvery=parameters.period_btw_summary_perfs))
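# For reference, the integers in controllersToDisable index the controllers in the order they were attached to
# the agent earlier in this script. A minimal sketch of that mapping, assuming the usual attach order of this
# example (the names below are the standard deer controllers; adjust the indices if your script attaches them
# in a different order):
CONTROLLER_INDEX = {
    0: "bc.VerboseController",         # periodic progress printouts
    1: "bc.TrainerController",         # triggers the actual training steps
    2: "bc.LearningRateController",    # decays the learning rate over time
    3: "bc.DiscountFactorController",  # anneals the discount factor
    4: "bc.EpsilonController",         # anneals the exploration rate epsilon
}
# controllersToDisable=[0, 1, 2, 3, 4] thus suspends all five of these controllers for the whole duration of
# every validation epoch, so that no training, annealing or verbose output happens while the agent is evaluated.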
    periodicity=1,
    resetEvery='none'))

# We wish to discover, among all versions of our neural network (i.e., after every training epoch), which one
# seems to generalize best, i.e., which one has the highest validation score. However, we also want to keep
# track of a "true generalization score", the "test score". Indeed, what if we overfit the validation score?
# To achieve these goals, one can use the FindBestController along with two InterleavedTestEpochControllers,
# one for each mode (validation and test). It is important that validationID and testID are the same as the id
# arguments of the two InterleavedTestEpochControllers (implementing the validation mode and test mode,
# respectively). The FindBestController will dump on disk the validation and test scores for each and every
# network, as well as the structure of the neural network having the best validation score. These dumps can
# then be used to plot the evolution of the validation and test scores (see below), or simply to recover the
# resulting neural network for your application.
agent.attach(bc.FindBestController(
    validationID=MG_two_storages_env.VALIDATION_MODE,
    testID=MG_two_storages_env.TEST_MODE,
    unique_fname=fname))

# All previous controllers control the agent during the epochs it goes through. However, we want to interleave
# a "validation epoch" between each training epoch ("one out of two epochs", hence the periodicity=2). We do
# not want these validation epochs to interfere with the training of the agent, which is handled by the
# TrainerController, the EpsilonController and the like, nor with its testing (see the next controller).
# Therefore, we disable these controllers for the whole duration of the validation epochs interleaved this
# way, using the controllersToDisable argument of the InterleavedTestEpochController. For each validation
# epoch, we also want to display the sum of all rewards obtained, hence showScore=True. Finally, we never want
# this controller to call the summarizePerformance method of MG_two_storages_env, hence summarizeEvery=-1.
agent.attach(bc.InterleavedTestEpochController(
    id=MG_two_storages_env.VALIDATION_MODE,
    epochLength=parameters.steps_per_test,
    controllersToDisable=[0, 1, 2, 3, 4, 7],
    periodicity=2,
    showScore=True,
    summarizeEvery=-1))
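# Once a run has finished, the score dumps written by FindBestController can be loaded back to plot the
# evolution of the validation and test scores. The sketch below assumes the dump lives at
# "scores/" + fname + "_scores.jldump" and stores the score lists under the keys "vs" and "ts"; check the
# files your run actually produces before relying on these names.
import joblib
import matplotlib.pyplot as plt

scores = joblib.load("scores/" + fname + "_scores.jldump")  # assumed path, see note above
plt.plot(scores["vs"], label="validation score")  # one point per validation epoch
plt.plot(scores["ts"], label="test score")        # one point per test epoch
plt.xlabel("epoch")
plt.ylabel("score")
plt.legend()
plt.savefig(fname + "_scores.png")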