def __init__(
    self,
    policy: TFPolicy,
    strength: float,
    gamma: float,
    encoding_size: int = 128,
    learning_rate: float = 3e-4,
    num_epoch: int = 3,
):
    """
    Builds the Curiosity intrinsic-reward generator.

    :param policy: The Learning Policy
    :param strength: Scale factor applied to the raw curiosity reward; the
        reward handed back is the unscaled reward times this value.
    :param gamma: Time-discount factor used for this reward signal.
    :param encoding_size: Width of the ICM's hidden encoding layer.
    :param learning_rate: Learning rate used when training the ICM.
    :param num_epoch: Number of passes over the training buffer per ICM update.
    """
    super().__init__(policy, strength, gamma)
    # Flags consulted by the surrounding reward-signal machinery.
    self.has_updated = False
    self.use_terminal_states = False
    self.num_epoch = num_epoch
    self.model = CuriosityModel(
        policy.model, encoding_size=encoding_size, learning_rate=learning_rate
    )
    # Ops/tensors fetched on each ICM training step.
    self.update_dict = {
        "forward_loss": self.model.forward_loss,
        "inverse_loss": self.model.inverse_loss,
        "update": self.model.update_batch,
    }
def __init__(
    self,
    policy: TFPolicy,
    strength: float,
    gamma: float,
    encoding_size: int = 128,
    learning_rate: float = 3e-4,
):
    """
    Builds the Curiosity intrinsic-reward generator.

    :param policy: The Learning Policy
    :param strength: Scale factor applied to the raw curiosity reward; the
        reward handed back is the unscaled reward times this value.
    :param gamma: Time-discount factor used for this reward signal.
    :param encoding_size: Width of the ICM's hidden encoding layer.
    :param learning_rate: Learning rate used when training the ICM.
    """
    super().__init__(policy, strength, gamma)
    self.has_updated = False
    self.use_terminal_states = False
    self.model = CuriosityModel(
        policy, encoding_size=encoding_size, learning_rate=learning_rate
    )
    # Ops/tensors fetched on each ICM training step.
    self.update_dict = {
        "curiosity_forward_loss": self.model.forward_loss,
        "curiosity_inverse_loss": self.model.inverse_loss,
        "curiosity_update": self.model.update_batch,
    }
    # Maps reported stat names to the update_dict entries they come from.
    self.stats_name_to_update_name = {
        "Losses/Curiosity Forward Loss": "curiosity_forward_loss",
        "Losses/Curiosity Inverse Loss": "curiosity_inverse_loss",
    }
def __init__(self, policy: TFPolicy, settings: CuriositySettings):
    """
    Builds the Curiosity intrinsic-reward generator.

    :param policy: The Learning Policy
    :param settings: CuriositySettings carrying the hyperparameters
        (encoding size, learning rate, ...) for this CuriosityRewardSignal.
    """
    super().__init__(policy, settings)
    self.has_updated = False
    self.use_terminal_states = False
    self.model = CuriosityModel(
        policy,
        encoding_size=settings.encoding_size,
        learning_rate=settings.learning_rate,
    )
    # Ops/tensors fetched on each ICM training step.
    self.update_dict = {
        "curiosity_forward_loss": self.model.forward_loss,
        "curiosity_inverse_loss": self.model.inverse_loss,
        "curiosity_update": self.model.update_batch,
    }
    # Maps reported stat names to the update_dict entries they come from.
    self.stats_name_to_update_name = {
        "Losses/Curiosity Forward Loss": "curiosity_forward_loss",
        "Losses/Curiosity Inverse Loss": "curiosity_inverse_loss",
    }