Python ReinforceLearning usage examples

Programming language: Python

Namespace/Package: RL_Module_Velocity_MIMO_SMDP

Class/Type: ReinforceLearning

Examples on hotexamples.com: 2

Python ReinforceLearning - 2 examples found. These are the top Python code examples for RL_Module_Velocity_MIMO_SMDP.ReinforceLearning, taken from open source projects.

Main methods

x1(4)

autosave(3)

ReinforceLearning(1)

action_selection(1)

eval(1)

eval_feedback(1)

matrix_update(1)

next_eval(1)

user_actions(1)

user_matrices(1)

user_states(1)

Example #1

    def reset(self):
        """
        Description
             -----
                Resets the PID input trajectory.

        """
        self.u = []


if __name__ == "__main__":

    # Build RL Objects
    rl = ReinforceLearning(discount_factor=0.95, states_start=300, states_stop=340, states_interval=0.5,
                           actions_start=-15, actions_stop=15, actions_interval=2.5, learning_rate=0.5,
                           epsilon=0.2, doe=1.2, eval_period=30)

    # Building states for the problem, states will be the tracking errors
    states = []

    rl.x1 = np.zeros(20)
    rl.x1[0:16] = np.linspace(-20, 0, 16)
    rl.x1[16:20] = np.linspace(2, 12, 4)

    rl.x2 = np.zeros(20)
    rl.x2[0:3] = np.linspace(-5, 5, 3)
    rl.x2[3:20] = np.linspace(6, 28, 17)

    for x1 in rl.x1:
        for x2 in rl.x2:
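
The excerpt is cut off inside the nested loop, so the loop body is not shown in the source. As a rough sketch only, assuming the loop collects every (x1, x2) pair into `states` (this is an assumption, not the project's confirmed code), the non-uniform state grid could be built like this:

```python
import itertools

import numpy as np

# Non-uniform tracking-error grids, with the same values as in the excerpt:
# x1 is dense between -20 and 0 and coarse between 2 and 12.
x1 = np.concatenate([np.linspace(-20, 0, 16), np.linspace(2, 12, 4)])
# x2 is coarse between -5 and 5 and denser between 6 and 28.
x2 = np.concatenate([np.linspace(-5, 5, 3), np.linspace(6, 28, 17)])

# ASSUMPTION: each discrete state is one (x1, x2) pair. The original loop
# body is truncated, so this pairing is a guess at its intent.
states = [(float(a), float(b)) for a, b in itertools.product(x1, x2)]

print(len(states))  # 20 * 20 = 400 states
```

`itertools.product` visits the pairs in the same order as the nested `for` loops, so the index of a state in the list matches what the loops would produce.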

Example #2

File: Distillate_Case.py Project: RuiNian7319/Woodberry_Distillation

    def reset(self):
        """
        Description
             -----
                Resets the PID input trajectory.

        """
        self.u = []


if __name__ == "__main__":

    # Build RL Objects
    rl = ReinforceLearning(discount_factor=0.95, states_start=300, states_stop=340, states_interval=0.5,
                           actions_start=-15, actions_stop=15, actions_interval=2.5, learning_rate=0.5,
                           epsilon=0.2, doe=1.2, eval_period=15, beta=0.04)

    # Building states for the problem, states will be the tracking errors
    states = []

    rl.x1 = np.zeros(41)
    rl.x1[0:41] = np.linspace(-20, 20, 41)

    # rl.x2 = np.zeros(28)
    # rl.x2[0:28] = np.linspace(-20, 20, 28)

    rl.x2 = np.zeros(1)

    for x1 in rl.x1:
        for x2 in rl.x2:
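
The constructor arguments in both examples (`states_start`, `states_stop`, `states_interval`, the matching `actions_*` values, and `epsilon`) suggest a tabular method over discretized states and actions with epsilon-greedy exploration. The internals of `ReinforceLearning` are not shown on this page, so the following is only a minimal sketch of that general technique. The helpers `make_grid` and `epsilon_greedy` are hypothetical names, and including the stop value in each grid is an assumption.

```python
import numpy as np


def make_grid(start, stop, interval):
    """Evenly spaced grid from start to stop, both included (hypothetical helper)."""
    n_points = int(round((stop - start) / interval)) + 1
    return np.linspace(start, stop, n_points)


def epsilon_greedy(q_row, epsilon, rng):
    """Return a random action index with probability epsilon, else the greedy index."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_row)))
    return int(np.argmax(q_row))


# Same numeric ranges as the constructor calls above.
state_grid = make_grid(300, 340, 0.5)   # 81 discrete states
action_grid = make_grid(-15, 15, 2.5)   # 13 discrete actions

# ASSUMPTION: one Q-value per (state, action) pair, initialised to zero.
q_table = np.zeros((len(state_grid), len(action_grid)))

rng = np.random.default_rng(0)
a_idx = epsilon_greedy(q_table[0], epsilon=0.2, rng=rng)
action = action_grid[a_idx]
```

With `epsilon=0.2`, roughly one choice in five is exploratory; setting `epsilon=0.0` makes the selection purely greedy.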