# Simulate the next time step of the environment
next_state, Reward, Done, Info = env.step(
    control_input, t,
    setpoint=[set_point1, set_point2],
    noise=False, economics='mixed', w_y1=1, w_y2=0)

# RL Feedback
if t == rl.eval_feedback and t > 150:
    rl.matrix_update(
        action_index, Reward, state,
        [env.y[t, 0] - set_point1, env.y[t, 1] - set_point2], 5)
    tot_reward = tot_reward + Reward

rlist.append(tot_reward)

# Autosave Q, T, and NT matrices
rl.autosave(episode, 100)

if episode % 10 == 0:
    print("Episode {} | Current Reward {}".format(episode, tot_reward))

env.plots(timestart=50, timestop=6000)

# plt.scatter(PID1.u[40:env.y.shape[0]], env.y[40:, 0])
# plt.show()
# plt.scatter(PID2.u[40:env.y.shape[0]], env.y[40:, 1])
# plt.show()
# Generate input tuple
control_input = np.array([[input_1, input_2]])

# Simulate the next time step of the environment
next_state, Reward, Done, Info = env.step(
    control_input, t,
    setpoint=[set_point1, set_point2],
    noise=False, economics='distillate')

# RL Feedback
if t == rl.eval_feedback:
    rl.matrix_update(
        action_index, Reward, state,
        env.y[t, :] - np.array([set_point1, set_point2]), 5)
    tot_reward = tot_reward + Reward

rlist.append(tot_reward)

# Autosave Q, T, and NT matrices
rl.autosave(iteration, 250)

env.plots(timestart=50, timestop=5950)

# plt.scatter(PID1.u[40:env.y.shape[0]], env.y[40:, 0])
# plt.show()
# plt.scatter(PID2.u[40:env.y.shape[0]], env.y[40:, 1])
# plt.show()