Python SoccerEnvの例

プログラミング言語: Python

名前空間/パッケージ名: soccer

クラス/型: SoccerEnv

hotexamples.comのコード掲載数: 7

Python SoccerEnv - 7件のコード例が見つかりました。すべてオープンソースプロジェクトから抽出されたPythonのsoccer.SoccerEnvの実例で、最も評価が高いものを厳選しています。コード例の評価を行っていただくことで、より質の高いコード例が表示されるようになります。

よく使われるメソッド

表示非表示

SoccerEnv(5)

encode_action(5)

encode_state(3)

decode_action(2)

step(2)

close(1)

decode_reward(1)

done(1)

render(1)

reset(1)

reward(1)

transitions(1)

コード例 #1

0

ファイルを表示

ファイル: tests.py プロジェクト: GMDennis/soccerenv

def test_transition2():
    initial_state = SoccerEnv.encode_state(0,3,0,1,False)
    action = SoccerEnv.encode_action(SoccerEnv.Action.S, SoccerEnv.Action.N)
    env = SoccerEnv()
    transitions = env.P[initial_state][action]
    expected_next_state = SoccerEnv.encode_state(1,3,0,1,False)
    for prob, next_state, reward, done in transitions:
        assert next_state == expected_next_state
        assert reward == 0
        assert done == 0

コード例 #2

0

ファイルを表示

ファイル: tests.py プロジェクト: GMDennis/soccerenv

def test_render():
    env = SoccerEnv()
    env.render()
    action = env.encode_action(SoccerEnv.Action.Stick, SoccerEnv.Action.Stick)
    env.step(action)
    env.render()
    return

コード例 #3

0

ファイルを表示

ファイル: tests.py プロジェクト: GMDennis/soccerenv

def test_reward():
    assert SoccerEnv.reward((0, 2, 0, 1, 1)) == 0
    assert SoccerEnv.reward((0, 3, 0, 1, 1)) == -100
    assert SoccerEnv.reward((0, 0, 0, 1, 1)) == 100
    assert SoccerEnv.reward((0, 0, 0, 1, 0)) == 0
    assert SoccerEnv.reward((0, 0, 0, 1, 0)) == 0
    assert SoccerEnv.reward((0, 0, 0, 0, 0)) == 100
    assert SoccerEnv.reward((0, 0, 0, 3, 0)) == -100

コード例 #4

0

ファイルを表示

ファイル: tests.py プロジェクト: GMDennis/soccerenv

def test_done():
    assert SoccerEnv.done((0, 2, 0, 1, 1)) is False
    assert SoccerEnv.done((0, 3, 0, 1, 1))
    assert SoccerEnv.done((0, 0, 0, 1, 1))
    assert SoccerEnv.done((0, 0, 0, 1, 0)) is False
    assert SoccerEnv.done((0, 0, 0, 1, 0)) is False
    assert SoccerEnv.done((0, 0, 0, 0, 0))
    assert SoccerEnv.done((0, 0, 0, 3, 0))

コード例 #5

0

ファイルを表示

ファイル: tests.py プロジェクト: GMDennis/soccerenv

def test_transitions():
    transitions = SoccerEnv.transitions(0, 2, 0, 1, 0, SoccerEnv.Action.W, SoccerEnv.Action.Stick)
    expected_states = set([SoccerEnv.encode_state(0, 2, 0, 1, 0), SoccerEnv.encode_state(0, 2, 0, 1, 1)])
    assert len(transitions) == 2
    for next_state, reward, done in transitions:
        assert next_state in expected_states
        assert reward == 0
        assert done == 0

    state = SoccerEnv.encode_state(0, 2, 0, 1, 0)
    action = SoccerEnv.encode_action(SoccerEnv.Action.W, SoccerEnv.Action.Stick)
    env = SoccerEnv()
    transitions = env.P[state][action]
    assert len(transitions) == 2
    for prob, next_state, reward, done in transitions:
        assert abs(prob - 0.5) < 0.001
        assert next_state in expected_states
        assert reward == 0
        assert done == 0

コード例 #6

0

ファイルを表示

ファイル: tests.py プロジェクト: GMDennis/soccerenv

def test_action_encode():
    env = SoccerEnv()
    action1, action2 = 1, 2
    x = env.encode_action(1,2)
    assert (action1, action2) == env.decode_action(x)

コード例 #7

0

ファイルを表示

ファイル: scratch.py プロジェクト: cesleem/repro-papers-reinforcement-learning

#%% md

## Step 0: Soccer Env Setup

#%%

import sys
sys.path.insert(
    0,
    '/Users/cesleemontgomery/masters/cs7642/projects/7642Fall2019cmontgomery38/project3'
)
from soccer import SoccerEnv

#%%

env = SoccerEnv()

# n_episodes_MAX = 8*10**5 #10*10**5
# steps_MAX = 100
# verbose = False
# alpha = 0.01
#
# num_states = env.observation_space.n
# num_actions = env.action_space.n
# num_individual_actions = int(
#     math.sqrt(env.action_space.n))  # individual player action space is 5, joint action space is 25
#
num_states = env.observation_space.n
num_actions = env.action_space.n
num_individual_actions = int(
    math.sqrt(env.action_space.n)