from typing import Any, Dict

import numpy as np
import torch

# Agent, Game, GameState, GameStateEncoder, Move, and MonteCarloTreeSearch
# are project-local classes, assumed importable from this package.


class AlphaZeroArgMaxAgent(Agent):
    """Plays the move favoured by a zero-temperature MCTS policy."""

    def __init__(self, game: Game, state_encoder: GameStateEncoder,
                 nn: torch.nn.Module, config: Dict[str, Any]) -> None:
        super().__init__()
        self.game = game
        self.mcts = MonteCarloTreeSearch(game=game, state_encoder=state_encoder,
                                         nn=nn, config=config)

    def select_move(self, state: GameState) -> Move:
        # With temperature=0 the policy concentrates all probability mass on
        # the most visited action, so sampling from it is effectively an argmax.
        policy = self.mcts.get_policy(state, temperature=0)
        move_index = np.random.choice(self.game.action_space_size, p=policy)
        return self.game.index_to_move(move_index)

    def reset(self) -> None:
        # Discard the accumulated search tree between games.
        self.mcts.reset()
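Why sampling still yields an argmax here: in typical AlphaZero implementations, a temperature of 0 turns the visit-count policy into a one-hot vector, so np.random.choice can only ever return the hot index. A small self-contained check; the 9-action policy below is fabricated for illustration:

import numpy as np

# Fabricated zero-temperature policy for a 3x3 board (9 actions):
# all probability mass on action 4, as get_policy(state, temperature=0) would produce.
policy = np.zeros(9)
policy[4] = 1.0

# Sampling from a one-hot distribution is deterministic.
samples = {np.random.choice(9, p=policy) for _ in range(1000)}
assert samples == {4}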
import os
import random

import torch

# TicTacToeGame, TicTacToePlayer, TicTacToeMove, TicTacToeStateEncoder,
# dual_resnet, MonteCarloTreeSearch, and config are assumed to come from
# the project's own modules.

config['device'] = 'cuda' if torch.cuda.is_available() else 'cpu'


def read_move(player: TicTacToePlayer) -> TicTacToeMove:
    # Expects two whitespace-separated integers, e.g. "1 2".
    x, y = input(f"{player.name} move: ").split()
    return TicTacToeMove(int(x), int(y))


if __name__ == '__main__':
    game = TicTacToeGame(config['game_size'])
    state_encoder = TicTacToeStateEncoder(config['device'])
    net = dual_resnet(game, config)
    mcts = MonteCarloTreeSearch(game=game, state_encoder=state_encoder,
                                nn=net, config=config)
    # map_location keeps a CUDA-trained checkpoint loadable on CPU-only machines.
    net.load_state_dict(torch.load(os.path.join('pretrained', 'ttt_dualres_comp.pth'),
                                   map_location=config['device']))
    # net.load_state_dict(torch.load(os.path.join(config['log_dir'], 'best.pth')))
    net.eval()

    agent = AlphaZeroArgMaxAgent(game, state_encoder, net, config)
    agent_role = random.choice([TicTacToePlayer.X, TicTacToePlayer.O])

    while not game.is_over:
        game.show_board()
        # print(f"current state score by eval func: {agent.eval_fn(game.state, agent.player)}")
        # The human enters a move whenever it is not the agent's turn.
        if game.current_player != agent_role:
            move = read_move(game.current_player)
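The script is cut off above, mid-loop. A hedged sketch of how it presumably continues: agent.select_move comes from AlphaZeroArgMaxAgent above, while game.play is an assumed name for the Game method that applies a move, since the real API is not shown in this excerpt.

        else:
            # On the agent's turn, take the move favoured by zero-temperature MCTS.
            move = agent.select_move(game.state)
        game.play(move)  # assumed: applies the move and switches current_player

    game.show_board()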