def testAllPlayer():
    # Tests board sizes from 3x3 up to 6x6
    for i in range(3, 7):
        b = Board(i)
        player1list = [
            MiniMaxPlayer(),
            RandomPlayer(),
            QLearningPlayer("policy_" + str(i) + "x" + str(i) +
                            "_100000_round_against_RandomPlayer_firstplayer"),
            QLearningPlayer(),
            RandomForestClassifierPlayer(
                "DTC_" + str(i) + "x" + str(i) + "_firstplayer_X",
                "DTC_" + str(i) + "x" + str(i) + "_firstplayer_Y")
        ]
        player2list = [
            MiniMaxPlayer(),
            RandomPlayer(),
            QLearningPlayer(),
            QLearningPlayer(),
            RandomForestClassifierPlayer(
                "DTC_" + str(i) + "x" + str(i) + "_secondplayer_X",
                "DTC_" + str(i) + "x" + str(i) + "_secondplayer_Y")
        ]
        for player1 in player1list:
            for player2 in player2list:
                print("board_size: " + str(i))
                print("player1: " + str(player1.name))
                print("player2: " + str(player2.name))
                testGame(player1, player2, b, 100)
                print()
def test_RandomPlayer_initialize():
    """Check that the derived class properly uses the base class."""
    player_1 = RandomPlayer('Julie')
    player_1.initialize('blue', ['green', 'darkgreen'])
    assert player_1.name == 'Julie'
    assert player_1.color == 'blue'
    assert player_1.other_colors == ['green', 'darkgreen']
def replace_with_random_player(self):
    from randomPlayer import RandomPlayer
    old_player = self.player
    self.player = RandomPlayer(old_player.name)
    self.player.color = old_player.color
    self.player.other_colors = old_player.other_colors
    self.player.initialized = old_player.state
def load_players(knot_name):
    knotter = Agent("Knotter", q_init=0)
    knotter.loadPolicy("./Policies/policy_" + knot_name + "_Knotter")
    unknotter = Agent("Unknotter", q_init=0)
    unknotter.loadPolicy("./Policies/policy_" + knot_name + "_Unknotter")
    random_player = RandomPlayer("Random")
    players = [knotter, unknotter, random_player]
    return list(combinations(players, 2))
def fitnessFunction(self, population, board_size):
    # What happens if pop_size is odd??
    '''
    if self.pop_size % 2 == 0:
        game_number = self.pop_size / 2
    else:
        game_number = (self.pop_size - 1) / 2
    '''
    game_number = self.pop_size
    fitness = []
    for i in range(int(game_number)):
        board = Board(board_size)
        player1 = QLearningPlayer(board)
        player2 = RandomPlayer(board)
        cont = Controller(player1, player2, board)
        cont.player1.alpha = population[i][0]
        cont.player1.exp_rate = population[i][1]
        cont.player1.decay_gamma = population[i][2]
        for _ in itertools.repeat(None, self.learnMatch_number):
            cont.trainLoop()
        cont.player1.exp_rate = 0.00
        '''
        cont.player2.alpha = population[i + 1][0]
        cont.player2.exp_rate = population[i + 1][1]
        cont.player2.decay_gamma = population[i + 1][2]
        '''
        player1Win = 0
        player2Win = 0
        draw = 0
        for _ in itertools.repeat(None, self.testMatch_number):
            winner = cont.trainLoop()
            if winner == cont.player1:
                player1Win += 1
            elif winner == cont.player2:
                player2Win += 1
            else:
                draw += 1
        fitness.append(player1Win - player2Win)
        '''
        fitness.append(player2Win - player1Win)
        '''
    '''
    if self.pop_size % 2 != 0:
        board = Board(board_size)
        cont = Controller(PlayerEnum.QLearningPlayer, PlayerEnum.QLearningPlayer, board)
        cont.player1.alpha = population[i][0]
        cont.player1.exp_rate = population[i][1]
        cont.player1.decay_gamma = population[i][2]
        cont.player2.alpha = population[i][0]
        cont.player2.exp_rate = population[i][1]
        cont.player2.decay_gamma = population[i][2]
        player1Win = 0
        player2Win = 0
        draw = 0
        for _ in itertools.repeat(None, self.match_number):
            winner = cont.trainLoop()
            if winner == cont.player1:
                player1Win = player1Win + 1
            elif winner == cont.player2:
                player2Win = player2Win + 1
            else:
                draw = draw + 1
        fitness.append(player1Win - player2Win)
    '''
    return fitness
def benchMark(self, tag0, player):
    self._eval("RP %s" % tag0, RandomPlayer(), player)
    self._eval("MMP %s" % tag0, MMPlayer(), player)
    if self._outFn is not None:
        with open(self._outFn, 'w') as f:
            d = {'player': player.sDict()}
            f.write(json_tricks.dumps(d))
    self._iBench += 1
def trainingClassifierPlayer(self, X_file, Y_Xmoves_file, Y_Ymoves_file,
                             start_player, boardSize, classifier_num=1):
    b = Board(boardSize)
    X = load(X_file)
    Y_Xmoves = load(Y_Xmoves_file)
    Y_Ymoves = load(Y_Ymoves_file)
    if start_player:
        player1 = RandomForestClassifierPlayer(None, None, X, Y_Xmoves,
                                               Y_Ymoves, classifier_num)
        player2 = RandomPlayer()
        testGame(player1, player2, b, 1000)
        print("saving...")
        player1.saveModels()
        print("saved")
    else:
        player2 = RandomForestClassifierPlayer(None, None, X, Y_Xmoves,
                                               Y_Ymoves, classifier_num)
        player1 = RandomPlayer()
        testGame(player1, player2, b, 1000)
        print("saving...")
        player2.saveModels()
        print("saved")
def getPlayer(nn):
    if nn.endswith(".json"):
        return NNPlayer(json_tricks.loads(open(nn).read())['player'])
    elif nn == 'mm':
        return MMPlayer()
    elif nn[:4] == 'mcts':
        return MCTSPlayer(nPlay=int(nn[4:]), maxPlies=9999, bNegamax=True)
    elif nn[:2] == 'mc':
        return MCPlayer(nPlay=int(nn[2:]))
    elif nn == 'rp':
        return RandomPlayer()
    elif nn == 'hu':
        return HumanPlayer()
    elif nn[:2] == 'oa':
        return OmniscientAdversary(nPlay=int(nn[2:]))
    else:
        raise Exception("Unsupported player [%s]" % nn)
def simulate(self, node):
    state = copy.deepcopy(node.state)
    player_count = len(self.player_order)
    # players = [Simu_Player(i, self.agent.using_reward) for i in range(player_count)]
    players = [RandomPlayer(i) for i in range(player_count)]
    act_id = node.act_id
    game_continuing = True
    for plr in state.players:
        plr.player_trace.StartRound()
    while game_continuing:
        while state.TilesRemaining():
            selected = players[act_id].SelectMove(None, state)
            state.ExecuteMove(act_id, selected)
            act_id = act_id + 1 if act_id + 1 < player_count else 0
        state.ExecuteEndOfRound()
        # Is it the end of the game?
        for i in self.player_order:
            plr_state = state.players[i]
            completed_rows = plr_state.GetCompletedRows()
            if completed_rows > 0:
                game_continuing = False
                break
        # Set up the next round
        if game_continuing:
            state.SetupNewRound()
            act_id = self.player_order[0]
    reward = [0] * player_count
    for i in range(player_count):
        state.players[i].EndOfGameScore()
        reward[i] = state.players[i].score
    return reward
def start_game(difficulty):
    global theGame, AI, start, selected, moves, gameInProgress
    gameInProgress = True
    theGame = game.Game()
    moves = theGame.moves_to_dict()
    selected = (-1, -1)
    start = random.choice((1, 2))
    AI = [RandomPlayer(start),
          NegamaxPlayer2QEKM(start, d=3),
          NegamaxMCTSPlayerB(start, d2=10, t=5, b=10)][difficulty]
    update_board()
    window.update()
    turnLabel.configure(text="Your turn")
    if start == 1:
        turnLabel.configure(text="Opponent's turn")
    window.update()
    # Pass the callback itself; calling ai_move() here would run it
    # immediately instead of after the 500 ms delay.
    window.after(500, ai_move)
class MCTSPlayer:
    def __init__(self, nPlay, maxPlies, bNegamax, cUct=1 / np.sqrt(2),
                 bDump=False):
        self._nPlay = nPlay
        self._maxPlies = maxPlies
        if bNegamax:
            self._uct = UCTNegamax(cUct)
        else:
            self._uct = UCT(cUct)
        self._cUct = cUct
        self._bNegamax = bNegamax
        self._bDump = bDump
        self._uctMove = UCT(0)
        self._rp = RandomPlayer()
        self._nprand = np.random.RandomState()
        self._root = None

    def __str__(self):
        return ("%s nPlay = %d maxPlies = %d bNegamax = %s cUct = %.4f" %
                (self.__class__.__name__, self._nPlay, self._maxPlies,
                 self._bNegamax, self._cUct))

    def _simulate(self, node):
        # "A simulation is run from the new node(s) according to the
        # default policy to produce an outcome."
        return play.playRest(self._rp, self._rp, node.ttt.clone(),
                             False, 99999)[0]

    def setSeed(self, seed):
        self._nprand.seed(seed)
        self._rp.setSeed(seed + 1)

    def move(self, ttt):
        if self._root is not None:
            self._root = self._root.findBoard(ttt)
        if self._root is None:
            self._root = Node(self._nprand, ttt, 1, maxPlies=self._maxPlies)
        marker = ttt.whoseTurn()
        for _ in range(self._nPlay):
            nodeLeaf = self._root.select(self._uct)
            if nodeLeaf is not None:
                nodeSim = nodeLeaf.expand()
                if nodeSim is not None:
                    # print("START:", nodeSim.maxPlies, nodeSim.move)
                    w = self._simulate(nodeSim)
                    if w == ttt.whoseTurn():
                        score = 1
                    elif w == game.Draw:
                        score = .5
                    else:
                        score = 0
                    # print("SCORE:", marker, w, score)
                    nodeSim.backpropagate(score)
        if self._bDump:
            self._root.dump()
        self._root = self._root.bestChild(self._uctMove)
        return self._root.move

    def tests(self):
        self._root.check_parentage()
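The `UCT` objects passed to `select()` above score children during tree descent. As a point of reference, here is a minimal self-contained sketch of the standard UCT formula (mean win rate plus an exploration bonus). The function name `uct_score` and its parameters are illustrative, not taken from this codebase; the project's `UCT`/`UCTNegamax` classes are assumed to compute something equivalent.

```python
import math

def uct_score(child_wins, child_visits, parent_visits, c=1 / math.sqrt(2)):
    """Standard UCT: exploitation (mean reward) plus an exploration
    bonus that shrinks as the child accumulates visits."""
    if child_visits == 0:
        return float("inf")  # unvisited children are expanded first
    return (child_wins / child_visits
            + c * math.sqrt(math.log(parent_visits) / child_visits))
```

With `c = 0` (as in `self._uctMove = UCT(0)` above) the score reduces to the plain win rate, which is why a zero-exploration UCT is used for the final move choice.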
def simulate(self, node):
    state = copy.deepcopy(node.state)
    player_count = len(self.player_order)
    # players = [Simu_Player(i, self.agent.using_reward) for i in range(player_count)]
    players = [RandomPlayer(i) for i in range(player_count)]
    act_id = node.act_id
    while state.TilesRemaining():
        if self.log:
            print(act_id)
            print('id', act_id)
            print('before')
            print(state.detail_str())
        move = players[act_id].SelectMove(None, state)
        state.ExecuteMove(act_id, move)
        act_id = act_id + 1 if act_id + 1 < player_count else 0
    if self.log:
        print('simulate over')
    state.ExecuteEndOfRound()
    reward = [0] * player_count
    for i, plr in enumerate(state.players):
        reward[i] = state.players[i].score
    # print(state.detail_str())
    # print(reward)
    game_continuing = True
    for i in range(player_count):
        plr_state = state.players[i]
        completed_rows = plr_state.GetCompletedRows()
        if completed_rows > 0:
            game_continuing = False
            break
    if not game_continuing:
        start = time.time()
        for i in range(player_count):
            state.players[i].EndOfGameScore()
            reward[i] = state.players[i].score
        self.time_monitor['simulate p'] += (time.time() - start)
    else:
        for i, plr in enumerate(state.players):
            expect_score = eval(self.agent.using_reward)(
                state, i, self.player_order).get_round_expection()
            # start = time.time()
            # row_score = eval(self.agent.using_reward)(state, i, self.player_order).get_score(2, is_row=True)
            # self.time_monitor['row'] += (time.time() - start)
            # start = time.time()
            # column_score = eval(self.agent.using_reward)(state, i, self.player_order).get_score(7, is_column=True)
            # self.time_monitor['c'] += (time.time() - start)
            # start = time.time()
            # set_score = eval(self.agent.using_reward)(state, i, self.player_order).get_score(10)
            # self.time_monitor['s'] += (time.time() - start)
            # start = time.time()
            # left_score = eval(self.agent.using_reward)(state, i, self.player_order).get_left_score()
            # self.time_monitor['l'] += (time.time() - start)
            reward[i] = state.players[i].score + expect_score
    return reward
from game import Game
from dealer import Dealer
from randomPlayer import RandomPlayer
from modestPlayer import ModestPlayer
from nodoPlayer import NodoPlayer
from tablePlayer import TablePlayer
import sys

# sys.stdout = open("log.txt", "w")

if __name__ == "__main__":
    game = Game()
    dealer = Dealer()
    player_a = RandomPlayer()
    player_b = ModestPlayer()
    player_c = NodoPlayer()
    player_d = TablePlayer(dealer)
    game.set_dealer(dealer)
    # game.set_player(0, player_a)
    # game.set_player(1, player_b)
    # game.set_player(2, player_c)
    game.set_player(3, player_d)
    dealer.cards = [10, 10]
    player_d.cards = [1, 10]
    player_d.init_coin(1)
    game.win_lose()
        if len(self.players) < 4:
            result = f"""
Game is finished:
President: {self.ranks['president'].name}
Scum: {self.ranks['scum'].name}
"""
            print(result)
        else:
            result = f"""
Game is finished:
President: {self.ranks['president'].name}
Vice-President: {self.ranks['vice_president'].name}
High-Scum: {self.ranks['high_scum'].name}
Scum: {self.ranks['scum'].name}
"""
            print(result)
        if not (ans := input('Play again? (y/n): ')) or ans == 'n':
            break


if __name__ == '__main__':
    players = [RandomPlayer("Player1"), RandomPlayer("Player2")]
    players.append(RandomPlayer("Player3"))
    players.append(RandomPlayer("Player4"))
    players.append(RandomPlayer("Player5"))
    session = President(players)
    session.play()
        evs = [Evaluation() for _ in range(len(players))]
        tasks = []
        for iPlayer, player in enumerate(players):
            for iRound in range(self._nRounds):
                tasks.append(
                    (iPlayer, self._seeds[iRound % len(self._seeds)], player))
        for i, r in enumerate(self._pool.runTasks(tasks)):
            (iPlayer, w, W, D, L) = r
            evs[iPlayer].update(1, w, W, D, L)
        for ev in evs:
            ev.done()
        return evs

    def stop(self):
        self._pool.stop()


if __name__ == "__main__":
    from randomPlayer import RandomPlayer
    from mmPlayer import MMPlayer
    ev = Evaluator(RandomPlayer(), 100, nWorkers=2)
    players = [MMPlayer() for _ in range(1)]
    print(ev.evaluate(players)[0])
    ev.stop()
from game import Game  # needed for Game() and MemPlayer(game) below
from dealer import Dealer
from randomPlayer import RandomPlayer
from modestPlayer import ModestPlayer
from nodoPlayer import NodoPlayer
from tablePlayer import TablePlayer
from doublePlayer import DoublePlayer
from memPlayer import MemPlayer
import sys

sys.stdout = open("log.txt", "w")

if __name__ == "__main__":
    game = Game()
    dealer = Dealer()
    random_player = RandomPlayer()
    modest_player = ModestPlayer()
    nodo_player = NodoPlayer()
    table_player = TablePlayer(dealer)
    double_player = DoublePlayer(dealer)
    mem_player = MemPlayer(game)
    game.set_dealer(dealer)
    game.set_player(0, random_player)
    game.set_player(1, modest_player)
    game.set_player(2, nodo_player)
    game.set_player(3, table_player)
    game.set_player(4, double_player)
    game.set_player(5, mem_player)
    for i in range(100):
class OmniscientAdversary:
    def __init__(self, nPlay):
        self._rp = RandomPlayer()
        self._rand = random.Random()
        self._epsSame = 1e-6
        self._nPlay = nPlay

    def __str__(self):
        return "%s nPlay = %d" % (self.__class__.__name__, self._nPlay)

    def reconfigure(self, nn):
        self._nn = nn

    def setSeed(self, seed):
        if seed is None:
            self._rp.setSeed(None)
            self._rand.seed(None)
        else:
            self._rp.setSeed(seed)
            self._rand.seed(seed + 1)

    def move(self, ttt):
        bestQ = -1e99
        qs = []
        vm = ttt.validMoves()
        for m in vm:
            q = self._moveQuality(ttt, m)
            if q > bestQ:
                bestQ = q
            qs.append(q)
        bestMoves = []
        for iMove, q in enumerate(qs):
            if abs(q - bestQ) < self._epsSame:
                bestMoves.append(vm[iMove])
        # Use the seeded instance RNG rather than the module-level
        # random, so setSeed() makes move selection reproducible.
        return self._rand.choice(bestMoves)

    def xx_move(self, ttt):
        bestQ = -1e99
        qs = []
        vm = ttt.validMoves()
        for m in vm:
            q = self._moveQuality(ttt, m)
            if q > bestQ:
                bestQ = q
            qs.append(q)
        qs = np.array(qs)
        pMove = qs - qs.min() + 1e-6
        pMove /= pMove.sum()
        return np.random.choice(vm, p=pMove)

    def _moveQuality(self, ttt, m):
        scores = []
        if ttt.whoseTurn() == game.X:
            pX = self._rp
            pO = self._nn
        else:
            pX = self._nn
            pO = self._rp
        nPlay = self._nPlay
        for _ in range(nPlay):
            scores.append(play.simGame(pX, pO, ttt, m))
        scores = np.array(scores)
        return scores.mean()
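`move()` above implements an epsilon-tolerant argmax: it collects every move whose quality lies within `_epsSame` of the best and breaks ties at random, so floating-point noise from the Monte-Carlo estimates does not bias move choice. A minimal standalone sketch of that pattern (the function name `best_move_within_eps` is illustrative, not from this codebase):

```python
import random

def best_move_within_eps(moves, qualities, eps=1e-6, rng=random):
    """Pick uniformly among all moves whose quality is within eps of
    the maximum, so near-identical scores are treated as ties."""
    best = max(qualities)
    ties = [m for m, q in zip(moves, qualities) if abs(q - best) < eps]
    return rng.choice(ties)
```

Passing a seeded `random.Random` instance as `rng` gives reproducible tie-breaking, mirroring how the class above routes selection through `self._rand`.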
                i = 3 * r + c
                if board[i] == game.Empty:
                    t = '.'
                else:
                    t = board[i].lower()
                outBoard.append(t)
        return Board.fromstring(''.join(outBoard))

    def setSeed(self, seed):
        self._rand = random.Random(seed)

    def move(self, ttt):
        (r, c) = self._rand.choice(
            ai.evaluate(self._convertBoard(ttt.board()),
                        ttt.whoseTurn().lower()).positions)
        iSpace = 3 * (r - 1) + c - 1
        for m in ttt.validMoves():
            if m.iSpace == iSpace:
                return m
        assert False, "iSpace = %d [%s]" % (
            iSpace, [str(m) for m in ttt.validMoves()])


if __name__ == "__main__":
    import numpy as np
    import play
    from ticTacToe import TicTacToe
    from randomPlayer import RandomPlayer
    print(play.play(TicTacToe, MMPlayer(), RandomPlayer(), bShow=True))
    print(twoName + " was %.2f%% off just choosing randomly" %
          (((result[1] / numTests) - exp2) * 100))
    print("Draw was %.2f%% off just choosing randomly" %
          ((((numTests - result[0] - result[1]) / numTests) - expd) * 100))


def createTest(playerOne, playerTwo, numTests=100):
    oneName = playerOne.__class__.__name__
    twoName = playerTwo.__class__.__name__
    print("Running " + str(numTests) + " tests of " + oneName +
          " against " + twoName)
    with suppress_stdout():
        result = runTest(numTests, playerOne, playerTwo)
    print(str(numTests) + " tests complete!")
    print(oneName + " won " + str(result[0]) + " times")
    print(twoName + " won " + str(result[1]) + " times")
    print("A draw occurred " + str(numTests - result[0] - result[1]) + " times")
    printStats(oneName, twoName, result)
    print("-------------------------")


# createTest(GroverPlayer(oneVal), RandomPlayer(twoVal), 20)
createTest(RandomPlayer(oneVal), GroverPlayer(twoVal), 2)
# createTest(RandomPlayer(oneVal), RandomPlayer(twoVal))
def test_RandomPlayer_playTurn():
    player_1 = RandomPlayer('Julie')
    player_1.initialize('blue', ['green', 'red'])
    board = Board([player_1])
    board.tiles[0][0] = Tile(0, [[0, 6], [1, 2], [3, 4], [5, 7]])
    player_1.place_pawn(board)
    player_1.position = Position(4, 0)

    # In this scenario, both these tiles cause elimination
    tile_1 = Tile(1, [[0, 1], [2, 3], [4, 5], [6, 7]])
    tile_2 = Tile(2, [[0, 7], [1, 2], [3, 4], [5, 6]])
    hand = [tile_1, tile_2]
    player_1.tiles_owned = hand
    just_played_id = player_1.play_turn(board, hand, 33).identifier
    assert just_played_id == 1 or just_played_id == 2

    # tile_3 in its current orientation will cause elimination,
    # but after one rotation will be legal
    tile_3 = Tile(3, [[0, 7], [1, 2], [3, 6], [4, 5]])
    hand = [tile_1, tile_2, tile_3]
    player_1.tiles_owned = hand
    tile_played = player_1.play_turn(board, hand, 33)
    assert tile_played.identifier == 3
    assert tile_played.paths == [[0, 5], [1, 2], [3, 4], [6, 7]]

    # tile_3 in its current orientation will cause elimination,
    # but after four rotations will be legal
    tile_3 = Tile(3, [[0, 1], [2, 7], [3, 4], [5, 6]])
    hand = [tile_1, tile_2, tile_3]
    player_1.tiles_owned = hand
    tile_played = player_1.play_turn(board, hand, 33)
    assert tile_played == tile_3
    assert tile_played.identifier == 3
    assert tile_played.paths == [[0, 5], [1, 2], [3, 4], [6, 7]]