Coding Challenge 154: Tic Tac Toe AI with Minimax Algorithm

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
oh hello and welcome to a Cody challenge tic-tac-toe Oh with the minimax algorithm that's why I'm here because I made this other coding challenge tic-tac-toe where I created a kind of a big it was kind of a mess if I'm being perfectly honest but I made a working version of the tic-tac-toe game that just played with two players picking random spots so since then and you can check one of my live streams if you want to find where I did this I made some adjustments to it so that I could as a human being play the game so right now I'm going to play the random computer picker I'm gonna I'm gonna go here and then I'm gonna block X and I'm gonna go here I win Owens so what I have the adjustment that I made is that I added a mouse pressed function where I find where did I click and I put my human variable which is the letter O onto the board and then I called next turn where next turn picks a random spot in the board and makes that the AI spot or X's spot so the whole point of this video is for me to implement something called minimax which is an algorithm a search algorithm if you will to find the optimal next move for the a I mean it could find it for myself and then I could implement it but the idea is I want this player the computer player to beat me at the game or at least tie to play a perfect game and if you want to learn more about the minimax algorithm I would suggest to resources that you can look at that I actually looked at before beginning this coding challenge haven't programmed this before but I watched this video by Sebastian leg that explains the minimax algorithm also something called alpha beta pruning which I'm not going to implement but could be a good exercise next step for you and I also found this article on the geeks for geeks org website which has a three part series about the minimax algorithm and how to apply it to tic-tac-toe so those resources are probably your best bet for learning about the theory of the minimax algorithm and how it can be applied to a wide variety of scenarios but I'm going to look at the specifics of tic-tac-toe which is a great starting example because it's a simple enough game that I could actually diagram out all of the possibilities which even if that's computationally possible is very hard to diagram and there's lots of scenarios where couldn't even compute all the possible outcomes and chess is an example of that so minimax is typically visualized as a tree so the root of the tree is the current state of the game board so I've drawn a configuration here of the tic-tac-toe board midway through the game because I don't want to start at the beginning where there's so many possibilities that I couldn't diagram it so right now it's X's turn and X has three possible moves so I could express that in a tree diagram as move one move two and move three so let's take a look at what I mean by that I'm going to use a blue marker so you can see the new moves so let's say one possible move is X going here let me diagram out the next two possible moves another possible move is X going here and another possible move move is X going here so we need to look at eight each of these and determine is the game over or is it continuing so only in this case is the game over and in this case the game is over and X has won so I'm gonna mark this green for the state of X having won the game now it's o's turn and for each of these spots o could make has two possible options Oh could go here ro could go here Oh could go here RR Oh could go here and look at these in this case and this case Oh has one remember we're trying to solve for the optimal move for X X is making the first turn so this means I can mark these were Owens as red I was like those are bad outcomes so while these are not terminal states there's only one move possible for X so I could draw an arrow and put that down here but ultimately I can just consider this as x has to go here so these you could consider as terminal and in which case X will win now you see that I visualized all the possible outcomes of the tic-tac-toe game based on this starting configuration so how does X pick which path to go down where we can see here that the goal is for X to win the game to get to here here or there how does it do that and the way that it does by the way spoiler alert this is the move it should pick it should just win instantly but the idea of the minimax algorithm is if we can score every possible outcome those scores can ripple back up the tree and help X decide which way to go to find the highest possible score then they think about tic-tac-toe is there's not a range of scores the end points are either you won you lost are you tied I can say if X wins it's a score of +1 if a wins it's a score of negative 1 and if it's a tie it's a score of 0 so in other words we can give this a score of plus 1 this is score of plus 1 this is score of negative 1 this is score of negative 1 and this a score of plus 1 but we've got an issue I'm trying to pick between this option this option in this option these two don't have a score how do I pick their score well I can pick their score based on which one of these would be picked but who's picking these this is why the algorithm is called me max because we have two players each trying to make an optimal score X is what's known as the maximizing player so to go from here to here it's X's turn and X is maximizing but here it's now O's turns o wants to win X can assume that o is going to play their optimal smooth now you could have a situation where your player doesn't play optimally and that's something you might have to consider in terms of game theory but in this case of the algorithm we can assume that o is going to make the best possible move I mean if it makes a worse move that's only better for us anyway so it's all gonna work out so o is what's referred to as the minimizing player the minimizing player so o has the option of picking negative 1 or plus 1 negative 1 or +1 which one is o going to picks pick as the minimizing player the option to win so we can then take the best outcome for O the minimizing outcome for O and pass that back up here's the score and the same thing here now X is the maximizing player which one of these is gonna pick negative 1 plus 1 or negative 1 these are negative 1 because even though it could win here if X goes this way o is gonna win if X goes this way oh it's gonna win if a plays optimally so this is the way to go this scenario that I picked might not be the best demonstration of the idea in fact there's no ties here there's only two turns being played so you may be what you might want to do is pause this video and do the same exact diagram but have X and O with two starting positions and see if you can diagram that whole thing out yourself so how do we implement this in code so that your first clue should be the fact that this is a tree diagram and each time you see a tree diagram typically speaking a recursive algorithm a function that calls itself is something that will apply in this case is no different I want to call minimax on this board position try and then loop through all the possible moves and call minimax on each one of these board positions loop through all the possible moves if I ever get to an end point return the score and have that score backtrack up or ripple back up the chain so let's see if we can write the code to do this so this here is the place where I want to write the new code currently a move is chosen by picking a random available spot and I no longer want to do that so I'm gonna I'm actually gonna change the name of this function to best move and I don't need to worry about this what's available I just want to actually look at all the possible spots so basically I just want to know is the spot available if the spot is available I want to try going there so I'm going to say board i j is the AI player and then i want to call minimax on the board why do I want to call minimax because I want to get what is the score I want minimax with to return to me the score for this particular move now it's gonna have to recursively check these two spots but we'll get that we're getting there ultimately though I need to figure out which one was the best so I need to keep track by saying like the best score is negative infinity if score is greater than the best score then that's the new best score and the best move is this particular IJ and let me declare best move so again I haven't implemented minimax yet I'm just getting the basic idea down for the very furthest turn I need to start by looking at all the possible moves get the score for those by calling this mysterious minimax function and I'm about to write and find the best one then apply that move there's a big issue here though because I am changing the global variable board I am altering the game I am moving X to that spot and then moving X to the next spot so I could make a copy of the board but I think an easy way to deal with this would actually just be to undo it so right afterwards I'm gonna just quickly undo that move the next step would be to write the minimax function that rule II write the algorithm itself I kind of want to know to see if there's any errors here so actually what I'm gonna do is I'm just gonna put a placeholder function in minimax which receives a board and then I'm just gonna return the score of 1 so minimax is always going to return the score of 1 meaning it's not gonna be able to pick it's just gonna they're all gonna be a tie so it's good always gonna pick the first one but let's let's see how that goes next turn is not defined because I called it best move and it's in one more place here in setup okay where it went there watch it's gonna go to the next available spot to the next available spot to the next available spot so you can see this is working it's working the sense that the algorithm is making a choice but no matter what it's always just going to go in order of the board and Oh X 1 there's also a big issue here which thankfully didn't no major problems happen but I named this function best move and then it created a variable inside of the function with the same name which you know you're always run into trouble so let me actually just change this to move sort of by definition it is the best move make sure this still works all right X is just picking straight down now the hard part I need to write the minimax function itself so one thing that I skip skipped here which is typical of a minimax algorithm is I would also keep track of the depth because ultimately I could want to consider like how long is it taking me to get somewhere what is the depth of the tree that I'm on so that's something that you'll you'll see often within the algorithm implementation so let me add that so I add an argument depth here and then give it a zero at first oh you know what I need another really important argument this is a turn-based game and the algorithm is going to behave differently whether it is the maximizing player's turn X or the minimizing player's turn o in this case again you could have more than two players or there could be a lot of we stuff going on but that's ultimately what I need here so let me also add is maximizing is maximizing and the this here would be true right so I want to perform minimax on this board with the starting depth of zero and the first move is the maximizing player so let's say I'm maximizing what do I want to do actually what I want to do even before this is check to see if somebody won so I'm pretty sure I already have a check winner function that's something that I wrote in the previous coding challenge let me double check that check winner the winner is null and then by the end it returns a tie it returns null a tie or the winner so the score is and let's let's make a little lookup table that's basically exactly what I wrote here if X wins it's a 0 if a wins is a negative 1 if it's a tie it's a 0 all right so this is my lookup table for what the score then and again it's a very simple scenario there's only three possible scores if the result is not equal to null the score is the associating that the number associated with this particular oh it got rid of those I didn't need those quotes oh Visual Studio code just clean that up for me thank you very much the score is based on whoever won and then I can return that so this would be the terminal condition if I am calling minimax on this particular board configuration at this particular depth and it's an end state if it's a terminal state just return the score that's what I'm supposed to do if it's not a terminal State if it's not this state I can't just return the score I need to check all the possible next moves so I got this wrong actually the next move right I'm actually this is the AI player trying to trying a spot there move so the next move is actually not the maximizing player this should be false but I'll better write this as if it is so if it's the maximizing player I can copy-paste that from above check all the spa possible spots again if it's available try you know now that's the human right then oh no this is the AI I'm confused because currently the way I'm stepping through this it's in my mind it's o's turn but i'm writing the code for both whether it's x's turn or o certain so for the sake of argument it's x's turn right now so i'm checking all of those spots and then i'm going to say return no I need to check how you defined I define the best move I'm kind of doing what I did again is there a way for me to reduce the amount of code I'm writing I'm not going to worry about this I want to once again find the best score which will be in this case negative infinity and I want to say the score is the minimax algorithm of the board at depth plus one and now the next players turn is false because I'm the maximizing player and then I'm going to undo this just like I did before why do I have to write this code twice I'll think about later if I can refactor this to only do it once if score is greater than best score best score is equal to score and then after all of this for loop for loop if it's an empty spot boom boom return the best score so this is finding the best score for all the possible next turns by the AI player or if it's not the AI player if it's the minimizing player we could do exactly the same thing and again maybe there's a way and I've I'm missing a curly bracket yes thank you there's a curly bracket so this is very important there's a lot of ifs and four-state for loops here so maybe there's also a nice way to refactor this make it more readable but if it's the maximising player check all the spots find the best possible outcome and return it but call minimax recursively and the next future move and then here I'm going to start write the minimizing player wants to find the best score for it which is the lowest score so and that's the human player and if the score is less than the best score and then return that best score whew oh forget about this return one I'm gonna stare at this to make sure on my mic my curly brackets line up if if for four and then return the best score otherwise if if for four and return the best score I think this might actually be good should we just try it error on line 33 which will hold on Richard oh this should be ah look at this so this is a huge mistake here return true why did I have that there I don't know why did that that's return score this should be returned score the whole point of this and I don't even need a separate variable here I can just say returns scores result right so this is any kind of recursive function needs a terminal condition needs to exit right this function is always going to keep calling itself and calling itself minimax minimax minimax minimax I don't know what this extra code isn't here and minimax okay oh yes assignment is pointing out something to me which is great which I don't need this if statement to find the best score the whole point of this video and also there's a great chess where you use the min function in the max function because it's the minimax algorithm so I can actually say score is the lower one the minimum between score and best score it's going to make it way easier to read thank you and oh no but this one is Max is Max and then this one oh thank you for this I don't know why I didn't think of this I'm sure it's in every implementation score is the minimum between score and best score so this makes it much more much easier to read here okay let's try it let's try it why not caution to the wind okay excellent and the top left that's definitely the optimal place to go they go here no no bad X bad X you're not going in the right place yeah I be - you're still going in order why are you still going in order oh whoa you're not going in order you're making weird decisions oh oh I see the error I see the error of my ways this has to be true right after the maximizing players turn its the minimizing players turn after the minimizing players turn it's the maximizing players turn okay that's a good move X I see what you did there why can't you figure out to go there X oh okay this I'm finding this this is the score is the new score and the best score should be the bigger one the maximum between to score and best score not score ah this has to be this because I'm returning best score and this s Venus okay okay let's see I really should not continue to play this come on X figure out where to go okay I'm gonna go here that's in the wrong place it's in the wrong part I new look like brackets bad brackets I need to return the best score hello after I've checked all of the possibilities meaning both four loops the I and the J have completed I got that right up here but not down here oh okay I really think we've got it this time Tiffany's housed ah tie alright alright you we're gonna go into competition now I'm gonna I'm gonna let you win by going here ah so if I don't go in if I go in a corner yet if I don't go to the middle it's always gonna win so X is always gonna win unless I go to the middle in which case it's always gonna be a tie unless I make a mistake and then it's gonna win but if I go to any other spot X is going to be able to win because it's always gonna so one thing that might be interesting for me to try here really quickly before I sort of finish off this challenge would just be to comment out the AI going first so I know this is technically incorrect cuz X is supposed to go first but let's see what happens if I start with a fresh bolt forward and I go first so I'm gonna go here I'm gonna go here oh it's high am i oh etic something became undefined oh if I go last I guess there's no move for X and I've got it here so that's things like can I beat uh no I lost I don't believe there's any way for me to beat the AI player it's a solved game because it's always gonna go to the center so I could try going here I could go here go here and I'm the best I could do is a tie so you've watched this coding challenge maybe you have a creative idea about how to make your own version of this but I just quickly whipped up thanks to the chat in the live stream also some suggestions of things you could try so first of all what happens if you just make two AI players that just play against each other over and over again well in that case it's gonna be a tie every single time it's a solve game but that might lead to some interesting possibilities um one thing that I'm not using is using the depths in the score so for example if I go back to the code and I change these to say 10 and negative 10 I could account for the depth meaning I could have a higher score if I win more quickly than if it takes longer to win right now I'm not accounting for that so where in the code would you subtract out depth that's something you could try certainly it would be interesting to make a larger board so can you play trick tic-tac-toe in a five by five or a seven by seven you the winning conditions would change the optimal play would change that would be exciting to try I hope somebody does that you could try more players like how would you have a tic-tac-toe game with three players that's something you'd write with a larger board you could try another game let me connect four would work that might be able apply minimax to just thinking about the interface animation like right now how whenever whenever I go the next turn happens immediately you could have you could think about timing and that sort of thing but then I always say high degree of difficulty and maybe worth me coming back to in a future video at some point would be these two last items one is known as alpha beta pruning alpha beta pruning refers to the aspect of the minimax algorithm where you find an optimal path and you know that all the other possibilities are going to be worse and you don't need to go forth and check every possibility so you could research how that works and add it to this algorithm it's explained in the Sebastian lag video then there's a possibility of a game where the number of moves is so vast you couldn't possibly compute the entire tree chess is the classic example of that so you need some heuristic or a way of an estimated guess of what the score could be with any given state so one way of doing this which I'm not saying is a good strategy but a simple thing you could do with chess is you could add up the total number of white pieces versus black pieces but you could say like well the score is higher if you have if the maximizing Claire has more pieces than the opponent so that's one way of approaching it so you don't actually stop at the end of the tree but you just go some pre-determined depth and then calculate an estimate and have that ripple back up the tree so you maybe give that a try with a more complex game um certainly other types of AI algorithms that neural networks could play a role at some point with how you make that estimate but ultimately that that is something that would be interesting to try as well one more idea that just came from the chat so this is quite similar to the idea of a larger board but instead of just a larger 2-dimensional board you could create a 3-dimensional board so tic-tac-toe that happens in a cube I should just do that as a coding challenge separately myself so hopefully you learned something and maybe this algorithm that looked a little terrifying to you at one point now seems quite accessible and doable um maybe your creative idea is not on this list please make your own version of this share it with me at the coding train com there will be a link in this video's description with instructions for how to share it if you don't know how to share it just write a comment file an issue we the coding train community are here to help you oh and I can't wait to see you in a future video goodbye [Music]
Info
Channel: The Coding Train
Views: 408,332
Rating: 4.927422 out of 5
Keywords:
Id: trKjYdBASyQ
Channel Id: undefined
Length: 26min 33sec (1593 seconds)
Published: Wed Dec 11 2019
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.